Google 提供掃描 JAR 檔內是否有中獎的 Log4j 的工具

Hacker News 首頁上看到 Google 提供了一套用 Golang 寫的工具,可以掃描 JAR 檔裡面是否有中獎的 Log4j:「log4jscanner」,對應的討論在「Log4jscanner (」這邊。

看起來是內部工具,放出來前先把 vcs history 清掉了:

We unfortunately had to squash the history when open sourcing. The following contributors were instrumental in this project's development: [...]

另外討論裡也有人提到「OWASP Dependency-Check」這個工具也可以掃,這套就更一般性了:

Dependency-check automatically updates itself using the NVD Data Feeds hosted by NIST.

阻擋網站透過瀏覽器掃 localhost

五月的時候,DuckDuckGoCharlie Belmer 發了一篇關於網站透過瀏覽器掃 localhost 的文章,引起了不少重視:「Why is This Website Port Scanning me?」。

這個月陸陸續續看到一些反制方式了,比較簡單的是透過像 uBlock Origin 這類可以擋特定 url 的方式,像是 EasyPrivacy 裡面把一些大站台的 javascript script 擋下來:「uBlock Origin ad blocker now blocks port scans on most sites」。

在同一篇文章的 comment 處也有人提到 uBlock Origin 可以做的更廣泛:「Block access to and LAN address from the internet #4318」,裡面有人已經整理好丟出來了:「lan-block.txt」,看起來也可以擋一些...

要擋得比較完整的還得考慮 IN A 這種方式繞過去的情況?這可能需要用 extension 了...


看到「falsisign」這個專案 (FalsiScan: Make it look like a PDF has been hand signed and scanned),完全符合這個 blog 的副標題「幹壞事是進步最大的原動力」的精神,不介紹一下好像說不過去...:

For bureaucratic reasons, a colleague of mine had to print, sign, scan and send by email a high number of pages. To save trees, ink, time, and to stick it to the bureaucrats, I wrote this script.

把「印出來簽名再寄掃描回去」這種事情在電子系統上全自動化:產生出來的 PDF 會把預先嵌好的簽名貼到程式指定的位置上,另外還會稍微把把紙張轉一些角度,並且加上影印時會產生的黑邊...


Google 推出的 PhotoScan,利用多張照片去除反光

Google 推出的 PhotoScan (AndroidiOS) 可以利用多次不同角度的拍攝來消除反光:「PhotoScan: Taking Glare-Free Pictures of Pictures」。



MIT Media Lab 弄出個好玩的東西,可以不打開書直接掃描書的內容:「Can computers read through a book page by page without opening it?」,主標題是「Terahertz time-gated spectral imaging for content extraction through layered structures」。

用 100Ghz 到 3Thz 的電磁波掃描:

In our new study we explore a range of frequencies from 100 Gigahertz to 3 Terahertz (THz) which can penetrate through paper and many other materials.

先前也有類似的方法,用 X-ray 或是超音波,但效果都不好:

Can’t X-ray or ultrasound do this? It may seem that X-ray or ultrasound can also image through a book; however, such techniques lack the contrast of our THz approach for submicron pen or pencil layers compared next to blank paper. These methods have additional drawbacks like cost and ionizing radiation. So while you might be able to hardly detect pages of a closed book if you use a CT scan, you will not be able to see the text. Ultrasound does not have the resolution to detect 20 micron gaps in between the pages of a closed book -distinguishing the ink layers from the blank paper is out of the question for ultrasound. Based on the paper absorption spectrum, we believe that far infrared time resolved systems and THz time domain systems might be the only suitable candidates for investigating paper stacks page by page.


PostgreSQL 對 Vacuum 效能的改善

在「No More Full-Table Vacuums」這邊提到了 PostgreSQL 在 vacuum 時效能的大幅改善,尤其是大型資料庫在 vacuum 時需要對整個表格從頭到尾掃一次以確保 transaction id 的正確性:

Current releases of PostgreSQL need to read every page in the database at least once every 2 billion write transactions (less, with default settings) to verify that there are no old transaction IDs on that page which require "freezing".


All of a sudden, when the number of transaction IDs that have been consumed crosses some threshold, autovacuum begins processing one or more tables, reading every page. This consumes much more I/O bandwidth, and exerts much more cache pressure on the system, than a standard vacuum, which reads only recently-modified page.

而作者送了 patch 改成只會讀還沒搞定的部份:

Instead of whole-table vacuums, we now have aggressive vacuums, which will read every page in the table that isn't already known to be entirely frozen.

要注意的是,agreesive vacuum 相較於 vacuum 會多吃很多資源,但可以打散掉 (有點像一次大 GC 導致 lag 與多次 minor GC 讓程式反應時間變得比較順暢的比較):

An aggressive vacuum still figures to read more data than a regular vacuum, possibly a lot more. But at least it won't read the data that hasn't been touched since the last aggressive vacuum, and that's a big improvement.

這個功能預定在 PostgreSQL 9.6 出現,不知道會不會變 default...

PostgreSQL 9.5 將會有 Parallel Sequential Scan

在「Parallel Sequential Scan is Committed!」這邊看到 PostgreSQL 9.5 (還沒出) 將會有 Parallel Sequential Scan 的功能。

文章的作者直接拿了一個大家超常用的惡搞來示範,也就是經典的 LIKE '%word%'

rhaas=# \timing
Timing is on.
rhaas=# select * from pgbench_accounts where filler like '%a%';
 aid | bid | abalance | filler
(0 rows)

Time: 743.061 ms
rhaas=# set max_parallel_degree = 4;
Time: 0.270 ms
rhaas=# select * from pgbench_accounts where filler like '%a%';
 aid | bid | abalance | filler
(0 rows)

Time: 213.412 ms

這功能真不錯 XD

Google 的書本掃描服務被認定為「合理使用」

Google 的書本掃描服務被認定為合理使用:「Google's Book-Scanning Project Ruled to Be Legal `Fair Use'」。

“Google’s unauthorized digitizing of copyright-protected works, creation of a search functionality and display of snippets from those works are non-infringing fair uses,” U.S. Circuit Judge Pierre Leval wrote on behalf of the court. “The purpose of the copying is highly transformative, the public display of text is limited and the revelations do not provide a significant market substitute for the protected aspects of the originals.”


用 CipherScan 在 command line 下檢查系統

在「SSL/TLS for the Pragmatic」這篇裡面提到了 CipherScan 這個工具,用起來很簡單而且輸出很清楚。

直接 git clone 下來後執行就可以了,另外因為檢測 ChaCha20+Poly1305 需要新版 OpenSSL (1.0.2 才有,目前還是開發版),所以 clone 下來的時候裡面包括了一個 Linux 版的 openssl,砍掉的話他會用系統的 openssl。

像是我的 blog 就可以掃出這樣的結果:

gslin@home [~/git/cipherscan] [17:57/W4] (master) ./cipherscan

prio  ciphersuite                  protocols              pfs_keysize
1     DHE-RSA-AES256-GCM-SHA384    TLSv1.2                DH,2048bits
2     ECDHE-RSA-AES256-GCM-SHA384  TLSv1.2                ECDH,P-256,256bits
3     ECDHE-RSA-AES256-SHA384      TLSv1.2                ECDH,P-256,256bits
4     ECDHE-RSA-AES256-SHA         TLSv1,TLSv1.1,TLSv1.2  ECDH,P-256,256bits
5     DHE-RSA-AES256-SHA256        TLSv1.2                DH,2048bits
6     DHE-RSA-AES256-SHA           TLSv1,TLSv1.1,TLSv1.2  DH,2048bits
7     DHE-RSA-CAMELLIA256-SHA      TLSv1,TLSv1.1,TLSv1.2  DH,2048bits
8     AES256-GCM-SHA384            TLSv1.2
9     AES256-SHA256                TLSv1.2
10    AES256-SHA                   TLSv1,TLSv1.1,TLSv1.2
11    CAMELLIA256-SHA              TLSv1,TLSv1.1,TLSv1.2
12    ECDHE-RSA-AES128-GCM-SHA256  TLSv1.2                ECDH,P-256,256bits
13    ECDHE-RSA-AES128-SHA256      TLSv1.2                ECDH,P-256,256bits
14    ECDHE-RSA-AES128-SHA         TLSv1,TLSv1.1,TLSv1.2  ECDH,P-256,256bits
15    DHE-RSA-AES128-GCM-SHA256    TLSv1.2                DH,2048bits
16    DHE-RSA-AES128-SHA256        TLSv1.2                DH,2048bits
17    DHE-RSA-AES128-SHA           TLSv1,TLSv1.1,TLSv1.2  DH,2048bits
18    DHE-RSA-CAMELLIA128-SHA      TLSv1,TLSv1.1,TLSv1.2  DH,2048bits
19    AES128-GCM-SHA256            TLSv1.2
20    AES128-SHA256                TLSv1.2
21    AES128-SHA                   TLSv1,TLSv1.1,TLSv1.2
22    CAMELLIA128-SHA              TLSv1,TLSv1.1,TLSv1.2
23    DES-CBC3-SHA                 TLSv1,TLSv1.1,TLSv1.2

Certificate: trusted, 2048 bit, sha256WithRSAEncryption signature
TLS ticket lifetime hint: 600
OCSP stapling: supported
Server side cipher ordering

Facebook 的 InnoDB patch 讓 table scan 速度變快...

Facebook 的 Database Engineering team 實作了 patch,讓 InnoDB 在 table scan 的速度大幅提昇:「Making full table scan 10x faster in InnoDB」。

第一個 patch 叫做 Logical Readahead。第二個 patch 是針對 async i/o 的改善 (Submitting multiple async I/O requests at once)。

引用文章內的幾段話就知道這幾個 patch 的功力了:

Logical backup size is much smaller. 3x-10x size difference is not uncommon.

備份出來的資料會變小,而且宣稱 1/3 到 1/10 不是罕見情況... -_-

With logical readahead, our full table scan speed improved 9~10 times than before under usual production workloads. Under heavy production workloads, full table scan speed became 15~20 times faster.

然後 table scan 的速度會快非常多... 10 倍?如果是平常就很操的 database 會更明顯?

如果這幾個 patch 如果沒有什麼問題,可以預期會被 merge 到 PerconaMariaDB,至於 Oracle 官方的 source tree... 有的話當然很好,沒有的話也很正常?