Pinterest's Improvements to InnoDB Compression

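Three months ago Pinterest wrote about its improvements to InnoDB compression, which work by supplying a predefined compression dictionary: "Pinterest's Work on InnoDB Compression" (Pinterest 在 InnoDB Compression 的努力).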

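The follow-up, "Evolving MySQL Compression - Part 2", goes on to explain how to generate dictionary contents that work better for Pinterest's own data. The author has also put the tool that computes the dictionary on GitHub so others can use it (it is written in Python): "pinterest/mysql_utils/zdict_gen/".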

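You can see that the compression ratio goes up quite a bit again. For database compression, I suppose this counts as going from an A to an A+...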
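To get a feel for why a well-chosen dictionary helps, here is a minimal sketch using the preset-dictionary support in Python's standard `zlib` module. It is not Pinterest's InnoDB patch and not the `zdict_gen` tool; the sample row, the `ZDICT` contents, and the helper names are all made up for illustration. The idea is simply that byte strings already present in the dictionary can be referenced by the compressor instead of being stored again in every small row.

```python
import zlib


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress data with a preset dictionary (hypothetical helper)."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress data produced by compress_with_dict with the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


# Assumption: a made-up dictionary holding substrings that recur in every row.
ZDICT = b'{"id": , "user_id": , "board_id": , "description": "", "link": "https://"}'

# Assumption: a made-up row; real data would be sampled from actual tables.
ROW = (
    b'{"id": 1234, "user_id": 42, "board_id": 7, '
    b'"description": "lamp", "link": "https://example.com/p/1"}'
)

plain = zlib.compress(ROW, 9)
with_dict = compress_with_dict(ROW, ZDICT)

print("raw bytes:          ", len(ROW))
print("zlib, no dictionary:", len(plain))
print("zlib, with zdict:   ", len(with_dict))

# The same dictionary is required to read the data back.
restored = decompress_with_dict(with_dict, ZDICT)
```

The gain comes entirely from how well the dictionary matches the real data, which is why generating it from samples of your own tables matters more than the compression level.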

Lists of Common Passwords

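Earlier, in "NIST's New Password Guidelines" (NIST 新的密碼規範), I mentioned using a dictionary file to stop users from choosing weak passwords: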

When processing requests to establish and change memorized secrets, verifiers SHOULD compare the prospective secrets against a dictionary of known commonly-used and/or compromised values. This list SHOULD include passwords from previous breach corpuses, as well as dictionary words and specific words (such as the name of the service itself) that users are likely to choose. If the chosen secret is found in the dictionary, the subscriber SHOULD be required to choose a different value. The subscriber SHOULD be advised that they need to select a different secret because their previous choice was commonly used.

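So besides an ordinary dictionary file, you also need passwords taken from sites that have been breached. That data is available in the Passwords directory of danielmiessler/SecLists. There is not a huge amount of it, but it should be enough.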
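A check like the one NIST describes could be sketched roughly as follows. This is only an illustration under a few assumptions: the blocklist is a plain text file with one password per line, comparison is case-insensitive, and the function names, the service name, and the file name in the comment are hypothetical rather than anything defined by NIST or SecLists.

```python
from typing import Iterable, Set


def load_blocklist(lines: Iterable[str], extra_words: Iterable[str] = ()) -> Set[str]:
    """Build a lowercase set from wordlist lines plus service-specific words."""
    blocklist = {line.strip().lower() for line in lines if line.strip()}
    blocklist.update(word.lower() for word in extra_words)
    return blocklist


def is_acceptable(candidate: str, blocklist: Set[str]) -> bool:
    """Return False when the candidate appears in the blocklist."""
    return candidate.lower() not in blocklist


# Tiny in-memory sample so the sketch runs on its own. With a real list you
# might read a file instead (hypothetical file name):
#   lines = open("common-passwords.txt", encoding="utf-8", errors="ignore")
sample_lines = ["123456", "password", "qwerty", "letmein"]

# Assumption: "examplesite" stands in for the name of the service itself.
blocklist = load_blocklist(sample_lines, extra_words=["examplesite"])

for candidate in ["Password", "examplesite", "correct horse battery staple"]:
    verdict = "ok" if is_acceptable(candidate, blocklist) else "choose a different one"
    print(f"{candidate!r}: {verdict}")
```

Keeping the list in a set makes each lookup constant time, so even a list with millions of entries is cheap to check on every signup or password change.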

Three Ministry of Education Dictionaries Move to the CC BY-ND 3.0 TW License

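I just saw the news: the Ministry of Education's public licensing site for its Mandarin dictionaries (教育部國語辭典公眾授權網) has adopted the CC BY-ND 3.0 TW license. It openly licenses three dictionaries, 《重編國語辭典修訂本》, 《國語辭典簡編本》, and 《國語小字典》, and provides the data for download in a structured format.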

The ND (NoDerivatives) condition is a bit of a pity, but this is still a big step forward...