PostgreSQL 的 Fuzzy Matching

在「Fuzzy Name Matching in Postgres」這邊看到 PostgreSQL 下怎麼設計 Fuzzy Matching 的方式,文章裡用的方法主要是出自 PostgreSQL 的文件:「F.15. fuzzystrmatch」。

文章最後的解法是 Soundex + Levenshtein

翻了一下資料,這個領域另外有 NYSIIS (New York State Identification and Intelligence System):

The New York State Identification and Intelligence System Phonetic Code, commonly known as NYSIIS, is a phonetic algorithm devised in 1970 as part of the New York State Identification and Intelligence System (now a part of the New York State Division of Criminal Justice Services). It features an accuracy increase of 2.7% over the traditional Soundex algorithm.

以及 Metaphone

Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar. As with Soundex, similar-sounding words should share the same keys. Metaphone is available as a built-in operator in a number of systems.


PostgreSQL 的 scale 建議

Hacker News Daily 上看到「Postgres scaling advice for 2021」這篇,講 PostgreSQL 要怎麼 scale,在 Hacker News 上也有對應的討論可以看:「Postgres scaling advice (」。

文章前面先提到分散式系統的複雜度會導致 RDBMS 上的一些假設失效,所以如果可以用單台機器暴力解,就儘量用單台機器來解 (scale up 的情境),裡面就提到了一些「暴力可以解決很多問題」的說明,差不多就是前幾天提到的「Let's Encrypt 升級資料庫伺服器 (AMD YES?)」。

後面提到如果真的要放進分散式的 RDBMS (scale out 的情境),怎麼設計資料結構會比較好。

這邊剛好也可以提一下,量夠大的時候要把 OLTPOLAP 的應用分開,現在有很多 OLAP 資料庫可以選擇,同步的工具也很成熟了,通常效能會比在 OLTP 上面硬跑來的好。

最後提一下,文章裡面對於 transaction per second 可以拉很高,有些假設沒有明寫出來。這需要盡可能把 transaction 拆小,避免常常有 giant transaction 卡住整個資料庫,這點對於一般的系統會需要做不少改寫...

不過最後比較疑惑的是,這種文章怎麼會上 Hacker News 的啊...

用 GPT-3 直接產生對應的 SQL query

Hacker News Daily 上看到的東西,直接給 GPT-3 條件,叫 GPT-3 給出對應的 SQL query,這就有點誇張了...:「Automating My Job with GPT-3」。


Instruction: Given an input question, respond with syntactically correct PostgreSQL. Be creative but the SQL must be correct.

Input: how many users signed up in the past month?


GPT-3 Response: SELECT COUNT(*) FROM users
WHERE signup_time > now() - interval '1 month'

把欄位名稱替換掉就可以用了... 而且接下來就更誇張了,直接要求 GPT-3 也要照著設定的 table schema 給答案:

Instruction: Given an input question, respond with syntactically correct PostgreSQL. Be creative but the SQL must be correct. Only use tables called "users" and "charges". The "users" table has columns: id (integer), signup_dt (timestamp), email (character varying), and plan_type (character varying). The "charges" table has columns: amount (bigint), user_id (integer), and charge_dt (timestamp).

Input: how much revenue did we have in the past 7 days?


GPT-3 Response: SELECT SUM(amount) FROM charges WHERE charge_dt > now() - interval '7 days'

接下來是在同樣 instruction 下,跨表格的問題:

Input: how much revenue have we had from users that signed up in the last 6 months?

這時候 INNER JOIN 就跑出來了:

.8 Temperature GPT-3 Response: SELECT SUM(charges.amount) FROM users INNER JOIN charges ON = charges.user_id WHERE signup_dt >= DATE_SUB(now(), INTERVAL '6 months')

後面的問題也很精彩,看起來之後可以接上 BI dashboard,直接丟句子進去,然後拉各種資料出來視覺化?

Eventbrite 的 MySQL 升級計畫

在 2021 年看到 EventbiteMySQL 升級計畫:「MySQL High Availability at Eventbrite」。

看起來是 2019 年年初的時候 MySQL 5.1 出問題,後續決定安排升級,在 2019 年年中把系統升級到 MySQL 5.7 (Percona Server 版本):

Our first major hurdle was to get current with our version of MySQL. In July, 2019 we completed the MySQL 5.1 to MySQL 5.7 (v5.7.19-17-log Percona Server to be precise) upgrade across all MySQL instances.

然後看起來是直接在 EC2 上跑,不過這邊提到的空間問題就不太確定了,是真的把 EBS 的空間上限用完嗎?比較常使用的 gp2gp3 上限都是 16TB,不確定是不是真的用到接近爆掉了:

Not only was support for MySQL 5.1 at End-of-Life (more than 5 years ago) but our MySQL 5.1 instances on EC2/AWS had limited storage and we were scheduled to run out of space at the end of July. Our backs were up against the wall and we had to deliver!

另外在升級到 5.7 的時候,順便把本來是 INT 的 primary key 都換成 BIGINT

As part of the cut-over to MySQL 5.7, we also took the opportunity to bake in a number of improvements. We converted all primary key columns from INT to BIGINT to prevent hitting MAX value.

然後系統因為舊版的 Django 沒辦法配合 MySQL 5.7,得升級到 Django 1.6 (要注意 Django 1 系列的最新版是 1.11,看起來光是升級到 1.6 勉強會動就升不上去了?):

In parallel with the MySQL 5.7 upgrade we also Upgraded Django to 1.6 due a behavioral change in MySQL 5.7 related to how transactions/commits were handled for SELECT statements. This behavior change was resulting in errors with older version of Python/Django running on MySQL 5.7

然後採用了 GitHub 家研發的 gh-ost 當作改變 schema 的工具:

In December 2019, the Eventbrite DBRE successfully implemented a table ALTER via gh-ost on one of our larger MySQL tables.

看起來主要的原因是有遇到 pt-online-schema-change 的限制 (在「GitHub 發展出來的 ALTER TABLE 方式」這邊有提到):

Eventbrite had traditionally used pt-online-schema-change (pt-osc) to ALTER MySQL tables in production. pt-osc uses MySQL triggers to move data from the original to the “duplicate” table which is a very expensive operation and can cause replication lag. Matter of fact, it had directly resulted in several outages in H1 of 2019 due to replication lag or breakage.

另外一個引入的技術是 Orchestrator,看起來是先跟 HAProxy 搭配,不過他們打算要再換到 ProxySQL

Next on the list was implementing improvements to MySQL high availability and automatic failover using Orchestrator. In February of 2020 we implemented a new HAProxy layer in front of all DB clusters and we released Orchestrator to production!

Orchestrator can successfully detect the primary failure and promote a new primary. The goal was to implement Orchestrator with HAProxy first and then eventually move to Orchestrator with ProxySQL.

然後最後題到了 Square 研發的 Shift,把 gh-ost 包裝起來變成有個 web UI 可以操作:

2021 還可以看到這類文章還蠻有趣的...

產生名次的 SQL

Percona 的「Generating Numeric Sequences in MySQL」這篇在討論產生字串序列,主要是在 MySQL 環境下,裡面看到的技巧「Session Variable Increment Within a SELECT」這組,剛好可以用在要在每個 row 裡面增加名次:

SELECT (@val := @val + 1) - 1 AS value FROM t1, (SELECT @val := 0) AS tt;

另外看到 MariaDBMySQL 8.0 系列因為有多支援各種功能,剛好也可以被拿來用,然後最後也提到了 Percona 自家出的 MySQL 8.0.20-11 將會直接有 SEQUENCE_TABLE() 可以用 (這應該才是 Percona 這篇文章的主要目的,推銷一下自家產品的新功能)。


Amazon DocumentDB 推出相容 MongoDB 4.0 的版本

在「Amazon DocumentDB (with MongoDB compatibility) adds support for MongoDB 4.0 and transactions」這邊看到 AWSAmazon DocumentDB 上推出相容 MongoDB 4.0 的版本。

把年初在 Ptt 上寫的「Re: [請益] 選擇mongoDB或是relational database ??」這篇拿出來講一下,MongoDB 4.0 最大的改進就是 multi-document transactions 了。

不過 AWS 先前推出 DocumentDB (MongoDB) 時看到的限制,大家都猜測是用 PostgreSQL 當底層 (「AWS 推出 MongoDB 服務:Amazon DocumentDB」與「大家在猜 Amazon DocumentDB 的底層是不是 PostgreSQL...」),雖然目前還是不太清楚,但如果這個猜測屬實的話,要推出各種 transaction 的支援完全不是問題 XDDD

Percona 對 MongoDB 的建議

看到「5 Things DBAs Should Know Before Deploying MongoDB」這篇,裡面給了五個建議,其中第五點頗有趣:

5) Whenever Possible, Working Set < RAM

As with any database, fitting your data into RAM will allow for faster reads than from disk. MongoDB is no different. Knowing how much data MongoDB has to read in for your queries can help you determine how much RAM you should allocate to your database.

這樣的設計邏輯很奇怪啊,你不要扯其他 database 啊,你們家主力的 InnoDB 一直都沒有推薦要 Working Set < RAM 啊,反過來才是用 InnoDB 的常態吧,而且在 PostgreSQL 上也是這樣吧 XDDD

現在上面的文章真的是挑著看了... XD

RDS 推出 ARM 版本

Amazon RDS 推出了 ARM 的版本:「New – Amazon RDS on Graviton2 Processors」,包含了 MySQLMariaDBPostgreSQL 的版本都有支援,不過看起來需要比較新版的才能用:

You can choose between M6g and R6g instance families and three database engines (MySQL 8.0.17 and higher, MariaDB 10.4.13 and higher, and PostgreSQL 12.3 and higher).

官方宣稱可以提供 35% 的效能提昇,考慮費用的部份會有 52% 的 c/p 值提昇:

Graviton2 instances provide up to 35% performance improvement and up to 52% price-performance improvement for RDS open source databases, based on internal testing of workloads with varying characteristics of compute and memory requirements.

對於 RDS 這種純粹就是個服務的應用來說,感覺應該不會有什麼轉移成本,只要測過沒問題,換過去等於就是現賺的。看起來等 RI 約滿了就可以切...


這篇算是考古文,找出 MySQLTIME 資料型態奇怪範圍的由來:「TIME for a WTF MySQL moment」。

在官方的文件裡面可以看到 TIME 的範圍是個奇怪的數字,如果把各版本的文件都拉出來看,會發現都沒改過:「11.2.3 The TIME Type (8.0)」、「11.2.3 The TIME Type (5.7)」、「11.2.3 The TIME Type (5.6)」,「11.3.2 The TIME Type (5.5,靠 Internet Archive 的存檔頁面)」、「11.3.2 The TIME Type (5.1,靠 Internet Archive 的存檔頁面)」、「11.3.2 The TIME Type (5.0,靠 Internet Archive 的存檔頁面)」,裡面一直都是:

TIME values may range from '-838:59:59' to '838:59:59'.

這個數字看起來應該是某個限制,但作者粗粗算了幾種可能都不像,所以就一路考古,發現算是在 MySQL 3 年代因為某個特別公式留下來的遺毒,就一路用到現在了:

One of the bits was used for the sign as well, but the remaining 23 bits were an integer value produced like this: Hours × 10000 + Minutes × 100 + Seconds; in other words, the two least significant decimal digits of the number contained the seconds, the next two contained the minutes, and the remaining ones contained the hours. 223 is 83888608, i.e. 838:86:08, therefore, the maximum valid time in this format is 838:59:59.

話說回來,用 MySQL 的人還是很習慣用 INTBIGINT 來存時間,這樣可以自動遠離這些鳥問題,之前在「MySQL 裡儲存時間的方式...」與「Facebook 在 MySQL 裡存時間的型態」這邊都寫過...

不過最近用 PostgreSQL 比較多,可以比較「正常」的使用各種資料型態...

PostgreSQL 13 的 B-Tree Deduplication

Hacker News 上看到「Lessons Learned from Running Postgres 13: Better Performance, Monitoring & More」這篇文章,其中有提到 PostgreSQL 13 因為 B-Tree 支援 deduplication,所以有機會縮小不少空間。

搜了一下源頭是「Add deduplication to nbtree.」這個 git commit,而 PostgreSQL 官方的說明則是在「63.4.2. Deduplication」這邊可以看到。

另外值得一提的是,這個功能在 CREATE INDEX 這頁可以看到在 PostgreSQL 13 預設會打開使用。

依照說明,看起來本來的機制是當 B-Tree index 內的 key 相同時,像是 key1 = key2 = key3 這樣,他會存 {key1, ptr1}{key2, ptr2}{key3, ptr3}

在新的架構下開啟 deduplication 後就會變成類似 {key1, [ptr1, ptr2, ptr3]} 這樣的結構。可以看出來在 key 重複的資料很多的時候,可以省下大量空間 (以術語來說的話,就是 cardinality 偏低的時候)。