What's new in SQL:2023

Saw 「SQL: 2023 is finished: Here is what's new (eisentraut.org)」 on Hacker News, which goes over what's new in the SQL:2023 standard; the original article is 「SQL:2023 is finished: Here is what's new」.

「UNIQUE null treatment (F292)」 lets you decide whether NULLs should count as unique, which is exactly what I wrote about earlier in 「PostgreSQL 15 will be able to constrain the uniqueness of NULL via UNIQUE」.
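A minimal sketch of the syntax, as implemented in PostgreSQL 15:

-- Default: NULLs are treated as distinct, so multiple NULLs can coexist.
CREATE TABLE t1 (a int UNIQUE NULLS DISTINCT);

-- Treat NULLs as equal: at most one NULL is allowed in the column.
CREATE TABLE t2 (a int UNIQUE NULLS NOT DISTINCT);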

「ORDER BY in grouped table (F868)」 lets you ORDER BY a column that doesn't appear in the SELECT list; from the description, the limitation mainly bit when JOINs were involved. The obvious workaround was to add the column to the SELECT list, but that meant returning more data.
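A hedged sketch of my reading of the feature (product and sale are made-up tables): the ORDER BY references the joined table even though nothing from it appears in the SELECT list.

SELECT p.name
FROM product p
JOIN sale s ON s.product_id = p.id
GROUP BY p.name
ORDER BY max(s.price) DESC;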

「GREATEST and LEAST (T054)」: since MIN() and MAX() were already taken by the aggregate functions, these had to be named something else.
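They compare their arguments within a single row, rather than aggregating across rows:

SELECT GREATEST(1, 5, 3);  -- returns 5
SELECT LEAST(1, 5, 3);     -- returns 1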

「String padding functions (T055)」 and 「Multi-character TRIM functions (T056)」 are familiar syntax; every vendor already has equivalent functions, but this round standardizes them.
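For example (matching the behaviour most implementations already have):

SELECT LPAD('7', 3, '0');        -- '007' (T055)
SELECT LTRIM('xyxhello', 'xy');  -- 'hello': strips any of the characters x/y (T056)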

「Optional string types maximum length (T081)」 means VARCHAR can now be declared without a size; in practice probably not a big deal?
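That is, a declaration like this (which PostgreSQL has accepted for years) is now standard:

CREATE TABLE t (note VARCHAR);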

「Enhanced cycle mark values (T133)」: the recursive queries this touches are something I forget every single time I use them, and I never quite got the cycle feature...
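For reference, a sketch of the standard CYCLE clause as PostgreSQL 14+ implements it, using a hypothetical graph(id, link) table; T133 is the part that lets the cycle mark column be a plain boolean instead of the two string constants the older standard required:

WITH RECURSIVE search_graph(id, link) AS (
    SELECT g.id, g.link FROM graph g
  UNION ALL
    SELECT g.id, g.link
    FROM graph g JOIN search_graph sg ON g.id = sg.link
) CYCLE id SET is_cycle USING path  -- is_cycle becomes a boolean column
SELECT * FROM search_graph;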

「ANY_VALUE (T626)」 looks like it pulls an arbitrary value out of each group; paired with a GROUP BY '' maybe we no longer need to drag out dirty hacks like ORDER BY RAND()?
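A small example with a hypothetical employee table; note the value is arbitrary (implementation-chosen), not actually random:

-- One value per group, without the column needing an aggregate or a spot in GROUP BY:
SELECT dept, ANY_VALUE(name)
FROM employee
GROUP BY dept;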

「Non-decimal integer literals (T661)」 and 「Underscores in numeric literals (T662)」 both make numbers easier to read and work with.
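The new literal forms look like this:

SELECT 0xFFFF;     -- hexadecimal (T661)
SELECT 0o755;      -- octal (T661)
SELECT 0b101010;   -- binary (T661)
SELECT 1_000_000;  -- underscores as digit separators (T662)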

After that the article covers a lot of JSON functionality; SQL:2016 had already pulled some of it in, and SQL:2023 fills it out more completely.

Graph-related standards were also defined into SQL:2023; the article doesn't say much about them. Looks like SQL is trying to expand into that territory?

MySQL 5.7 support only runs until this October (Oct 2023)

Only while digging through some docs just now did I notice that Oracle's support for MySQL 5.7 has just half a year left, scheduled to end in October 2023: 「Oracle Technology Products - Oracle Lifetime Support Policy」.

Next door, Percona's build, Percona Server for MySQL 5.7, can be looked up at 「Percona Release Lifecycle Overview」; it appears to be set to the same date (October 2023), though it's unclear whether they'll announce an extension, at least for security fixes.

I hadn't been paying attention, and suddenly there's only half a year left...

FerretDB, the MongoDB alternative, releases 1.0 (GA)

Saw on Hacker News that FerretDB released 1.0 (GA): 「FerretDB: open-source MongoDB alternative (ferretdb.io)」; the announcement is 「Announcing FerretDB 1.0 GA - a truly Open Source MongoDB alternative」.

I wrote about this before in 「MangoDB renamed to FerretDB (the ferret)」, but hadn't noticed that they founded a company to develop it? There's hiring info at 「Careers at FerretDB」.

The official site lays out the goal: they explicitly aren't aiming for a drop-in replacement, but rather to implement the core feature set plus commonly used features, covering the majority of users:

Is FerretDB 100% compatible with MongoDB?

It is not necessary, nor it is feasible to implement every single MongoDB feature out there. Our aim is to cover the core feature set of MongoDB, and then continue adding features which could enhance the experience or increase application compatibility. Non-OSS alernatives of MongoDB are similar in this sense, eg. none of these products are able to provide the full feature set of MongoDB. We are aiming to please 85% of MongoDB users, not all of them.

That will still leave people who want to switch with some reservations... and how did they arrive at that 85% figure anyway?

That article about swapping RabbitMQ out for PostgreSQL...

Saw 「SQL Maxis: Why We Ditched RabbitMQ and Replaced It with a Postgres Queue (prequel.co)」 on Hacker News; the original post is 「SQL Maxis: Why We Ditched RabbitMQ And Replaced It With A Postgres Queue」, describing the how and why of their replacing RabbitMQ with PostgreSQL.

There's quite a lot in the article to pick at, and the Hacker News thread did: for instance, someone pointed out that they ran into a bug (or a feature) and, rather than fixing the bug, decided to rewrite the whole thing on PostgreSQL, which is odd:

In summary -- their RabbitMQ consumer library and config is broken in that their consumers are fetching additional messages when they shouldn't. I've never seen this in years of dealing with RabbitMQ. This caused a cascading failure in that consumers were unable to grab messages, rightfully, when only one of the messages was manually ack'ed. Fixing this one fetch issue with their consumer would have fixed the entire problem. Switching to pg probably caused them to rewrite their message fetching code, which probably fixed the underlying issue.

Another point people poked at is the volume: at this kind of volume, using PostgreSQL to shrink the tech stack is probably a good call (which raises the other question of why RabbitMQ was brought in in the first place...):

>To make all of this run smoothly, we enqueue and dequeue thousands of jobs every day.

If you your needs aren't that expensive, and you don't anticipate growing a ton, then it's probably a smart technical decision to minimize your operational stack. Assuming 10k/jobs a day, thats roughly 7 jobs per minute. Even the most unoptimized database should be able to handle this.

Further down the same thread someone noted the volume really is tiny, going so far as to bluntly suggest you could solve it with Jenkins XD:

Years of being bullshitted have taught me to instantly distrust anyone who is telling me about how many things they do per day. Jobs or customers per day is something to tell you banker, or investors. For tech people it’s per second, per minute, maybe per hour, or self aggrandizement.

A million requests a day sounds really impressive, but it’s 12req/s which is not a lot. I had a project that needed 100 req/s ages ago. That was considered a reasonably complex problem but not world class, and only because C10k was an open problem. Now you could do that with a single 8xlarge. You don’t even need a cluster.

10k tasks a day is 7 per minute. You could do that with Jenkins.

Then I unexpectedly saw Simon Willison raise the key point: what RabbitMQ still can't give you is ACID-grade job queuing, i.e. the guarantee that a task only lands in the queue when the related data has actually been committed, so an acknowledged task is sure to be processed:

The best thing about using PostgreSQL for a queue is that you can benefit from transactions: only queue a job if the related data is 100% guaranteed to have been written to the database, in such a way that it's not possible for the queue entry not to be written.

Brandur wrote a great piece about a related pattern here: https://brandur.org/job-drain

He recommends using a transactional "staging" queue in your database which is then written out to your actual queue by a separate process.

This is also why, back in the day, we used MySQL for this sort of thing: you want the ACID properties to guarantee nothing gets lost.
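A minimal sketch of the pattern in PostgreSQL (the job_queue table and the payload are made up): the enqueue rides in the same transaction as the business write, and workers claim jobs with FOR UPDATE SKIP LOCKED:

CREATE TABLE job_queue (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

BEGIN;
-- ... INSERT the related business data here ...
INSERT INTO job_queue (payload) VALUES ('{"order_id": 123}');
COMMIT;  -- the job exists if and only if the data was committed

-- Worker side: atomically claim one job, skipping rows other workers hold locks on:
BEGIN;
DELETE FROM job_queue
WHERE id = (
    SELECT id FROM job_queue
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING payload;
-- ... process the payload, then COMMIT (a ROLLBACK puts the job back) ...
COMMIT;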

This is also the only scenario left where I feel you still need an RDBMS as the queue backend. But the company in the article had a baffling thought process: hit a library bug, decide to change architectures instead of fixing the bug, and then happily write it up as a promo piece...

Amazon DocumentDB supports the MongoDB 5.0 protocol

Saw 「Amazon DocumentDB (with MongoDB compatibility) adds support for MongoDB 5.0 wire protocol and client-side field level encryption」; Amazon DocumentDB has rolled out MongoDB 5.0 support.

MongoDB 5.0 is the July 2021 release; for users the more important bits look to be the PII handling (the client-side field level encryption) and the TSDB support.

The part I'm more curious about: is the TSDB piece still a hacked-up PostgreSQL underneath...?

How Etsy adopted Vitess

Etsy wrote three posts on using Vitess to solve their database performance problems: 「Scaling Etsy Payments with Vitess: Part 1 – The Data Model」, 「Scaling Etsy Payments with Vitess: Part 2 – The “Seamless” Migration」, and 「Scaling Etsy Payments with Vitess: Part 3 – Reducing Cutover Risk」.

Vitess was developed by the YouTube team; it uses a proxy layer to solve the query-routing problems that come with sharding the backend MySQL databases.

A few pieces of information worth pulling out:

First, how far modern brute force gets you: Wikipedia shows Etsy IPO'd back in 2015, yet they only hit the vertical-scaling ceiling (here meaning GCP's largest instances) at the end of 2020. Today's brute force carries you a very long way... and if you ran physical data centers you could probably find even bigger machines.

Second, Etsy started moving data at the end of 2020 and wrapped up in May 2022, roughly a year and a half, migrating 4 databases onto Vitess clusters: 23 tables and 40B rows in total.

Third, they used Vindexes to reduce the restrictions sharding imposes. I hadn't looked into this before:

A Vindex provides a way to map a column value to a keyspace ID.

Flipping through old docs at 「Older Version Docs」, everything 5.0+ has it; digging further back on GitHub, it looks like it goes back to the 2016 releases, though it was still being fleshed out at the time: 「Vitess v2.0.0-rc.1」.

Looking at the current feature set, there's the primary vindex design:

The Primary Vindex for a table is analogous to a database primary key. Every sharded table must have one defined. A Primary Vindex must be unique: given an input value, it must produce a single keyspace ID.

Then there's the secondary vindex design: they point to keyspace id(s), and that information is used for routing:

Secondary Vindexes are additional vindexes against other columns of a table offering optimizations for WHERE clauses that do not use the Primary Vindex. Secondary Vindexes return a single or a limited set of keyspace IDs which will allow VTGate to only target shards where the relevant data is present. In the absence of a Secondary Vindex, VTGate would have to send the query to all shards (called a scatter query).

It is important to note that Secondary Vindexes are only used for making routing decisions. The underlying database shards will most likely need traditional indexes on those same columns, to allow efficient retrieval from the table on the underlying MySQL instances.
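To make this concrete, a minimal VSchema sketch (all table, column and vindex names here are made up): customer_id is the primary vindex, and email gets a secondary lookup vindex backed by a lookup table:

{
  "sharded": true,
  "vindexes": {
    "hash": { "type": "hash" },
    "email_idx": {
      "type": "lookup_hash",
      "params": {
        "table": "customer_email_lookup",
        "from": "email",
        "to": "keyspace_id"
      },
      "owner": "customer"
    }
  },
  "tables": {
    "customer": {
      "column_vindexes": [
        { "column": "customer_id", "name": "hash" },
        { "column": "email", "name": "email_idx" }
      ]
    }
  }
}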

Then there are functional vindexes and lookup vindexes: the former maps to keyspace IDs via an algorithm, the latter lets you look the mapping up:

A Functional Vindex is a vindex where the column value to keyspace ID mapping is pre-established, typically through an algorithmic function. In contrast, a Lookup Vindex is a vindex that provides the ability to create an association between a value and a keyspace ID, and recall it later when needed. Lookup Vindexes are sometimes also informally referred to as cross-shard indexes.

Lookup vindexes additionally come in a consistent variant, which keeps the cross-shard mapping consistent without paying for 2PC:

Consistent lookup vindexes use an alternate approach that makes use of careful locking and transaction sequences to guarantee consistency without using 2PC. This gives the best of both worlds, with the benefit of a consistent cross-shard vindex without paying the price of 2PC. To read more about what makes a consistent lookup vindex different from a standard lookup vindex read our consistent lookup vindexes design documentation.

Putting it all together, Vitess has packaged up every common sharding technique; if you ever really hit that scale, you no longer have to hand-roll a pile of this in your application or in a library...

Pony ORM

Something seen on Simon Willison's blog: 「Python's “Disappointing” Superpowers」, which links to the original article of the same name, a tour of Python tooling.

Despite the word 「disappointing」, it's meant in the opposite sense; the article covers a number of unusual tools, and Pony ORM is the one I find most interesting: the way you write queries is extremely Python:

select(c for c in Customer if sum(c.orders.total_price) > 1000)

It can also be written in lambda form:

Customer.select(lambda c: sum(c.orders.total_price) > 1000)

This produces the corresponding SQL:

SELECT "c"."id"
FROM "customer" "c"
  LEFT JOIN "order" "order-1"
    ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000

The reason this doesn't blow up with a syntax error is that Pony reads and analyzes the bytecode directly, then generates the corresponding SQL query:

A normal understanding of generator expressions suggests that the select function is consuming a generator. But that couldn’t explain the behaviour here. Instead, it actually introspects the frame object of the calling code, then decompiles the byte code of the generator expression object it finds, and builds a Query based on the AST objects.

That design is how it achieves this degree of syntactic freedom.

A quick look shows some integrations too, like Flask's 「Integration with flask」 and FastAPI's 「Integration with FastAPI」.

Probably one to just keep an eye on for now; my day-to-day Python is still Django, which has its own ORM stack...

SQLite's HC-tree project

A new project spotted on the Hacker News front page: 「HC-tree is an experimental high-concurrency database back end for SQLite (sqlite.org)」. SQLite is building an experimental backend called HC-tree.

The HC-tree (hctree) project is an attempt to develop a new database backend that improves upon regular SQLite as follows:

Of the goals listed, 「Improved concurrency」 means multiple writers can run at the same time, which is a big change for SQLite; meanwhile they want to be no slower than stock SQLite in the single-threaded case:

An implicit goal is that hctree must be as fast or faster than stock SQLite for all single-threaded cases. There is no point in running dozens of concurrent writers if each of them is an order of magnitude slower than a single writer writing to a legacy database.

On the flip side, this means SQLite is now really confronting database isolation: compared to the current single-writer architecture, it has to support multiple writers, which is why 「Concurrency Model」 also walks through what they expect to deliver.

The explanation notes that tables and indexes are still b-trees, so a reasonable guess is that hctree's implementation is still the MVCC-style approach that's already well proven in the market:

If no other client has modified any b-tree (table or index) entry or range that the transaction being committed accessed by a range or stabbing query, then the transaction is valid.

Another sizable change is 「Support for replication」: current SQLite can bolt replication on via extensions, whereas here the underlying backend supports it directly.

With a new backend underneath there should be plenty of room to play: the first thing that comes to mind is a server-style deployment, the way MySQL or PostgreSQL work; another direction is packaging it as a distributed database so applications can easily span machines.

No telling which direction it'll take yet; possibly SQLite only ships the backend and lets everyone else build on top...

Database developments in 2022

Saw this review of 2022's database developments on Hacker News; useful for catching news and trends I'd missed: 「Databases in 2022: A Year in Review」. The author, Andy Pavlo, is the founder of OtterTune, so some of his takes carry bias; read it as one voice among many. The Hacker News thread, meanwhile, had a parade of founders popping up to pitch their own products: 「Databases in 2022: A Year in Review (ottertune.com)」.

First, the ClickHouse he mentions: I only started following it in 2022 and found it rather good. Wikipedia shows they raised a $250m Series B at a $2b valuation in October 2021:

On October 28, 2021 the company received Series B funding totaling $250 million at an valuation of $2 billion from Coatue Management, Altimeter Capital, and other investors.

Then there's Meta's Velox: the diagram on the official site shows it trying to swap out the engine/worker layer of databases, Presto, and Spark:

The description on the GitHub page shows the same thing: Velox provides a set of already-optimized building blocks (Type, Vector, Expression Eval, Function Packages, Operators, I/O, Network Serializers, and Resource Management) for engines/workers to use directly, so your own implementation doesn't end up unoptimized:

Velox is a C++ database acceleration library which provides reusable, extensible, and high-performance data processing components.

In common usage scenarios, Velox takes a fully optimized query plan as input and performs the described computation. Considering Velox does not provide a SQL parser, a dataframe layer, or a query optimizer, it is usually not meant to be used directly by end-users; rather, it is mostly used by developers integrating and optimizing their compute engines.

As for the rest of the news, a skim for general awareness is enough...

WordPress plans to support SQLite as a backend database

WordPress currently supports only MySQL, and yesterday on Hacker News I saw news that it plans to support SQLite as a backend database: 「WordPress testing official SQLite Support (github.com/wordpress)」, with the original on GitHub: 「Implement new experimental SQLite integration module」.

In theory this is more convenient for users, but a bit more trouble for extension developers (or they could just mark SQLite as unsupported?), especially wherever MySQL-specific syntax is involved.

In practice PHP + MySQL hosting is pretty common anyway, so how much this setup actually helps is an open question.

A sudden thought, though: build a read-only WordPress site, swap the SQLite reads for something like sql.js, turn a pile of the server-side rendering into client-side rendering, and you might be able to throw the whole thing onto GitHub Pages or a similar service? Even if that's a rather roundabout way to get there...