把 RabbitMQ 換成 PostgreSQL 的那篇文章...

Hacker News 上看到「SQL Maxis: Why We Ditched RabbitMQ and Replaced It with a Postgres Queue (prequel.co)」這篇文章,原文在「SQL Maxis: Why We Ditched RabbitMQ And Replaced It With A Postgres Queue」這邊,裡面在講他們把 RabbitMQ 換成 PostgreSQL 的前因後果。

文章裡面可以吐嘈的點其實蠻多的,而且在 Hacker News 上也有被點出來,像是有人就有提到他們遇到了 bug (或是 feature) 卻不解決 bug,而是決定直接改寫成用 PostgreSQL 來解決,其實很怪:

In summary -- their RabbitMQ consumer library and config is broken in that their consumers are fetching additional messages when they shouldn't. I've never seen this in years of dealing with RabbitMQ. This caused a cascading failure in that consumers were unable to grab messages, rightfully, when only one of the messages was manually ack'ed. Fixing this one fetch issue with their consumer would have fixed the entire problem. Switching to pg probably caused them to rewrite their message fetching code, which probably fixed the underlying issue.

另外一個吐嘈的點是量的部份,如果就這樣的量,用 PostgreSQL 降低使用的 tech stack 應該是個不錯的決定 (但另外一個問題就是,當初為什麼要導入 RabbitMQ...):

>To make all of this run smoothly, we enqueue and dequeue thousands of jobs every day.

If you your needs aren't that expensive, and you don't anticipate growing a ton, then it's probably a smart technical decision to minimize your operational stack. Assuming 10k/jobs a day, thats roughly 7 jobs per minute. Even the most unoptimized database should be able to handle this.

在同一個 thread 下面也有人提到這個量真的很小,甚至直接不講武德提到可以用 Jenkins 解 XD:

Years of being bullshitted have taught me to instantly distrust anyone who is telling me about how many things they do per day. Jobs or customers per day is something to tell you banker, or investors. For tech people it’s per second, per minute, maybe per hour, or self aggrandizement.

A million requests a day sounds really impressive, but it’s 12req/s which is not a lot. I had a project that needed 100 req/s ages ago. That was considered a reasonably complex problem but not world class, and only because C10k was an open problem. Now you could do that with a single 8xlarge. You don’t even need a cluster.

10k tasks a day is 7 per minute. You could do that with Jenkins.

然後意外看到 Simon Willison 提到了一個重點,就是 RabbitMQ 到現在還是不支援 ACID 等級的 job queuing (尤其是 Durability 的部份),也就是希望 MQ 系統回報成功收到的 task 一定會被處理:

The best thing about using PostgreSQL for a queue is that you can benefit from transactions: only queue a job if the related data is 100% guaranteed to have been written to the database, in such a way that it's not possible for the queue entry not to be written.

Brandur wrote a great piece about a related pattern here: https://brandur.org/job-drain

He recommends using a transactional "staging" queue in your database which is then written out to your actual queue by a separate process.

這也是當年為什麼用 MySQL 幹類似的事情,要 ACID 的特性來確保內容不會掉。

這也是目前我覺得唯一還需要用 RDBMS 當 queue backend 的地方,但原文公司的想法就很迷,遇到 library bug 後決定換架構,而不是想辦法解 bug,還很開心的寫一篇文章來宣傳...

RabbitMQ 也進來搶 Streaming Engine 這塊市場了

RabbitMQ 要在 3.9 版推出 Streams (目前 3.9 還在 RC):「RabbitMQ Streams Overview」,在 Hacker News 上有一些討論可以看:「RabbitMQ Streams Overview (rabbitmq.com)」。

這樣 RabbitMQ 可以同時有 queue 與 stream 的能力,對於一些小專案應該會方便不少,算是打隔壁棚 Kafka 的痛點之一,弄一組 HA cluster 至少要三台 ZooKeeper 加上兩台 Broker,基礎建設還要有對應的 HA load balancer 機制,在 AWS 上可以拿 ELB 偷懶,如果是實體機房的話也許用 F5 擋,或是用 open source 的 Keepalived (或是老一點的 Heartbeat) + nginx

目前還不知道 scalability 的能耐,不過印象中在 queue 的單台 throughput 不差,如果沒有意外的話,streaming 的部份對小型專案應該夠用。

PostgreSQL 的 Job Queue、Application Lock 以及 Pub/Sub

Hacker News Daily 上看到一篇講 PostgreSQL 做 Job Queue、Application Lock 以及 Pub/Sub 的方法:「Do You Really Need Redis? How to Get Away with Just PostgreSQL」,對應的討論在「Do you really need Redis? How to get away with just PostgreSQL (atomicobject.com)」這邊可以翻到。

拿 PostgreSQL 跑這些東西的確有點浪費,不過如果是自己的專案,不想要把 infrastructure 搞的太複雜的話,倒是還不錯。

首先是 Job Queue 的部份,從他的範例看起來他是在做 async job queue (不用等回傳值的),這讓我想到很久前寫的 queue service (應該是 2007 年與 2012 年都寫過一次),不過我是用 MySQL 當作後端,要想辦法降低 InnoDB 的 lock 特性。

async job queue 設計起來其實很多奇怪的眉角,主要就是在怎麼處理失敗的狀態。大多數的需求可以放到兩個種類,最常見用的是 at-least-once,保證最少跑一次,大多數從設計上有設計成 idempotence 的都可以往這類丟,像是報表類的 (重複再跑一次昨天的報表是 OK 的),另外每天更新會員狀態也可以放在這邊。

另外少見一點的是 at-most-once 與 exactly-once,最多只跑一次與只跑一次,通常用在不是 idempotence 的操作上,像是扣款之類的,這邊的機制通常都會跟商業邏輯有關,反正不太好處理...

第二個是 Application Lock,跨機器時的 lock 機制,量沒有很大時拿 PostgreSQL 跑還行,再大就要另外想辦法了,馬上想到的是 ZooKeeper,但近年設計的系統應該更偏向用 etcdConsul 了...

最後提到的 Pub/Sub,一樣是在量大的時候拿 PostgreSQL 跑還行,更大的時候就要拿 Kafka 這種專門為了效能而設計出來的軟體出來用...

Amazon SQS 推出量大折扣...

Amazon SQS 推出了分級計價,不過看了一下「Amazon SQS pricing」這邊的表格,一般的單位大概是不太會遇到:「Amazon SQS announces tiered pricing」。

他的原價級距是 From 1 Million to 100 Billion Requests/Month,算了一下必須一整個月的每秒都有 38580 requests/sec 左右才會到下一個級距,就算內部架構倍數放大,要達到這個數量應該也必須要有一定規模... (或者惡搞?)

隔壁棚的 Amazon SNS 不知道會不會也跟著降,在 AWS 的服務與文件上經常會看到兩個串起來用...

Kafka 拔掉 ZooKeeper 的計畫

目前 Kafka cluster 還是會需要透過 ZooKeeper 處理不少資料,但眾所皆知的,ZooKeeper 實在是不好維護,所以 Kafka 官方從好幾年前就一直在想辦法移除對 ZooKeeper 的相依性。

這篇算是其中一塊:「Kafka Needs No Keeper」。

真的自己架過 Kafka cluster 就會知道其中的 ZooKeeper 很不好維護,尤其是 Apache 官方版本的軟體與文件常常脫勾,設定起來就很痛苦。所以一般都會用 Confluent 出的包裝,裡面的 ZooKeeper 軟體與 Confluent 自己寫的文件至少都被測過,不太會遇到官方文件與軟體之間搭不上的問題。

另外一個常見的痛點是,因為 Kafka 推動拔掉 ZooKeeper 的計畫推很久了 (好幾年了),但進展不快,所以有時候會發現在 command line 下,有些指令會把 API endpoint 指到 ZooKeeper 伺服器上,但有些指令卻又指到 Kafka broker 上,這點一直在邏輯上困擾很久,直到看到官方的拔除計畫 (但又不快) 才理解為什麼這麼不一致...

給需要的人參考,當初在架設 Kafka cluster 時寫下來的筆記:「Confluent」。

Amazon MQ 在 ap-northeast-{1,2} 推出了...

Amazon MQap-northeast-{1,2} 推出了,先前自己架的,現在可以直接拿現成的服務了:「Amazon MQ is Now Available in the Asia Pacific (Seoul) and Asia Pacific (Tokyo) regions」。

不過 AWS 上的開發主要還是以 SQS 之類的服務為主,可以避免 scalability 的問題 (另外一種可能是一開始就打定要搬出來,所以選擇 open protocol 的方案)。在這樣的前提下,Amazon MQ 的定位就變成將現有軟體丟上去跑... (而不想自己管 XD)

SQS 可以打進 Lambda 了...

在昨天的 AWS 台北高峰會上,AWS 的人有提到這個功能應該要正式推出了,果然在回來不久後就看到消息了:「AWS Lambda Adds Amazon Simple Queue Service to Supported Event Sources」。

We can now use Amazon Simple Queue Service (SQS) to trigger AWS Lambda functions! This is a stellar update with some key functionality that I’ve personally been looking forward to for more than 4 years. I know our customers are excited to take it for a spin so feel free to skip to the walk through section below if you don’t want a trip down memory lane.

這算是 Serverless 架構下很自然會想要做的一環,當 SQS 裡面有東西的時候就呼叫 Lambda 起來做事,以往一般會透過 SNS 在中間接起來 (或是拿 S3 惡搞,因為 S3 也可以串 Lambda...),現在可以直接串了:

By adding support for SQS to Lambda we’re removing a lot of the undifferentiated heavy lifting of running a polling service or creating an SQS to SNS mapping.

這個功能本身不收費,但他需要的 SQS API call 與產生的 Lambda 當然是需要收費的:

There are no additional charges for this feature, but because the Lambda service is continuously long-polling the SQS queue the account will be charged for those API calls at the standard SQS pricing rates.

AWS 推出用 ActiveMQ 架設的服務,Amazon MQ

AWSActiveMQ 包起來賣服務:「Amazon MQ – Managed Message Broker Service for ActiveMQ」。

在 AWS 上已經有 Amazon SQS 這類服務的情況下,應該還是因為 ActiveMQ 的生態更豐富,所以決定支援 ActiveMQ... 光是支援的通訊協定就比自家多很多,有很多應用可以直接接上去:

With Amazon MQ, you get direct access to the ActiveMQ console and industry standard APIs and protocols for messaging, including JMS, NMS, AMQP, STOMP, MQTT, and WebSocket.

不過支援的地區還是有限:

Amazon MQ is available now and you can start using it today in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (Frankfurt), and Asia Pacific (Sydney) Regions.

另外這個服務有提供 free tier,可以讓使用者測試:

The AWS Free Tier lets you use a single-AZ micro instance for up to 750 hours and to store up to 1 gigabyte each month, for one year. After that, billing is based on instance-hours and message storage, plus charges Internet data transfer if the broker is accessed from outside of AWS.

Amazon SQS 支援 FIFO 了

Amazon SQS 支援 FIFO 了:「FIFO (First-In-First-Out) Queues」。新的 FIFO Queue 有保證順序,但也因此效能上有限制:

In addition to having all the capabilities of the standard queue, FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated. FIFO queues also provide exactly-once processing but are limited to 300 transactions per second (TPS).

可以看到舊版的 FAQ 對於 FIFO 的回答是 Standard Queue 會盡力做到 FIFO,但不保證:(出自 2016/08/26 的版本)

Q: Does Amazon SQS provide first-in-first-out (FIFO) access to messages?

Amazon SQS provides a loose-FIFO capability that attempts to preserve the order of messages. However, we have designed Amazon SQS to be massively scalable using a distributed architecture. Thus, we can't guarantee that you will always receive messages in the exact order you sent them (FIFO).

If your system requires the order of messages to be preserved, place sequencing information in each message so that messages can be ordered when they are received.

而現在則是名正言順的說有提供 FIFO 了:

Q: Does Amazon SQS provide message ordering?

Yes. FIFO (first-in-first-out) queues preserve the exact order in which messages are sent and received. If you use a FIFO queue, you don't have to place sequencing information in your messages. For more information, see FIFO Queue Logic in the Amazon SQS Developer Guide.

Standard queues provide a loose-FIFO capability that attempts to preserve the order of messages. However, because standard queues are designed to be massively scalable using a highly distributed architecture, receiving messages in the exact order they are sent is not guaranteed.

Netflix 開發的 Delayed Queue

原來這個叫做 Delayed Queue,難怪之前用其他關鍵字都找不到什麼資料... (就不講其他關鍵字了 XD)

Netflix 發表了他們自己所開發的 Delayed Queue:「Distributed delay queues based on Dynomite」。

本來的架構是用 Cassandra + Zookeeper 來做:

Traditionally, we have been using a Cassandra based queue recipe along with Zookeeper for distributed locks, since Cassandra is the de facto storage engine at Netflix.

但可以馬上想到不少問題,就如同 Netflix 提到的:

Using Cassandra for queue like data structure is a known anti-pattern, also using a global lock on queue while polling, limits the amount of concurrency on the consumer side as the lock ensures only one consumer can poll from the queue at a time.

所以就改放到 Netflix 另外開發的 Dynamite 上:

Dynomite, inspired by Dynamo whitepaper, is a thin, distributed dynamo layer for different storage engines and protocols. Currently these include Redis and Memcached. Dynomite supports multi-datacenter replication and is designed for high availability.

後端是 RedisMemcached 的系統,可以對抗整個機房從 internet 上消失的狀態。

在設計上則是「保證會跑一次」,也就是有可能會有多次的情況,用 Dyno Queues 系統的人必需要考慮進去:

4. At-least-once delivery semantics

雖然整篇講的頗輕鬆,但實際看起來還是很厚重... 暫時還是不會用吧 :o