How Yelp sends MySQL updates to Kafka

Apache Kafka is a pub-sub system:

Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.

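The folks at Yelp wanted to send a copy of every MySQL update to Kafka, which opens the door to a lot of applications. The first part of the article covers a good deal of principle and theory, such as how MySQL replication works.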

But reading through it, I found that the real highlight is that it introduces the "noplay/python-mysql-replication" project on GitHub:

Our stream reader is an abstraction over the BinLogStreamReader from the python-mysql-replication package.

This project can parse MySQL's replication protocol:

Pure Python Implementation of MySQL replication protocol build on top of PyMYSQL. This allow you to receive event like insert, update, delete with their datas and raw SQL queries.

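It immediately struck me that you could do quite a lot with this library, such as hooking it straight up to a worker that then updates the data in Elasticsearch, so you can be 100% sure no update gets missed...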
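To make the worker idea a bit more concrete, here is a minimal sketch, not Yelp's implementation. `BinLogStreamReader` and the row event classes come from python-mysql-replication. `row_to_action`, `follow_binlog`, the action dict layout, the `schema.table` index naming, and the `id` primary key are all hypothetical names I made up for illustration.

```python
from typing import Any, Dict, Iterator


def row_to_action(kind: str, schema: str, table: str,
                  row: Dict[str, Any], pk: str = "id") -> Dict[str, Any]:
    """Map one binlog row change to a (hypothetical) search-index action."""
    index = f"{schema}.{table}"  # assumed naming scheme, one index per table
    if kind == "delete":
        return {"op": "delete", "index": index, "id": row["values"][pk]}
    # Inserts carry "values"; updates carry "before_values" and "after_values".
    values = row["after_values"] if kind == "update" else row["values"]
    return {"op": "index", "index": index, "id": values[pk], "doc": values}


def follow_binlog(mysql_settings: Dict[str, Any],
                  server_id: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield one action per changed row; needs a MySQL server with row-based binlog."""
    # Imported lazily so row_to_action can be used without the library installed.
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent,
        UpdateRowsEvent,
        WriteRowsEvent,
    )

    kinds = {
        WriteRowsEvent: "insert",
        UpdateRowsEvent: "update",
        DeleteRowsEvent: "delete",
    }
    stream = BinLogStreamReader(
        connection_settings=mysql_settings,  # e.g. host, port, user, passwd
        server_id=server_id,                 # must be unique among replicas
        only_events=list(kinds),
        blocking=True,                       # keep waiting for new events
    )
    try:
        for event in stream:
            kind = next(name for cls, name in kinds.items()
                        if isinstance(event, cls))
            for row in event.rows:
                yield row_to_action(kind, event.schema, event.table, row)
    finally:
        stream.close()
```

A worker would loop over `follow_binlog(...)` and push each action to Elasticsearch. For the "no missed update" part to actually hold, I'd assume the worker also has to remember the binlog position it has processed and resume from there after a restart, which this sketch leaves out.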

How does a hotel get good reviews?

Why does this approach remind me of the team behind a certain Taipei mayoral candidate...

I saw this in "Hotel fines $500 for every bad review posted online": a way to keep your ratings high on Yelp:

The Union Street Guest House, near Catskills estates built by the Vanderbilts and Rockefellers, charges couples who book weddings at the venue $500 for every bad review posted online by their guests.

If you take down the nasty review, you’ll get your money back.

What a brilliant method XDDD

Ever since the story broke, the Yelp page "Union Street Guest House - Hudson, NY | Yelp" has been showing what this hotel is really like... Famous overnight XDDD