Apache Kafka is a pub-sub system:
Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.
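The folks at Yelp wanted to send a copy of every MySQL update to Kafka, which opens the door to a lot of applications. The first part of the article covers a fair amount of principle and theory, such as how MySQL replication works.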
But reading through the article, the real highlight for me was that it introduces the "noplay/python-mysql-replication" project on GitHub:
Our stream reader is an abstraction over the BinLogStreamReader from the python-mysql-replication package.
This project can parse the MySQL replication protocol:
Pure Python Implementation of MySQL replication protocol build on top of PyMYSQL. This allow you to receive event like insert, update, delete with their datas and raw SQL queries.
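It struck me right away that you could do quite a lot with this library, for example hooking it straight into a worker that then updates the data in Elasticsearch, which would guarantee 100% that no update gets missed...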
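
To make the worker idea a bit more concrete, here is a minimal sketch of how it might look. It is not what Yelp built. The `row_to_action` helper, the shape of the action dict, and the `id` primary-key column are all hypothetical names made up for illustration. Only `BinLogStreamReader` and the three row event classes come from python-mysql-replication, and the sketch assumes the server writes a row-based binlog.

```python
from typing import Any, Dict, Iterator


def row_to_action(kind: str, schema: str, table: str,
                  row: Dict[str, Any], pk: str = "id") -> Dict[str, Any]:
    """Turn one binlog row change into a plain dict describing an index action.

    `kind` is "insert", "update" or "delete". The dict layout and the `pk`
    column name are assumptions for this sketch, not part of any library.
    """
    # Update rows carry before/after images; insert and delete rows carry "values".
    values = row["after_values"] if kind == "update" else row["values"]
    action: Dict[str, Any] = {"index": f"{schema}.{table}", "id": values[pk]}
    if kind == "delete":
        action["op"] = "delete"
    else:
        action["op"] = "index"
        action["doc"] = dict(values)
    return action


def stream_actions(connection_settings: Dict[str, Any],
                   server_id: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield one action per changed row, read from the MySQL binlog."""
    # Imported lazily so row_to_action can be used without the package installed.
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent,
        UpdateRowsEvent,
        WriteRowsEvent,
    )

    kinds = {
        WriteRowsEvent: "insert",
        UpdateRowsEvent: "update",
        DeleteRowsEvent: "delete",
    }
    stream = BinLogStreamReader(
        connection_settings=connection_settings,  # host, port, user, passwd
        server_id=server_id,       # must not clash with other replicas
        only_events=list(kinds),   # skip everything except row changes
        blocking=True,             # keep waiting for new events
    )
    try:
        for event in stream:
            kind = kinds[type(event)]
            for row in event.rows:
                yield row_to_action(kind, event.schema, event.table, row)
    finally:
        stream.close()
```

A worker would loop over `stream_actions({...})` and hand each action to whatever Elasticsearch client it uses. Keeping the conversion in a separate pure function means it can be checked without a running MySQL server.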