GitHub 重新定位 Redis 的功能...

GitHub Engineering 說明了他們為什麼改變 Redis 的使用情境:「Moving persistent data out of Redis」。

GitHub 裡面,Redis 有兩種不同的情境,一種叫做 transient Redis,只用做 cache:

We used it as an LRU cache to conveniently store the results of expensive computations over data originally persisted in Git repositories or MySQL. We call this transient Redis.

另外一種則是打開 persistence 功能,叫做 persistent Redis:

We also enabled persistence, which gave us durability guarantees over data that was not stored anywhere else. We used it to store a wide range of values: from sparse data with high read/write ratios, like configuration settings, counters, or quality metrics, to very dynamic information powering core features like spam analysis. We call this persistent Redis.

這邊講的是 persistent Redis 被換成用 MySQL (InnoDB) 儲存:

Recently we made the decision to disable persistence in Redis and stop using it as a source of truth for our data. The main motivations behind this choice were to:

  • Reduce the operational cost of our persistence infrastructure by removing some of its complexity.
  • Take advantage of our expertise operating MySQL.
  • Gain some extra performance, by eliminating the I/O latency during the process of writing big changes on the server state to disk.

For the majority of callsites, we replaced persistent Redis with GitHub::KV, a MySQL key/value store of our own built atop InnoDB, with features like key expiration. We were able to use GitHub::KV almost identically as we used Redis: from trending repositories and users for the explore page, to rate limiting to spammy user detection.

後面講了不少轉換的過程 (還包含了某些功能的改寫),但沒有講的太清楚為什麼不繼續使用 Redis。

目前只能就提到的三點問題來看,persistent 的 i/o 成本可能太高?而且難以再壓榨效能出來?而相反的,InnoDB 已經花了很多力氣在上面,直接拿來用反而可以解決問題?

不過看得出來這個轉換還是花了不少力氣,看得出來有些 application 使用 Redis 的模式不能直接搬到 InnoDB 上,花了時間改寫...

2 thoughts on “GitHub 重新定位 Redis 的功能...”

  1. 寫備份檔這個動作會卡 query,等於是 redis 會不定時中斷一下,而且用量越大越容易踩到。

    主打 in memory 的 redis 效能會被硬碟速度卡住總覺得哪裡怪怪的…

Leave a Reply

Your email address will not be published. Required fields are marked *