Home » Posts tagged "purge"

EnterpriseDB 打算推出的 zheap,想要解 VACUUM 問題...

前天被問到「DO or UNDO - there is no VACUUM」這篇,回家後仔細看一看再翻了一些資料,看起來是要往 InnoDB 的解法靠...

PostgreSQL 與 InnoDB 都是透過 MVCC 的概念實做 transaction 之間的互動,但兩者實際的作法不太一樣。其中帶來一個明顯的差異就是 PostgreSQL 需要 VACUUM。這點在同一篇作者八年前 (2011) 的文章就有提過兩者的差異以及優缺點:「MySQL vs. PostgreSQL, Part 2: VACUUM vs. Purge」。

UPDATE 時,InnoDB 會把新資料寫到表格內,然後把可能會被 rollback 的舊資料放到表格外:

In InnoDB, only the most recent version of an updated row is retained in the table itself. Old versions of updated rows are moved to the rollback segment, while deleted row versions are left in place and marked for future cleanup. Thus, purge must get rid of any deleted rows from the table itself, and clear out any old versions of updated rows from the rollback segment.

而被 DELETE 清除的資料則是由 purge thread 處理:

All the information necessary to find the deleted records that might need to be purged is also written to the rollback segment, so it's quite easy to find the rows that need to be cleaned out; and the old versions of the updated records are all in the rollback segment itself, so those are easy to find, too.

所以可以在 InnoDB 看到 purge thread 相關的設定:「MySQL :: MySQL 5.7 Reference Manual :: 14.6.11 Configuring InnoDB Purge Scheduling」,負責處理這些東西。

而在 PostgreSQL 的作法則是反過來,舊的資料放在原來地方,新資料另外存:

PostgreSQL takes a completely different approach. There is no rollback tablespace, or anything similar. When a row is updated, the old version is left in place; the new version is simply written into the table along with it.

新舊資料的位置其實還好,主要是因為沒有類似的地方可以記錄哪些要清:

Lacking a centralized record of what must be purged, PostgreSQL's VACUUM has historically needed to scan the entire table to look for records that might require cleanup.

這也使得 PostgreSQL 裡需要 autovacuum 之類的程序去掃,或是手動跑 vacuum。而在去年 (2017) 的文章裡也有提到目前還是類似的情況:「MVCC and VACUUM」。

而在今年 (2018) 的文章裡,EnterpriseDB 就提出了 zheap 的想法,在 UPDATE 時寫到 table 裡,把可能被 rollback 的資料放到 undo log 裡。其實就是把 InnoDB 那套方法拿過來用,只是整篇都沒提到而已 XD:

That brings me to the design which EnterpriseDB is proposing. We are working to build a new table storage format for PostgreSQL, which we’re calling zheap. In a zheap, whenever possible, we handle an UPDATE by moving the old row version to an undo log, and putting the new row version in the place previously occupied by the old one. If the transaction aborts, we retrieve the old row version from undo and put it back in the original location; if a concurrent transaction needs to see the old row version, it can find it in undo. Of course, this doesn’t work when the block is full and the row is getting wider, and there are some other problem cases as well, but it covers many useful cases. In the typical case, therefore, even bulk updates do not force a zheap to grow. Instead, the undo grows. When a transaction commits, all row versions that will become dead are in the undo, not the zheap.

不過馬上就會想到問題,如果要改善問題,不是個找地方記錄哪些位置要回收就好了嗎?順便改變方法是為了避免 fragment 嗎?

等著看之後變成什麼樣子吧...

小台機器上的 innodb_purge_threads 對效能的影響

雖然「MyISAM, small servers and sysbench at low concurrency」這篇標題是在講 MySQL 上的 MyISAM,但還是有提到一些 InnoDB 的東西...

其中提到了 innodb_purge_threads 對效能的影響:

the default value for innodb_purge_threads, which is 4, can cause too much mutex contention and a loss in QPS on small servers. For sysbench update-only I lose 25% of updates/second with 5.7.17 and 15% with 8.0.1 when going from innodb_purge_threads=1 to =4.

當機器不大的時候,innodb_purge_threads 對於效能帶來的影響其實頗大的?

另外從作者最近的一系列測試看起來,5.7 在小機器的效能比 5.6 差不少... 這點在考慮 RDS 的時候也許要注意 (因為 t2.* 應該不算大 XD)。

CloudFlare 推出 tag-based purge 功能

CloudFlare 推出這個功能很棒啊,不過這後面的資料結構必須設計的夠好才能這樣玩:「Introducing a Powerful Way to Purge Cache on CloudFlare: Purge by Cache-Tag」。

cache purge 一直都是 CDN 的痛處,用過的每一家 CDN 都在比慢的,大概都是 10 mins 起跳,但 CloudFlare 在這塊花了不少功夫:

可以想像到的方式是放入 user_gslin (在使用者停用時可以馬上更新) 或是 pic_id_1234567890 (可以針對各種縮圖一次刷新) 這樣的 tag 去歸類,然後就可以大規模刷...

Varnish 的 Super Fast Purger...

Reverse Proxy 的 Cache Infrastructure 在遇到 cache invalidate 都是很討厭的問題,不是不能做,而是效能不太好... 常見的作法是設計成不用 purge 的形式,只要是需要更新,就產生不同的 url,而舊的 url 在沒人 access 後會透過各種 Cache algorithms 自動回收掉,像是 LRU (Least Recently Used) 之類的演算法。

發展久了之後也因此衍伸出很多不同的架構,像是 groupcache 就是假設在同一個 address 的內容永遠不會變的前提。

Varnish Cache 這次發表的東西則是打算從根本問題解決,也就是想辦法讓 purge (cache invalidate) 的成本降低:「Simple scales better and faster in the real world」、「VAC 2.0.3 with high performance cache invalidation API (aka the Super Fast Purger)」。

官方的說法,在大台機器上可以到 60k reqs/sec:

Kristian nonchalantly mentioned that the Super Fast Purger did 60,000 requests per second, on a 6 core Xeon with 36GB memory, traffic over a gigabit network to a single Varnish Cache server, with httperf as test client. But we believe the Super Fast Purger can do a lot more with a little love and tuning.

Squid 效能不好,ATS 的文件很傷,是該找時間來測試看看...

Archives