PostgreSQL 對 fsync() 的修正

上次寫了「PostgreSQL 對 fsync() 的行為傷腦筋...」提到 fsync() 有些地方是與開發者預期不同的問題,但後面忘記跟進度...

剛剛看到 Percona 的人寫了「PostgreSQL fsync Failure Fixed – Minor Versions Released Feb 14, 2019」這篇才發現在 2/14 就出了對應的更新,從 release notes 也可以看到:

By default, panic instead of retrying after fsync() failure, to avoid possible data corruption (Craig Ringer, Thomas Munro)

Some popular operating systems discard kernel data buffers when unable to write them out, reporting this as fsync() failure. If we reissue the fsync() request it will succeed, but in fact the data has been lost, so continuing risks database corruption. By raising a panic condition instead, we can replay from WAL, which may contain the only remaining copy of the data in such a situation. While this is surely ugly and inefficient, there are few alternatives, and fortunately the case happens very rarely.

A new server parameter data_sync_retry has been added to control this; if you are certain that your kernel does not discard dirty data buffers in such scenarios, you can set data_sync_retry to on to restore the old behavior.

現在的 workaround 是遇到 fsync() 失敗時為了避免 data corruption,會直接 panic 讓整個 PostgreSQL 從 WAL replay 記錄,也代表 HA 機制 (如果有設計的話) 有機會因為這個原因被觸發...

不過也另外設計了 data_sync_retry,讓 PostgreSQL 的管理者可以硬把這個 panic 行為關掉,改讓 PostgreSQL 重新試著 fsync(),這應該是在之後 kernel 有修改時會用到...

Leave a Reply

Your email address will not be published.