I came across "The Dirty Pipe Vulnerability", a security issue in the Linux kernel, on Hacker News; the related discussion there is at "The Dirty Pipe Vulnerability (cm4all.com)".
The culprit this time is splice(). Let's start with the code the author wrote to reproduce the bug. The first program is left running as user1:
    #include <unistd.h>
    int main(int argc, char **argv) {
      for (;;) write(1, "AAAAA", 5);
    }
    // ./writer >foo
Then a second program is left running as well (it can run as a different user, user2, with no access to anything user1 owns):
    #define _GNU_SOURCE
    #include <unistd.h>
    #include <fcntl.h>
    int main(int argc, char **argv) {
      for (;;) {
        splice(0, 0, 1, 0, 2, 0);
        write(1, "BBBBB", 5);
      }
    }
    // ./splicer <foo |cat >/dev/null
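In theory the string BBBBB should never show up inside foo, since the second program only reads from it, yet it does... Using git bisect, he also confirmed that the problem first appears with the commit "pipe: merge anon_pipe_buf*_ops".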
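To tell whether the two programs above actually hit the bug, one could scan foo for any byte the writer never produced. The following is only a minimal sketch of such a check: `find_foreign_byte` and `scan_file` are hypothetical helper names that do not appear in the original write-up, and the 4096-byte chunk size is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original write-up): return the offset
 * of the first byte in buf that differs from `expected`, or -1 if every
 * byte matches. */
long find_foreign_byte(const unsigned char *buf, size_t len,
                       unsigned char expected)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != expected)
            return (long)i;
    }
    return -1;
}

/* Scan a whole file in chunks. Returns the absolute offset of the first
 * byte that is not `expected`, -1 if the file is clean, -2 on I/O error. */
long scan_file(const char *path, unsigned char expected)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -2;

    unsigned char chunk[4096];
    long base = 0;
    long result = -1;
    size_t n;
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        long hit = find_foreign_byte(chunk, n, expected);
        if (hit >= 0) {
            result = base + hit;
            break;
        }
        base += (long)n;
    }
    if (result == -1 && ferror(fp))
        result = -2;
    fclose(fp);
    return result;
}
```

Calling `scan_file("foo", 'A')` periodically while both programs run should keep returning -1 on a fixed kernel; on an affected kernel one would expect it to eventually return the offset of a stray `B`.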
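Tracking the problem down took quite a long time, though. It started with a support ticket at a web hosting service: a customer reported that the access logs they had downloaded were corrupt and could not be decompressed: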
It all started a year ago with a support ticket about corrupt files. A customer complained that the access logs they downloaded could not be decompressed. And indeed, there was a corrupt log file on one of the log servers; it could be decompressed, but gzip reported a CRC error.
He fixed it by hand and closed the ticket:
I fixed the file’s CRC manually, closed the ticket, and soon forgot about the problem.
A few months later it happened again, and after several more support tickets he had some "data" to look at:
Months later, this happened again and yet again. Every time, the file’s contents looked correct, only the CRC at the end of the file was wrong. Now, with several corrupt files, I was able to dig deeper and found a surprising kind of corruption. A pattern emerged.
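For context on what "only the CRC at the end of the file was wrong" means: per the gzip file format (RFC 1952), a gzip member ends with an 8-byte trailer holding the CRC-32 of the uncompressed data followed by its size modulo 2^32, both little-endian. The sketch below compares that stored CRC against one computed from data that has already been decompressed by some other means; it does not implement decompression, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 as used by gzip (reflected polynomial 0xEDB88320). */
uint32_t crc32_bytes(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

/* Read the CRC-32 stored in the last 8 bytes of a gzip file:
 * 4 bytes CRC-32, then 4 bytes ISIZE, both little-endian. */
uint32_t gzip_trailer_crc(const unsigned char *file, size_t file_len)
{
    assert(file_len >= 8);
    const unsigned char *t = file + file_len - 8;
    return (uint32_t)t[0] | ((uint32_t)t[1] << 8) |
           ((uint32_t)t[2] << 16) | ((uint32_t)t[3] << 24);
}

/* Returns 1 if the trailer CRC matches the CRC of the decompressed data.
 * `plain` is assumed to have been produced by a real decompressor. */
int trailer_matches(const unsigned char *gz, size_t gz_len,
                    const unsigned char *plain, size_t plain_len)
{
    return gzip_trailer_crc(gz, gz_len) == crc32_bytes(plain, plain_len);
}
```

A file whose contents decompress correctly but whose trailer was overwritten would make `trailer_matches` return 0, which matches the symptom described in the quote.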
Because it happened so rarely, and the reasoning kept running into a dead end, he could not spend much time on it:
None of this made sense, but new support tickets kept coming in (at a very slow rate). There was some systematic problem, but I just couldn’t get a grip on it. That gave me a lot of frustration, but I was busy with other tasks, and I kept pushing this file corruption problem to the back of my queue.
Later he did put real time into it, scanning the whole disk for corrupt files, and found a regularity:
External pressure brought this problem back into my consciousness. I scanned the whole hard disk for corrupt files (which took two days), hoping for more patterns to emerge. And indeed, there was a pattern:
- there were 37 corrupt files within the past 3 months
- they occurred on 22 unique days
- 18 of those days have 1 corruption
- 1 day has 2 corruptions (2021-11-21)
- 1 day has 7 corruptions (2021-11-30)
- 1 day has 6 corruptions (2021-12-31)
- 1 day has 4 corruptions (2022-01-31)
The last day of each month is clearly the one which most corruptions occur.
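He then tried writing all sorts of reproducers, and the version that finally worked is the one shown at the top. At that point he realized the bug could be a security vulnerability and reported it. From the first support ticket to the final fix took almost a year, but the Linux kernel side moved quickly: stable releases carrying the fix came out three days after the report: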
- 2021-04-29: first support ticket about file corruption
- 2022-02-19: file corruption problem identified as Linux kernel bug, which turned out to be an exploitable vulnerability
- 2022-02-20: bug report, exploit and patch sent to the Linux kernel security team
- 2022-02-21: bug reproduced on Google Pixel 6; bug report sent to the Android Security Team
- 2022-02-21: patch sent to LKML (without vulnerability details) as suggested by Linus Torvalds, Willy Tarreau and Al Viro
- 2022-02-23: Linux stable releases with my bug fix (5.16.11, 5.15.25, 5.10.102)
- 2022-02-24: Google merges my bug fix into the Android kernel
- 2022-02-28: notified the linux-distros mailing list
- 2022-03-07: public disclosure
The whole story is quite a ride XD