llama.cpp 的載入速度加速

Hacker News 上看到「Llama.cpp 30B runs with only 6GB of RAM now (github.com/ggerganov)」這個消息,原 pull request 在「Make loading weights 10-100x faster #613」這邊。

這個 PR 的作者 Justine Tunney 在 PR 上有提到他改變 model 檔案格式,以便改用 mmap(),大幅降低了需要預先讀取的時間 (因為變成 lazy-loading style),而且這也讓系統可以利用 cache page,避免了 double buffering 的問題:

This was accomplished by changing the file format so we can mmap() weights directly into memory without having to read() or copy them thereby ensuring the kernel can make its file cache pages directly accessible to our inference processes; and secondly, that the file cache pages are much less likely to get evicted (which would force loads to hit disk) because they're no longer competing with memory pages that were needlessly created by gigabytes of standard i/o.

這讓我想到在資料庫領域中,PostgreSQL 也會用 mmap() 操作,有點類似的概念。

另外 Justine Tunney 在這邊的 comment 有提到一個意外觀察到的現象,他發現實際在計算的時候用到的 model 內容意外的少:他用一個簡單的 prompt 測試,發現 20GB 的 30B model 檔案在他的 Intel 機器上實際只用到了 1.6GB 左右:

If I run 30B on my Intel machine:


As we can see, 400k page faults happen, which means only 1.6 gigabytes ((411522 * 4096) / (1024 * 1024)) of the 20 gigabyte weights file actually needed to be used.

這點他還在懷疑是不是他的修改有 bug,但目前他覺得不太像,也看不太出來:

Now, since my change is so new, it's possible my theory is wrong and this is just a bug. I don't actually understand the inner workings of LLaMA 30B well enough to know why it's sparse. Maybe we made some kind of rare mistake where llama.cpp is somehow evaluating 30B as though it were the 7B model. Anything's possible, however I don't think it's likely. I was pretty careful in writing this change, to compare the deterministic output of the LLaMA model, before and after the Git commit occurred. I haven't however actually found the time to reconcile the output of LLaMA C++ with something like PyTorch. It'd be great if someone could help with that, and possibly help us know why, from more a data science (rather than systems engineering perspective) why 30B is sparse.

如果不是 bug 的話,這其實冒出了一個很有趣的訊號,表示這些 model 是有可能再瘦身的?

更新 Sudo (CVE-2021-3156)

Sudo 這次的安全性漏洞頗痛的:「CVE-2021-3156: Heap-Based Buffer Overflow in Sudo (Baron Samedit)」。

依照 Sudo Security Alerts 這邊的說明,這次的漏洞只要是本機有執行權限的人都有機會打穿,不需要有 sudo 帳號權限:

A potential security issue exists in sudo that could be used by a local user to gain root privileges even when not listed in the sudoers file. Affected sudo versions are 1.8.2 through 1.8.31p2 and 1.9.0 through 1.9.5p1. Sudo 1.9.5p2 and above are not affected.

不限於本來就有 sudo 帳號 (但可能可執行指令受限) 就比較麻煩了,這代表從 web 之類的管道打進去以後 (可能是 www-data 身份),就可以用這個洞取得 root 權限,另外一條路是透過資料庫打進去後 (像是 mysql 身份) 取得。


繞過 Screensaver Lock 的有趣話題...

Hacker News Daily 上看到「Screensaver lock by-pass via the virtual keyboard」這篇,裡面這邊題到了 screensaver lock 的有趣話題。

先講嚴肅一點的,這個 bug 被編號為 CVE-2020-25712,問題出在 xorg-x11-server 上:

A flaw was found in xorg-x11-server before 1.20.10. A heap-buffer overflow in XkbSetDeviceInfo may lead to a privilege escalation vulnerability. The highest threat from this vulnerability is to data confidentiality and integrity as well as system availability.

比較有趣的事情是,這個 bug 是小朋友在亂玩時拉出 virtual keyboard 觸發的:

A few weeks ago, my kids wanted to hack my linux desktop, so they typed and clicked everywhere, while I was standing behind them looking at them play... when the screensaver core dumped and they actually hacked their way in! wow, those little hackers...


I tried to recreate the crash on my own with no success, maybe because it required more than 4 little hands typing and using the mouse on the virtual keyboard.

另外一個人也說他家小朋友也弄出 segfault 了:

My kids came upon a similar cinnamon-screensaver segfault! I've emailed details of how to reproduce the problem to root@linuxmint.com.

小朋友超強 XDDD

最近 OpenVPN 的安全性漏洞...

看到「The OpenVPN post-audit bug bonanza」這個只有苦笑啊...

作者在 OpenVPN 經過一連串的安全加強後 (包括 harden 計畫與兩個外部單位的程式碼稽核找到不少問題),決定出手挖看看:

After a hardening of the OpenVPN code (as commissioned by the Dutch intelligence service AIVD) and two recent audits 1 2, I thought it was now time for some real action ;).


可以看到作者透過 fuzzing 打出一卡車,包含了不少 crash XDDD:(然後有一個是 stack buffer corruption,不知道有沒有機會變成 RCE)

  • Remote server crashes/double-free/memory leaks in certificate processing (CVE-2017-7521)
  • Remote (including MITM) client crash, data leak (CVE-2017-7520)
  • Remote (including MITM) client stack buffer corruption
  • Remote server crash (forced assertion failure) (CVE-2017-7508)
  • Crash mbed TLS/PolarSSL-based server (CVE-2017-7522)
  • Stack buffer overflow if long –tls-cipher is given

iOS 透過無線網路的 RCE...

在「About the security content of iOS 10.3.1」這邊的說明:

Available for: iPhone 5 and later, iPad 4th generation and later, iPod touch 6th generation and later
Impact: An attacker within range may be able to execute arbitrary code on the Wi-Fi chip
Description: A stack buffer overflow was addressed through improved input validation.
CVE-2017-6975: Gal Beniamini of Google Project Zero


InnoDB 的 buffer pool preload 功能

Percona 的人討論了 InnoDB 提供的 buffer pool preload 功能:「Using the InnoDB Buffer Pool Pre-Load Feature in MySQL 5.7」。

就如同他所講的,因為硬體設備的進步 (主要是 SSD 的興起),而導致 preload 的需求已經沒以前重要了:

Frankly, time has reduced the need for this feature. Five years ago, we would typically store databases on spinning disks. These disks often took quite a long time to warm up with normal database workloads, which could lead to many hours of poor performance after a restart. With the rise of SSDs, warm up happens faster and reduces the penalty from not having data in the buffer pool.

由於 SSD 的 random read 很快,反而可以直接推上線讓他邊跑服務邊 warm up。不過相對的,傳統硬碟的 InnoDB database 還是可以規劃需求,畢竟 random read 還是痛點...

MySQL 5.6 到 5.7 改變的預設值

Percona 整理了一份 MySQL 5.6 到 5.7 改變的預設值,對於評估與轉移的人都很有用:「MySQL Default Configuration Changes between 5.6 and 5.7」。

sync_binlog 居然從 0 改成 1 了,這對效能的影響應該不少。

performance_schema_* 有不少改成自動調整了,可以省下不少功夫。

innodb_buffer_pool_dump_at_shutdowninnodb_buffer_pool_load_at_startup 都打開了,這避免了正常重啟時的 warm up 問題,不過在存在有效的手段可以手動 warm up 的時,應該還是會關掉吧。(參考 2013 的文章「熱 MySQL InnoDB 的方式...」)

另外介紹了 InnoDB 預設格式的改變,這點到是因為使用 COMPRESSED,反而不太受到影響。

CVE-2015-7547:getaddrinfo() 的 RCE (Remote Code Execution) 慘案

Google 寫了一篇關於 CVE-2015-7547 的安全性問題:「CVE-2015-7547: glibc getaddrinfo stack-based buffer overflow」。

Google 的工程師在找 OpenSSH 連到某台特定主機就會 segfault 的通靈過程中,發現問題不在 OpenSSH,而是在更底層的 glibc 導致 segfault:

Recently a Google engineer noticed that their SSH client segfaulted every time they tried to connect to a specific host. That engineer filed a ticket to investigate the behavior and after an intense investigation we discovered the issue lay in glibc and not in SSH as we were expecting.

由於等級到了 glibc 這種每台 Linux 都有裝的情況,在不經意的情況下發生 segfault,表示在刻意攻擊的情況下可能會很糟糕,所以 Google 投入了人力研究,想知道這個漏洞到底可以做到什麼程度:

Thanks to this engineer’s keen observation, we were able determine that the issue could result in remote code execution. We immediately began an in-depth analysis of the issue to determine whether it could be exploited, and possible fixes. We saw this as a challenge, and after some intense hacking sessions, we were able to craft a full working exploit!

在研究過程中 Google 發現 Red Hat 的人也在研究同樣的問題:「(CVE-2015-7547) - In send_dg, the recvfrom function is NOT always using the buffer size of a newly created buffer (CVE-2015-7547)」:

In the course of our investigation, and to our surprise, we learned that the glibc maintainers had previously been alerted of the issue via their bug tracker in July, 2015. (bug). We couldn't immediately tell whether the bug fix was underway, so we worked hard to make sure we understood the issue and then reached out to the glibc maintainers. To our delight, Florian Weimer and Carlos O’Donell of Red Hat had also been studying the bug’s impact, albeit completely independently! Due to the sensitive nature of the issue, the investigation, patch creation, and regression tests performed primarily by Florian and Carlos had continued “off-bug.”

攻擊本身需要繞過反制機制 (像是 ASLR),但仍然是可行的,Google 的人已經成功寫出 exploit code:

Remote code execution is possible, but not straightforward. It requires bypassing the security mitigations present on the system, such as ASLR. We will not release our exploit code, but a non-weaponized Proof of Concept has been made available simultaneously with this blog post.

技術細節在 Google 的文章裡也有提到,buffer 大小固定為 2048 bytes,但取得時有可能超過 2048 bytes,於是造成 buffer overflow:

glibc reserves 2048 bytes in the stack through alloca() for the DNS answer at _nss_dns_gethostbyname4_r() for hosting responses to a DNS query.

Later on, at send_dg() and send_vc(), if the response is larger than 2048 bytes, a new buffer is allocated from the heap and all the information (buffer pointer, new buffer size and response size) is updated.

另外 glibc 官方的 mailing list 上也有說明:「[PATCH] CVE-2015-7547 --- glibc getaddrinfo() stack-based buffer overflow」。

libpng 漏洞...

libpng 的安全性問題,CVE-2015-8126

Multiple buffer overflows in the (1) png_set_PLTE and (2) png_get_PLTE functions in libpng before 1.0.64, 1.1.x and 1.2.x before 1.2.54, 1.3.x and 1.4.x before 1.4.17, 1.5.x before 1.5.24, and 1.6.x before 1.6.19 allow remote attackers to cause a denial of service (application crash) or possibly have unspecified other impact via a small bit-depth value in an IHDR (aka image header) chunk in a PNG image.


Berkeley DB 的介紹

在滿滿都是 NoSQL 的世代中,意外在「Berkeley DB: Architecture」這邊看到 Berkeley DB 的介紹...

2006 年 Berkeley DB 的公司 SleepycatOracle 收購。在收購後 Oracle 改變了 open source 授權部份,從之前的 Sleepycat License 改成了 AGPLv3

Berkeley DB 算是早期功能很完整的 database library,由於 page level locking、crash-safe 加上有 transaction,也曾經被 MySQL 拿去當作 engine,不過在 MySQL 5.1 被拔掉:「14.5 The BDB (BerkeleyDB) Storage Engine」。

文章裡講了很多底層設計上的想法 (而非單純只說明「做了什麼」),以四個面向來討論。Buffer、Lock、Log 以及 Transaction,並且圍繞著 ACID 需求討論。

算是懷念的考古文?Google 弄出來的 LevelDBFacebook 接著改善的 RocksDB 的走向也不太一樣了,現在大家對 ACID 需求因為 NoSQL 盛行的關係又重新在檢視...