linux – Page 18 – Gea-Suan Lin's BLOG

CVE-2015-7547: the getaddrinfo() RCE (Remote Code Execution) disaster

Google published a write-up on the CVE-2015-7547 security issue: "CVE-2015-7547: glibc getaddrinfo stack-based buffer overflow".

While chasing down why OpenSSH segfaulted every time it connected to one specific host, Google's engineers found that the problem was not in OpenSSH but in glibc, one layer below:

Recently a Google engineer noticed that their SSH client segfaulted every time they tried to connect to a specific host. That engineer filed a ticket to investigate the behavior and after an intense investigation we discovered the issue lay in glibc and not in SSH as we were expecting.

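glibc is installed on every Linux machine, so a segfault that shows up by accident suggests that a deliberate attack could be much worse. Google therefore put people on it to find out how far the vulnerability could be pushed: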

Thanks to this engineer’s keen observation, we were able determine that the issue could result in remote code execution. We immediately began an in-depth analysis of the issue to determine whether it could be exploited, and possible fixes. We saw this as a challenge, and after some intense hacking sessions, we were able to craft a full working exploit!

During the investigation Google found that people at Red Hat were studying the same problem: "(CVE-2015-7547) - In send_dg, the recvfrom function is NOT always using the buffer size of a newly created buffer (CVE-2015-7547)":

In the course of our investigation, and to our surprise, we learned that the glibc maintainers had previously been alerted of the issue via their bug tracker in July, 2015. (bug). We couldn't immediately tell whether the bug fix was underway, so we worked hard to make sure we understood the issue and then reached out to the glibc maintainers. To our delight, Florian Weimer and Carlos O’Donell of Red Hat had also been studying the bug’s impact, albeit completely independently! Due to the sensitive nature of the issue, the investigation, patch creation, and regression tests performed primarily by Florian and Carlos had continued “off-bug.”

The attack has to bypass mitigations such as ASLR, but it is still feasible, and Google's people have already written working exploit code:

Remote code execution is possible, but not straightforward. It requires bypassing the security mitigations present on the system, such as ASLR. We will not release our exploit code, but a non-weaponized Proof of Concept has been made available simultaneously with this blog post.

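Google's post also covers the technical details: the stack buffer is fixed at 2048 bytes, but a DNS response can be larger than 2048 bytes, and that mismatch is what leads to the buffer overflow: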

glibc reserves 2048 bytes in the stack through alloca() for the DNS answer at _nss_dns_gethostbyname4_r() for hosting responses to a DNS query.

Later on, at send_dg() and send_vc(), if the response is larger than 2048 bytes, a new buffer is allocated from the heap and all the information (buffer pointer, new buffer size and response size) is updated.

There is also an explanation on the official glibc mailing list: "[PATCH] CVE-2015-7547 --- glibc getaddrinfo() stack-based buffer overflow".
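To get a rough idea of whether a machine is in scope, here is a minimal shell sketch. It assumes the range given in the advisories (upstream glibc 2.9 and later affected, fixed in 2.23); the function names are my own. Distributions backport the fix without changing the upstream version number, so treat the result only as a hint and check your distribution's security notice for the real answer.

```shell
#!/bin/sh
# Rough check only: distro packages may carry a backported fix while
# still reporting an "affected" upstream version number.

# Succeeds (exit 0) if the upstream version in $1 (e.g. "2.19") falls in
# the range 2.9 <= version < 2.23. Function name is hypothetical.
glibc_in_affected_range() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%.*}           # second version component
    minor=${minor%%[!0-9]*}     # drop any non-numeric suffix
    [ "$major" = "2" ] && [ -n "$minor" ] &&
        [ "$minor" -ge 9 ] && [ "$minor" -lt 23 ]
}

# Prints the running glibc version; getconf prints e.g. "glibc 2.19".
current_glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}
```

Usage would be something like `glibc_in_affected_range "$(current_glibc_version)" && echo "check for a patched package"`.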

Trying out LXD

LXD is the container system promoted by Canonical (the company behind Ubuntu). The article "Super Fast Local Workloads With LXD, ZFS, and Juju" is mostly about ZFS and Juju, but the LXD parts still contain information you can use directly.

First install LXD; I used the version from ppa:ubuntu-lxc/stable. After installing lxd, follow the article and run:

$ newgrp lxd
$ lxd init

Since ZFS isn't installed, the dir backend is good enough. For networking I just answered no to get past the prompt, since NAT works anyway... Then pull an image:

$ lxd-images import ubuntu trusty amd64 --sync --alias ubuntu-trusty

Once the import finishes, you can start a container:

$ lxc launch ubuntu-trusty test
$ lxc exec test /bin/bash

Running lxc with no arguments also prints some help. Anyone who has used Docker should have no trouble with it; it's quite simple.

A disaster caused by a bug that skipped TCP checksum verification

Engineers at Twitter worked hard chasing a spooky, hard-to-explain problem and eventually found that a kernel bug made veth skip TCP checksum verification: "Linux kernel bug delivers corrupt TCP/IP data to Mesos, Kubernetes, Docker containers".

Over at PagerDuty they ran into a similar problem in May 2015, but at the time they apparently did not find the root cause and only offered a workaround to avoid it: "The Discovery of Apache ZooKeeper’s Poison Packet".

The bug has been patched, and the fix has also been backported to older stable kernels:

I’m really impressed with the linux netdev group and kernel maintainers in general; code reviews were quite prompt and our patch was merged in within a few weeks, and was back-ported to older (3.14+) -stable queues on various kernel distributions (Canonical, Suse) within a month.

The middle of the article describes how they hunted the bug, and you can see it was intuition and guesswork most of the way...
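As a reminder of what was being skipped, here is a small shell sketch of the Internet checksum described in RFC 1071 (16-bit one's-complement sum, then complemented). It is only an illustration: it takes ready-made 16-bit hex words, leaves out the TCP pseudo-header, and says nothing about how the kernel implements the check. The function name is my own.

```shell
#!/bin/sh
# Illustration of the RFC 1071 Internet checksum over 16-bit hex words.
# inet_checksum is a hypothetical helper name; the TCP pseudo-header is
# deliberately left out to keep the example short.
inet_checksum() {
    sum=0
    for word in "$@"; do
        sum=$(( sum + 0x$word ))    # add each 16-bit word
    done
    # Fold any carry above 16 bits back into the low 16 bits.
    while [ $(( sum >> 16 )) -ne 0 ]; do
        sum=$(( (sum & 0xFFFF) + (sum >> 16) ))
    done
    # The checksum is the one's complement of the folded sum.
    printf '%04x\n' $(( ~sum & 0xFFFF ))
}
```

A sender would run `inet_checksum 0001 f203 f4f5 f6f7` and store the result in the header. A receiver runs the same function over the data plus the stored checksum and expects `0000`; any other value means the segment was corrupted and should be dropped, which is the step that was not happening here.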

Docker plans to switch its Official Images from Ubuntu to Alpine Linux

I saw the news in "Docker Official Images are Moving to Alpine Linux", which cites a Hacker News thread, "CoreOS Overview, Part One (deis.com)", where shykes says:

Disclaimer: I work at Docker.

Incidentally, we have hired Natanael Copa, the awesome creator of Alpine Linux and are in the process of switching the Docker official image library from ubuntu to Alpine. You can help us with pull requests to https://github.com/docker-library if you want :)

So Docker has hired the creator of Alpine and has started switching the Ubuntu-based images it uses to Alpine-based ones. More news (and more official news) should follow...

Google's load balancer: Seesaw

I slept too much over the past few days because of the flu, so now I'm catching up on some articles.

Last week Google released a load balancer written in Go called Seesaw: "Seesaw: scalable and robust load balancing".

The more interesting part is its BGP and anycast VIP support:

Seesaw v2 provides full support for anycast VIPs - that is, it will advertise an anycast VIP when it becomes available and will withdraw the anycast VIP if it becomes unavailable.

At Google's scale, they are playing a different kind of game...

A very small Docker image

I saw the "Super small Docker image based on Alpine Linux" project on Hacker News Daily. It looks like a BusyBox-style project, but with considerably better package support than other BusyBox-based projects.

It is based on Alpine Linux:

Alpine Linux is an independent, non-commercial, general purpose Linux distribution designed for power users who appreciate security, simplicity and resource efficiency.

musl libc and BusyBox:

Alpine Linux is built around musl libc and busybox. This makes it smaller and more resource efficient than traditional GNU/Linux distributions. A container requires no more than 8 MB and a minimal installation to disk requires around 130 MB of storage. Not only do you get a fully-fledged Linux environment but a large selection of packages from the repository.

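It's worth playing with, but under normal circumstances I would probably still use Ubuntu or Debian, since the environment is much more standard. (No need to spend your own time tracking down problems.)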

AWS EC2 Run Command now supports Linux instances

Another feature AWS shipped just before the holidays: "EC2 Run Command Update – Now Available for Linux Instances".

It originally worked only on Windows instances; now it supports Linux instances as well:

When we launched EC2 Run Command seven weeks ago (see my post, New EC2 Run Command – Remote Instance Management at Scale to learn more), I promised similar functionality for instances that run Linux. I am happy to be able to report that this functionality is available now and that you can start using it today.

The difference from a plain SSH login is that this feature can be audited through AWS's IAM system, and every command issued through it is logged by AWS.

Google Chrome will drop support for 32-bit Linux next March

I saw the news in "Google ends 32-bit Linux support for Chrome", which cites "Updates to Google Chrome Linux support":

To provide the best experience for the most-used Linux versions, we will end support for Google Chrome on 32-bit Linux, Ubuntu Precise (12.04), and Debian 7 (wheezy) in early March, 2016. Chrome will continue to function on these platforms but will no longer receive updates and security fixes.

We intend to continue supporting the 32-bit build configurations on Linux to support building Chromium. If you are using Precise, we’d recommend that you to upgrade to Trusty.

Since 32-bit builds will still be supported (through Chromium), maybe a PPA will show up by then to fill the gap?

TCP Fast Open in nginx

In "Enabling TCP Fast Open for NGINX on CentOS 7" I learned that nginx has supported TCP Fast Open (TFO, RFC 7413) since 1.5.8, and that the Linux kernel has fully supported it since 3.7.

TCP Fast Open uses a TCP cookie generated on the first connection so that, on subsequent connections, data can be sent during the 3-way handshake itself, which cuts latency significantly.

Setup isn't hard: first set net.ipv4.tcp_fastopen=3 in the kernel, then add fastopen=number to the listen directive, like this:

listen 80 fastopen=256;

However, the current NGINX mainline package doesn't seem to have this compiled in, so I can't test it for now...
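On the kernel side, net.ipv4.tcp_fastopen is a bitmask: bit 0x1 enables TFO for outgoing (client) connections and bit 0x2 enables it for listening (server) sockets, which is why 3 turns on both. A small sketch for decoding the current value follows; the function name is my own, and it only looks at those two low bits.

```shell
#!/bin/sh
# Decode the two low bits of net.ipv4.tcp_fastopen.
# describe_tfo is a hypothetical helper name; higher bits are ignored.
describe_tfo() {
    client=off
    server=off
    if [ $(( $1 & 1 )) -ne 0 ]; then client=on; fi   # 0x1: client side
    if [ $(( $1 & 2 )) -ne 0 ]; then server=on; fi   # 0x2: server side
    echo "client=$client server=$server"
}
```

Running `describe_tfo "$(cat /proc/sys/net/ipv4/tcp_fastopen)"` shows the current state, and `sysctl -w net.ipv4.tcp_fastopen=3` (as root) enables both sides until the next reboot.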

Automatically redialing PPPoE from the command line on Ubuntu

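HiNet's PPPoE connection drops about once every three or four days, and even with automatic redial configured it doesn't seem to reconnect reliably. So I have to check for the ppp0 interface myself and dial if it's missing...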

Whether the ppp0 interface exists can be determined from ifconfig's exit status, and redialing can be done with nmcli. Putting the check in cron gives:

*/1 * * * * root /sbin/ifconfig ppp0 > /dev/null 2>&1 || /usr/bin/nmcli connection up id "HiNet PPPoE" > /dev/null 2>&1

I use the name "HiNet PPPoE"; to use this on your own machine, replace "HiNet PPPoE" above with the connection name you configured in NetworkManager.
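If the one-liner gets hard to read, the same logic can be split into a small script. This is only a sketch under a few assumptions: the function names and the NET_SYSFS variable are my own, and instead of ifconfig's exit status it checks for the interface under /sys/class/net, which is a swapped-in way of asking the same question.

```shell
#!/bin/sh
# Sketch of the cron one-liner as functions. Names are hypothetical.
# NET_SYSFS can be overridden, which is handy for testing.
NET_SYSFS="${NET_SYSFS:-/sys/class/net}"

# Succeeds if the named network interface currently exists.
iface_exists() {
    [ -e "$NET_SYSFS/$1" ]
}

# $1 = interface name, $2 = NetworkManager connection name.
# Does nothing if the interface is up; otherwise asks nmcli to dial.
redial_if_down() {
    if iface_exists "$1"; then
        return 0
    fi
    nmcli connection up id "$2"
}
```

Adding `redial_if_down ppp0 "HiNet PPPoE"` as the last line and pointing the cron entry at the script should behave like the original one-liner.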