Netflix 找到的 TCP 實做安全性問題...

這幾天的 Linux 主機都有收到 kernel 的更新,起因於 Netflix 發現並與社群一起修正了一系列 LinuxFreeBSD 上 TCP 實做 MSSSACK 的安全性問題:「https://github.com/Netflix/security-bulletins/blob/master/advisories/third-party/2019-001.md」。

其中最嚴重的應該是 CVE-2019-11477 這組,可以導致 Linux kernel panic,影響範圍從 2.6.29 開始的所有 kernel 版本。能夠升級的主機可以直接修正,無法升級的主機可以參考提出來的兩個 workaround:

Workaround #1: Block connections with a low MSS using one of the supplied filters. (The values in the filters are examples. You can apply a higher or lower limit, as appropriate for your environment.) Note that these filters may break legitimate connections which rely on a low MSS. Also, note that this mitigation is only effective if TCP probing is disabled (that is, the net.ipv4.tcp_mtu_probing sysctl is set to 0, which appears to be the default value for that sysctl).

Workaround #2: Disable SACK processing (/proc/sys/net/ipv4/tcp_sack set to 0).

第一個 workaround 是擋掉 MSS 過小的封包,但不保證就不會 kernel panic (文章裡面用語是 mitigation)。

第二個 workaround 是直接關掉 SACK,這組 workaround 在有 packet loss 的情況下效能會掉的比較明顯,但看起來可以避免直接 kernel panic...

只透過 SSH 重灌系統的 takeover.sh

看到 marcan/takeover.sh 這個專案:

Wipe and reinstall a running Linux system via SSH, without rebooting. You know you want to.

還蠻有趣的專案 (實用性質應該不高,主要是展示可行性),程式很短,就一個 C program 以及一個 shell script,但裡面用到很多外部工具 (像是 BusyBox 的執行檔),作者建議有興趣的人可以先用 virtual machine 玩...

這個專案讓我想到更複雜的 Depenguinator,把 Linux 灌成 FreeBSD 的專案,不過這個專案也已經很久沒更新了...

Amazon EC2 的可用頻寬提昇

AWSJeff Barr 宣佈了有 ENAEC2 instance 的頻寬提升到 25Gbps:「The Floodgates Are Open – Increased Network Bandwidth for EC2 Instances」。

分成三種,第一種是對 S3 的頻寬提昇:

EC2 to S3 – Traffic to and from Amazon Simple Storage Service (S3) can now take advantage of up to 25 Gbps of bandwidth. Previously, traffic of this type had access to 5 Gbps of bandwidth. This will be of benefit to applications that access large amounts of data in S3 or that make use of S3 for backup and restore.

第二種是 EC2 對 EC2 (內網):

EC2 to EC2 – Traffic to and from EC2 instances in the same or different Availability Zones within a region can now take advantage of up to 5 Gbps of bandwidth for single-flow traffic, or 25 Gbps of bandwidth for multi-flow traffic (a flow represents a single, point-to-point network connection) by using private IPv4 or IPv6 addresses, as described here.

第三種也是 EC2 對 EC2,但是是在同一個 Cluster Placement Group:

EC2 to EC2 (Cluster Placement Group) – Traffic to and from EC2 instances within a cluster placement group can continue to take advantage of up to 10 Gbps of lower-latency bandwidth for single-flow traffic, or 25 Gbps of lower-latency bandwidth for multi-flow traffic.

有 ENA 的有這些,好像沒看到 CentOS

ENA-enabled AMIs are available for Amazon Linux, Ubuntu 14.04 & 16.04, RHEL 7.4, SLES 12, and Windows Server (2008 R2, 2012, 2012 R2, and 2016). The FreeBSD AMI in AWS Marketplace is also ENA-enabled, as is VMware Cloud on AWS.

FreeBSD 上的 ccp (AMD Crypto Co-Processor)

看到 FreeBSD 上的「[base] Revision 328150」,將 AMD 的 AMD Crypto Co-Processor。

然後實測效能頗爛 XDDD 不過本來就不是以效能為主吧... 應該是以安全性與 Trusted Platform Module 考量?

像是 4KB buffer 的效能明顯比 AES-NI 慢了一大截 (少了一個零 XDDD):

aesni:      SHA1: ~8300 Mb/s    SHA256: ~8000 Mb/s
ccp:               ~630 Mb/s    SHA256:  ~660 Mb/s  SHA512:  ~700 Mb/s
cryptosoft:       ~1800 Mb/s    SHA256: ~1800 Mb/s  SHA512: ~2700 Mb/s

如果是 128KB buffer 時會好一些:

aesni:      SHA1:~10400 Mb/s    SHA256: ~9950 Mb/s
ccp:              ~2200 Mb/s    SHA256: ~2600 Mb/s  SHA512: ~3800 Mb/s
cryptosoft:       ~1750 Mb/s    SHA256: ~1800 Mb/s  SHA512: ~2700 Mb/s

然後 AES 也類似:

aesni:      4kB: ~11250 Mb/s    128kB: ~11250 Mb/s
ccp:               ~350 Mb/s    128kB:  ~4600 Mb/s
cryptosoft:       ~1750 Mb/s    128kB:  ~1700 Mb/s

所以是 sponsor 有認證需要的關係嗎...

Sponsored by:Dell EMC Isilon

在 ThinkPad T530 上跑 FreeBSD 的介紹

作者在「FreeBSD on a Laptop」這邊寫下了在 ThinkPad T530 上跑 FreeBSD 的完整攻略。

查了一下 ThinkPad T530,這應該是 2012 年就推出的筆電了 (五年多前),所以文章的重點在於要去那邊找解法 (i.e. 方向性)。另外作者有提到文章是假設你已經對 FreeBSD 生態算熟悉 (像是 Ports 以及 /etc 下設定檔習慣的格式與設定方式):

Unlike my usual posts, this time I'm going to assume you're already pretty familiar with FreeBSD.

然後有點無奈的地方... 即使是 2012 年的電腦,為了 driver 問題他還是得跑 -CURRENT

In my case, I run 12-CURRENT so I can take advantage of the latest Intel drivers in the graphics/drm-next-kmod port.

這有點苦 XD

/usr/bin 下的工具介紹

Adventures in /usr/bin and the likes」這篇介紹了 /usr/bin 的各種工具。即使是在 FreeBSDLinux 下面混了許多年,還是看到了不少好用的工具,值得慶幸的是,至少有一個章節 (Misc) 還算熟悉...

OS 的 chrttaskset 看起來再壓榨效能的時候應該可以拿出來用。用
peekfd 不如記 strace 比較萬用...

Debugging 的 addr2line 則是學到了一招,對於還是 segfault 看起來應該會很有用...

後面 Data Manipulation 的部份其實都很值得再拿出來看,尤其是這一章的東西,沒在用常常會忘記...

Unix 程式碼演進的記錄

GitHub 上的「dspinellis/unix-history-repo」專案放進了 Unix 程式碼從 1970 年演進到 2016 年的記錄:

The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1970 as a 2.5 thousand line kernel and 26 commands, to 2016 as a widely-used 27 million line system.

主要的目的是讓研究人員可以直接分析,減少重複的工作:

The project aims to put in the repository as much metadata as possible, allowing the automated analysis of Unix history.

後面的分支主要是以 FreeBSD 為主:(在列表的部份也可以看到)

It has been created by synthesizing with custom software 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team, two legacy repositories, and the modern repository of the open source FreeBSD system.

整個 repository 頗壯觀的 XD

FreeBSD 11.0 重新發行...

本來已經發行的 FreeBSD 11.0,因為最近的安全性問題 (應該是講 OpenSSL?),決定不以 security update 的方式更新,而是直接發新版了:「[REVISED] [HEADS-UP] 11.0-RELEASE status update」。

Although the FreeBSD 11.0-RELEASE has not yet been officially announced, many have found images on the Project FTP mirrors.

However, please be aware the final 11.0-RELEASE will be rebuilt and republished on the Project mirrors as a result of a few last-minute security fixes we feel are imperative to include in the final release.

我以為應該是發 11.0.1,然後把架上的 11.0 拿掉...

Libreboot 成功的讓 FreeBSD/OpenBSD 開機

Libreboot 是一個 open source 版本的 BIOS/UEFI 替代品:

Libreboot is a free BIOS or UEFI replacement (free as in freedom); libre boot firmware that initializes the hardware and starts a bootloader for your operating system.

而最近的版本則是順利的在沒有修改作業系統來配合 Libreboot 的情況下將 FreeBSDOpenBSD (不過是 kOpenBSD) 開起來了:「Re: [Libreboot] GNU Libreboot, version 20160818 released」。

Netflix 對 sendfile() 在 TLS 情況下的加速

Netflix 對於寫了一篇關於隱私保護的技術細節:「Protecting Netflix Viewing Privacy at Scale」。

其中講到 2012 年的 Netflix Open Connect 中的 Open Connect Appliance (OCA,放伺服器到 ISP 機房的計畫) 只有單台伺服器 8Gbps,到現在 2016 可以達到 90Gbps:

As we mentioned in a recent company blog post, since the beginning of the Open Connect program we have significantly increased the efficiency of our OCAs - from delivering 8 Gbps of throughput from a single server in 2012 to over 90 Gbps from a single server in 2016.

早期的 Netflix 走 sendfile() 將影片丟出去,這在 kernel space 處理,所以很有效率:

當影片本身改走 HTTPS (TLS) 時,其中一個遇到的效能問題是導致 sendfile() 無法使用,而必須在 userland space 加密後改走回傳統的 write() 架構,這對於效能影響很大:

所以他們就讓 kernel 支援 AES 系列加密 (包括 AES-GCM 與 AES-CBC),效能的提昇大約是 30%:

Our changes in both the BoringSSL and ISA-L test situations significantly increased both CPU utilization and bandwidth over baseline - increasing performance by up to 30%, depending on the OCA hardware version.

文章開頭也有提到選 AES-GCM 與 AES-CBC 的一些來龍去脈,主要是 AES-GCM 的安全強度比較好,另外考慮到舊的 client 不支援 AES-GCM 時會使用 AES-CBC:

We evaluated available and applicable ciphers and decided to primarily use the Advanced Encryption Standard (AES) cipher in Galois/Counter Mode (GCM), available starting in TLS 1.2. We chose AES-CGM over the Cipher Block Chaining (CBC) method, which comes at a higher computational cost. The AES-GCM cipher algorithm encrypts and authenticates the message simultaneously - as opposed to AES-CBC, which requires an additional pass over the data to generate keyed-hash message authentication code (HMAC). CBC can still be used as a fallback for clients that cannot support the preferred method.

另外 OCA 機器本身也都夠新,支援 AES-NI 指令集,效能上不是太大的問題:

All revisions of Open Connect Appliances also have Intel CPUs that support AES-NI, the extension to the x86 instruction set designed to improve encryption and decryption performance. We needed to determine the best implementation of AES-GCM with the AES-NI instruction set, so we investigated alternatives to OpenSSL, including BoringSSL and the Intel Intelligent Storage Acceleration Library (ISA-L).

不過在「Netflix Open Connect Appliance Deployment Guide」(26 July 2016 版) 這份文件裡看起來還是用多條 10Gbps 透過 LACP 接上去:

You must be able to provision 2-4 x 10 Gbps ethernet ports in a LACP LAG per OCA. The exact quantity depends on the OCA type.

可能是下一版準備要上 40Gbps 或 100Gbps 的準備...?