Netflix 對 sendfile() 在 TLS 情況下的加速

Netflix 對於寫了一篇關於隱私保護的技術細節:「Protecting Netflix Viewing Privacy at Scale」。

其中講到 2012 年的 Netflix Open Connect 中的 Open Connect Appliance (OCA,放伺服器到 ISP 機房的計畫) 只有單台伺服器 8Gbps,到現在 2016 可以達到 90Gbps:

As we mentioned in a recent company blog post, since the beginning of the Open Connect program we have significantly increased the efficiency of our OCAs - from delivering 8 Gbps of throughput from a single server in 2012 to over 90 Gbps from a single server in 2016.

早期的 Netflix 走 sendfile() 將影片丟出去,這在 kernel space 處理,所以很有效率:

當影片本身改走 HTTPS (TLS) 時,其中一個遇到的效能問題是導致 sendfile() 無法使用,而必須在 userland space 加密後改走回傳統的 write() 架構,這對於效能影響很大:

所以他們就讓 kernel 支援 AES 系列加密 (包括 AES-GCM 與 AES-CBC),效能的提昇大約是 30%:

Our changes in both the BoringSSL and ISA-L test situations significantly increased both CPU utilization and bandwidth over baseline - increasing performance by up to 30%, depending on the OCA hardware version.

文章開頭也有提到選 AES-GCM 與 AES-CBC 的一些來龍去脈,主要是 AES-GCM 的安全強度比較好,另外考慮到舊的 client 不支援 AES-GCM 時會使用 AES-CBC:

We evaluated available and applicable ciphers and decided to primarily use the Advanced Encryption Standard (AES) cipher in Galois/Counter Mode (GCM), available starting in TLS 1.2. We chose AES-CGM over the Cipher Block Chaining (CBC) method, which comes at a higher computational cost. The AES-GCM cipher algorithm encrypts and authenticates the message simultaneously - as opposed to AES-CBC, which requires an additional pass over the data to generate keyed-hash message authentication code (HMAC). CBC can still be used as a fallback for clients that cannot support the preferred method.

另外 OCA 機器本身也都夠新,支援 AES-NI 指令集,效能上不是太大的問題:

All revisions of Open Connect Appliances also have Intel CPUs that support AES-NI, the extension to the x86 instruction set designed to improve encryption and decryption performance. We needed to determine the best implementation of AES-GCM with the AES-NI instruction set, so we investigated alternatives to OpenSSL, including BoringSSL and the Intel Intelligent Storage Acceleration Library (ISA-L).

不過在「Netflix Open Connect Appliance Deployment Guide」(26 July 2016 版) 這份文件裡看起來還是用多條 10Gbps 透過 LACP 接上去:

You must be able to provision 2-4 x 10 Gbps ethernet ports in a LACP LAG per OCA. The exact quantity depends on the OCA type.

可能是下一版準備要上 40Gbps 或 100Gbps 的準備...?