DigitalOcean 總算是跟上來了...

DigitalOcean 在隔壁 Linode 都升級一年半年後 (參考 2016 年的文章「Linode 記憶體升級,以及新的日本機房計畫」這篇),才想到沒有競爭優勢了:「Kicking Off the New Year with New Droplet Plans」。

在 Standard Droplets 的部份,主要就是把記憶體的量補上來,另外提供一些變化:

另外 Optimized Droplets 也有一些變化,不過就沒有太關注了...

離開 DigitalOcean 很久了 (現在主力都是 Linode 跟 Vultr 了),這次端出來的菜盤看起來還是不太行...

Ubuntu 開始更新 Kernel 了...

這波 CPU 安全問題,UbuntuLinux Kernel 的更新計畫 (workaround patch) 放在「Information Leak via speculative execution side channel attacks (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754 aka Spectre and Meltdown)」這邊。

不是所有版本的 kernel 都有更新,像是我之前跑 4.10 發現這次沒在清單內,就換成 linux-image-generic-hwe-16.04-edge 了... 換完後需要再裝 linux-headers-generic-hwe-16.04-edge,然後把舊的 kernel 都清乾淨,最後 nvidia-387 需要重新編過。

這次苦哈哈啊...

Intel CPU + AMD GPU 合一的的系統

先前就有看到 Intel 要與 AMD 合作,將 Intel CPU + AMD GPU 整合在一起以對抗 Nvidia,現在看到 HP 推出對應的筆電了:「HP’s new 15-inch Spectre x360 uses the hybrid Intel/AMD processor」。

不過名字剛好跟最近的安全漏洞撞到了 XDDD (所以才想寫 XDDD)

The new Spectre x360 15 is one of the first systems to be announced that uses the new Kaby Lake-G processors from Intel. These processors combine an Intel CPU (with its own integrated GPU) with an AMD GPU, all within a single package.


出自「Kaby Lake-G unveiled: Intel CPU, AMD GPU, Nvidia-beating performance」。

這種合作的仗打不打的動呢... 不怎麼看好就是了 :o

讀書時間:Meltdown 的攻擊方式

Meltdown 的論文可以在「Meltdown (PDF)」這邊看到。這個漏洞在 Intel 的 CPU 上影響最大,而在 AMD 是不受影響的。其他平台有零星的消息,不過不像 Intel 是這十五年來所有的 CPU 都中獎... (從 Pentium 4 以及之後的所有 CPU)

Meltdown 是基於這些前提,而達到記憶體任意位置的 memory dump:

  • 支援 µOP 方式的 out-of-order execution 以及當失敗時的 rollback 機制。
  • 因為 cache 機制造成的 side channel information leak。
  • 在 out-of-order execution 時對記憶體存取的 permission check 失效。

out-of-order execution 在大學時的計算機組織應該都會提到,不過我印象中當時只講「在確認不相干的指令才會有 out-of-order」。而現代 CPU 做的更深入,包括了兩個部份:

  • 第一個是 µOP 方式,將每個 assembly 拆成更細的 micro-operation,後面的 out-of-order execution 是對 µOP 做。
  • 第二個是可以先執行下去,如果發現搞錯了再 rollback。

像是下面的 access() 理論上不應該被執行到,但現代的 out-of-order execution 會讓 CPU 有機會先跑後面的指令,最後發現不該被執行到後,再將 register 與 memory 的資料 rollback 回來:

而 Meltdown 把後面不應該執行到 code 放上這段程式碼 (這是 Intel syntax assembly):

其中 mov al, byte [rcx] 應該要做記憶體檢查,確認使用者是否有權限存取那個位置。但這邊因為連記憶體檢查也拆成 µOP 平行跑,而產生 race condition:

Meltdown is some form of race condition between the fetch of a memory address and the corresponding permission check for this address.

而這導致後面這段不該被執行到的程式碼會先讀到資料放進 al register 裡。然後再去存取某個記憶體位置造成某塊記憶體位置被讀到 cache 裡。

造成 cache 內的資料改變後,就可以透過 FLUSH+RELOAD 技巧 (side channel) 而得知這段程式碼讀了哪一塊資料 (參考之前寫的「Meltdown 與 Spectre 都有用到的 FLUSH+RELOAD」),於是就能夠推出 al 的值...

而 Meltdown 在 mov al, byte [rcx] 這邊之所以可以成立,另外一個需要突破的地方是 [rcx]。這邊 [rcx] 存取時就算沒有權限檢查,在 virtual address 轉成 physical address 時應該會遇到問題?

原因是 LinuxOS X 上有 direct-physical map 的機制,會把整塊 physical memory 對應到 virtual memory 的固定位置上,這些位置不會再發給 user space 使用,所以是通的:

On Linux and OS X, this is done via a direct-physical map, i.e., the entire physical memory is directly mapped to a pre-defined virtual address (cf. Figure 2).

而在 Windows 上則是比較複雜,但大部分的 physical memory 都有對應到 kernel address space,而每個 process 裡面也都還是有完整的 kernel address space (只是受到權限控制),所以 Meltdown 的攻擊仍然有效:

Instead of a direct-physical map, Windows maintains a multiple so-called paged pools, non-paged pools, and the system cache. These pools are virtual memory regions in the kernel address space mapping physical pages to virtual addresses which are either required to remain in the memory (non-paged pool) or can be removed from the memory because a copy is already stored on the disk (paged pool). The system cache further contains mappings of all file-backed pages. Combined, these memory pools will typically map a large fraction of the physical memory into the kernel address space of every process.

這也是 workaround patch「Kernel page-table isolation」的原理 (看名字大概就知道方向了),藉由將 kernel 與 user 的區塊拆開來打掉 Meltdown 的攻擊途徑。

而 AMD 的硬體則是因為 mov al, byte [rcx] 這邊權限的檢查並沒有放進 out-of-order execution,所以就避開了 Meltdown 攻擊中很重要的一環。

Spectre 與 Meltdown 兩套 CPU 的安全漏洞

The Register 發表了「Kernel-memory-leaking Intel processor design flaw forces Linux, Windows redesign」這篇文章,算是頗完整的說明了這次的安全漏洞 (以 IT 新聞媒體標準來看),引用了蠻多資料並且試著說明問題。

而這也使得整個事情迅速發展與擴散超出本來的預期,使得 GoogleProject Zero 提前公開發表了 Spectre 與 Meltdown 這兩套 CPU 安全漏洞。文章非常的長,描述的也比 The Register 那篇還完整:「Reading privileged memory with a side-channel」。

在 Google Project Zero 的文章裡面,把這些漏洞分成三類,剛好依據 CVE 編號分開描述:

  • Variant 1: bounds check bypass (CVE-2017-5753)
  • Variant 2: branch target injection (CVE-2017-5715)
  • Variant 3: rogue data cache load (CVE-2017-5754)

前兩個被稱作 Spectre,由 Google Project Zero、Cyberus Technology 以及 Graz University of Technology 三個團隊獨立發現並且回報原廠。後面這個稱作 Meltdown,由 Google Project Zero 與另外一個團隊獨立發現並且回報原廠。

這兩套 CPU 的安全漏洞都有「官網」,網址不一樣但內容一樣:spectreattack.commeltdownattack.com

影響範圍包括 IntelAMD 以及 ARM,其中 AMD 因為架構不一樣,只有在特定的情況下會中獎 (在使用者自己打開 eBPF JIT 後才會中):

(提到 Variant 1 的情況) If the kernel's BPF JIT is enabled (non-default configuration), it also works on the AMD PRO CPU.

這次的洞主要試著透過 side channel 資訊讀取記憶體內容 (會有一些條件限制),而痛點在於修正 Meltdown 的方式會有極大的 CPU 效能損失,在 Linux 上對 Meltdown 的修正的資訊可以參考「KAISER: hiding the kernel from user space」這篇,裡面提到:

KAISER will affect performance for anything that does system calls or interrupts: everything. Just the new instructions (CR3 manipulation) add a few hundred cycles to a syscall or interrupt. Most workloads that we have run show single-digit regressions. 5% is a good round number for what is typical. The worst we have seen is a roughly 30% regression on a loopback networking test that did a ton of syscalls and context switches.

KAISER 後來改名為 KPTI,查資料的時候可以注意一下。

不過上面提到的是實體機器,在 VM 裡面可以預期會有更多 syscall 與 context switch,於是 Phoronix 測試後發現在 VM 裡效能的損失比實體機器大很多 (還是跟應用有關,主要看應用會產生多少 syscall 與 context switch):「VM Performance Showing Mixed Impact With Linux 4.15 KPTI Patches」。

With these VM results so far it's still a far cry from the "30%" performance hit that's been hyped up by some of the Windows publications, etc. It's still highly dependent upon the particular workload and system how much performance may be potentially lost when enabling page table isolation within the kernel.

這對各家 cloud service 不是什麼好消息,如果效能損失這麼大,不太可能直接硬上 KPTI patch... 尤其是 VPS,對於平常就會 oversubscription 的前提下,KPTI 不像是可行的方案。

可以看到各 VPS 都已經發 PR 公告了 (先發個 PR 稿說我們有在注意,但都還沒有提出解法):「CPU Vulnerabilities: Meltdown & Spectre (Linode)」、「A Message About Intel Security Findings (DigitalOcean)」、「Intel CPU Vulnerability Alert (Vultr)」。

現在可以預期會有更多人投入研究,要怎麼樣用比較少的 performance penalty 來抵抗這兩套漏洞,現在也只能先等了...

Amazon EC2 推出第一款 Bare Metal 的 Instance

Amazon EC2 直接租整台主機出來了:「Amazon EC2 Bare Metal Instances with Direct Access to Hardware」。

Bare Metal 怎麼翻譯比較好啊?雖然知道是拔掉虛擬化的主機... 裸奔機?

We knew that other customers also had interesting use cases for bare metal hardware and didn’t want to take the performance hit of nested virtualization. They wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.

反正這種機器就是要壓榨整台機器的效能,所以不會拿小台機器出來給大家玩。這次推出的是 i3 系列,叫做 i3.metal

Today we are launching a public preview the i3.metal instance, the first in a series of EC2 instances that offer the best of both worlds, allowing the operating system to run directly on the underlying hardware while still providing access to all of the benefits of the cloud. The instance gives you direct access to the processor and other hardware, and has the following specifications:

Processing – Two Intel Xeon E5-2686 v4 processors running at 2.3 GHz, with a total of 36 hyperthreaded cores (72 logical processors).
Memory – 512 GiB.
Storage – 15.2 terabytes of local, SSD-based NVMe storage.
Network – 25 Gbps of ENA-based enhanced networking.

走了十年總算走到這塊了... 不過應該花了不少時間解決各種安全性的問題,像是 network isolation 以及反刷韌體的問題 XD

便宜的 HSM

當然速度就不用想太多了...

一個是 Yubico 推出的 YubiHSM 2:「YubiHSM 2 is here: Providing root of trust for servers and computing devices」。

另外一個是 Mozilla 在 2013 年提到的 CryptoStick,不過現在連過去看到的是 Nitrokey HSM:「Using CryptoStick as an HSM」。

兩個都是走 USB 1.1 Type A,運算效能都普普通通,感覺自己用比較合適?像是 GnuPG 加解密。拿給線上服務用的效能還是要夠好...

U2F Security Key 產品測試?

Adam Langley 的「Testing Security Keys」這篇測試了不少有支援 U2F Security Key 的產品,這邊作者是以 Linux 環境測試。

tl;dr:在 Linux 環境下,除了 Yubico 的產品沒問題外,其他的都有問題... (只是差在問題多與少而已)

Yubico 的沒找到問題:

Easy one first: I can find no flaws in Yubico's U2F Security Key.

VASCO SecureClick 的則是 vendor ID 與 product ID 會跑掉:

If you're using Linux and you configure udev to grant access to the vendor ID & product ID of the token as it appears normally, nothing will work because the vendor ID and product ID are different when it's active. The Chrome extension will get very confused about this.

Feitian ePass 的 ASN.1 DER 編碼是錯的:

ASN.1 DER is designed to be a “distinguished” encoding, i.e. there should be a unique serialisation for a given value and all other representations are invalid. As such, numbers are supposed to be encoded minimally, with no leading zeros (unless necessary to make a number positive). Feitian doesn't get that right with this security key: numbers that start with 9 leading zero bits have an invalid zero byte at the beginning. Presumably, numbers starting with 17 zero bits have two invalid zero bytes at the beginning and so on, but I wasn't able to press the button enough times to get such an example. Thus something like one in 256 signatures produced by this security key are invalid.

Thetis 根本沒照 spec 走,然後也有相同的 ASN.1 DER 編碼問題:

With this device, I can't test things like key handle mutability and whether the appID is being checked because of some odd behaviour. The response to the first Check is invalid, according to the spec: it returns status 0x9000 (“NO_ERROR”), when it should be 0x6985 or 0x6a80. After that, it starts rejecting all key handles (even valid ones) with 0x6a80 until it's unplugged and reinserted.

This device has the same non-minimal signature encoding issue as the Feitian ePass. Also, if you click too fast, this security key gets upset and rejects a few requests with status 0x6ffe.

U2F Zero 直接 crash 沒辦法測 XDDD:

A 1KiB ping message crashes this device (i.e. it stops responding to USB messages and needs to be unplugged and reinserted). Testing a corrupted key handle also crashes it and thus I wasn't able to run many tests.

KEY-ID (網站連 HTTPS 都沒上...) / HyperFIDO 也有編碼問題但更嚴重:

The Key-ID (and HyperFIDO devices, which have the same firmware, I think) have the same non-minimal encoding issue as the Feitian ePass, but also have a second ASN.1 flaw. In ASN.1 DER, if the most-significant bit of a number is set, that number is negative. If it's not supposed to be negative, then a zero pad byte is needed. I think what happened here is that, when testing the most-significant bit, the security key checks whether the first byte is > 0x80, but it should be checking whether it's >= 0x80. The upshot is the sometimes it produces signatures that contain negative numbers and are thus invalid.

所以還是乖乖用 GitHub 帳號買 Yubico 吧...

最新的 Firefox 56 對 AES-GCM 效能的改善

昨天釋出的 Firefox 56 對於 AES-GCM 在老電腦上改善了不少效能:「Improving AES-GCM Performance」。

首先是 Firefox 自己的數據分析,可以看到 AES-GCM 佔目前加密連線裡的大宗,再來是 AES-CBC:

先以 Linux 64bits 環境的數據來看,Firefox 56 的 NSS 3.32 大幅改善了老電腦的效能 (不支援 AES-NI 硬體加解密的 CPU,甚至是不支援 PCLMUL 的 CPU,以及不支援 AVX 的 CPU):

在 Linux 32bits 環境上則是連預設值大幅改善,不過用的人應該少很多了:

Windows 下則是因為 64bits 或是 32bits 都有足夠的使用者,所以平常就花了不少力氣。但也可以看出對於老電腦的速度提升:

Mac (64bits only) 算是這次比較大的提升,連新電腦的預設值都大幅變快:

加上之後陸續的改善 (尤其是下一版 Firefox 57 的 Project Quantum),這幾版應該會拉出不少效能...

AWS CloudHSM 支援 FIPS 140-2 Level 3 了

AWS CloudHSM 推出了一些新功能:「AWS CloudHSM Update – Cost Effective Hardware Key Management at Cloud Scale for Sensitive & Regulated Workloads」。

其中比較特別的是從以前只支援 Level 2 變成支援 Level 3 了:

More Secure – CloudHSM Classic (the original model) supports the generation and use of keys that comply with FIPS 140-2 Level 2. We’re stepping that up a notch today with support for FIPS 140-2 Level 3, with security mechanisms that are designed to detect and respond to physical attempts to access or modify the HSM.

在維基百科裡面有提到 Level 2 與 Level 3 的要求:

Security Level 2 improves upon the physical security mechanisms of a Security Level 1 cryptographic module by requiring features that show evidence of tampering, including tamper-evident coatings or seals that must be broken to attain physical access to the plaintext cryptographic keys and critical security parameters (CSPs) within the module, or pick-resistant locks on covers or doors to protect against unauthorized physical access.

In addition to the tamper-evident physical security mechanisms required at Security Level 2, Security Level 3 attempts to prevent the intruder from gaining access to CSPs held within the cryptographic module. Physical security mechanisms required at Security Level 3 are intended to have a high probability of detecting and responding to attempts at physical access, use or modification of the cryptographic module. The physical security mechanisms may include the use of strong enclosures and tamper-detection/response circuitry that zeroes all plaintext CSPs when the removable covers/doors of the cryptographic module are opened.

主動式偵測以及銷毀算是 Level 3 比 Level 2 安全的地方。

另外就是計價方式的修正,先前有一筆固定的費用,現在變成完全照小時計費了:

Pay As You Go – CloudHSM is now offered under a pay-as-you-go model that is simpler and more cost-effective, with no up-front fees.