如同 commit log 裡面提到的,這個功能會想要故意沒事就送一些沒用的資料 (增加一些噪音),降低從 side channel 被判讀的資訊量:
This attempts to hide inter-keystroke timings by sending interactive traffic at fixed intervals (default: every 20ms) when there is only a small amount of data being sent. It also sends fake "chaff" keystrokes for a random interval after the last real keystroke. These are controlled by a new ssh_config ObscureKeystrokeTiming keyword/
基於 OpenSSH 算是 SSH 這塊的 de-factor standard 了,接下來看其他家像是 Dropbear 會不會也實作?
Checked and the Mastercard one someone posted below doesn't seem to be vulnerable to this. My real card number and a dummy mastercard number with valid prefix and check digit both returned a 200 OK in around 1.01s. A random 16digit number without valid check digit returned 400 Bad Request in about 800ms. Decided to check that one since they have a completely useless machine-readable catchpa.
For Visa it was 835ms for valid, 762ms for dummy, prefix and check digit appears to be checked client side.
Spectre 的精華在於 CPU 支援 branch prediction 與 out-of-order execution,也就是 CPU 遇到 branch 時會學習怎麼跑,這個資訊提供給 out-of-order execution 就可以大幅提昇執行速度。可以參考以前在「CPU Branch Prediction 的成本...」提到的效率問題。
原理的部份可以看這段程式碼:
這類型程式碼常常出現在現代程式的各種安全檢查上:確認 x 沒問題後再實際將資料拉出來處理。而我們可以透過不斷的丟 x 值進去,讓 CPU 學到以為都是 TRUE,而在 CPU 學壞之後,突然丟進超出範圍的 x,產生 branch misprediction,但卻已經因為 out-of-order execution 而讓 CPU 執行過 y = ... 這段指令,進而導致 cache 的內容改變。
Suppose register R1 contains a secret value. If the speculatively executed memory read of array1[R1] is a cache hit, then nothing will go on the memory bus and the read from [R2] will initiate quickly. If the read of array1[R1] is a cache miss, then the second read may take longer, resulting in different timing for the victim thread.
所以相同道理,利用乘法器被佔用的 timing attack 也可以產生攻擊:
if (false but mispredicts as true)
multiply R1, R2
multiply R3, R4
In addition, of the three user-mode serializing instructions listed by Intel, only cpuid can be used in normal code, and it destroys many registers. The mfence and lfence (but not sfence) instructions also appear to work, with the added benefit that they do not destroy register contents. Their behavior with respect to speculative execution is not defined, however, so they may not work in all CPUs or system configurations.
However, we may manipulate its generation to control speculative execution while modifying the visible, on-stack value to direct how the branch is actually retired.
A unique feature of our work is that we target common cryptographic protocols. Previous works that demonstrate cache-timing key-recovery attack only target the cryptographic primitives, ignoring potential cache noise from the protocol implementation. In contrast, we present end-to-end attacks on two common cryptographic protocols: SSH and TLS. We are, therefore, the first to demonstrate that cache-timing attacks are a threat not only when executing the cryptographic primitives but also in the presence of the cache activity of the whole protocol suite.
而且次數相當的少,就可以 key recovery:
260 SSH-2 handshakes to extract a 1024/160-bit DSA host key from an OpenSSH server, and 580 TLS 1.2 handshakes to extract a 2048/256-bit DSA key from an stunnel server.
At the time of its release, Amazon announced that s2n had undergone three external security evaluations and penetration tests. We show that, despite this, s2n - as initially released - was vulnerable to a timing attack in the case of CBC-mode ciphersuites, which could be extended to complete plaintext recovery in some settings.
攻擊分成兩個階段:
Our attack has two components. The first part is a novel variant of the Lucky 13 attack that works even though protections against Lucky 13 were implemented in s2n. The second part deals with the randomised delays that were put in place in s2n as an additional countermeasure to Lucky 13. Our work highlights the challenges of protecting implementations against sophisticated timing attacks.
最後還是酸了一下 Amazon:
It also illustrates that standard code audits are insufficient to uncover all cryptographic attack vectors.
// Decrypt to plaintext + mac + padding
$plaintext_mac_padding = decrypt($ciphertext);
if (NULL != $plaintext_mac_padding) {
// Now decode padding part
$plaintext_mac = decode_padding($plaintext_mac_padding, $padding_length);
if (NULL != $plaintext_mac) {
// Now check MAC part
$plaintext = check_mac(plaintext_mac);
if (NULL != $plaintext) {
// Now it's okay
}
}
}
In general, the best way to do this is to compute the MAC even if the padding is incorrect, and only then reject the packet. For instance, if the pad appears to be incorrect, the implementation might assume a zero-length pad and then compute the MAC.
This leaves a small timing channel, since MAC performance depends to some extent on the size of the data fragment, but it is not believed to be large enough to be exploitable, due to the large block size of existing MACs and the small size of the timing signal.
The attacks apply to all TLS and DTLS implementations that are compliant with TLS 1.1 or 1.2, or with DTLS 1.0 or 1.2. They also apply to implementations of SSL 3.0 and TLS 1.0 that incorporate countermeasures to previous padding oracle attacks. Variant attacks may also apply to non-compliant implementations.