We conducted extensive A/B testing of performance. Performance telemetry numbers are about the same for MSVC-built and clang-built Chrome – some metrics get better, some get worse, but all of them are within 5% of each other.
後面有猜測換過去的原因,可以看出因為 open source 加上 Google Chrome 團隊有強大的技術能力下,用 open source 軟體可以對 compiler 大量客製化各種功能,另外也是因為一個 compiler 就可以吃多個平台,可以省下一些跨平台的力氣 (像是相容性語法)。
而 Visual C++ 在商業支援與文件兩方面比較好的優勢,在這個情況下就顯得不是那麼重要了...
Spectre 的精華在於 CPU 支援 branch prediction 與 out-of-order execution,也就是 CPU 遇到 branch 時會學習怎麼跑,這個資訊提供給 out-of-order execution 就可以大幅提昇執行速度。可以參考以前在「CPU Branch Prediction 的成本...」提到的效率問題。
原理的部份可以看這段程式碼:
這類型程式碼常常出現在現代程式的各種安全檢查上:確認 x 沒問題後再實際將資料拉出來處理。而我們可以透過不斷的丟 x 值進去,讓 CPU 學到以為都是 TRUE,而在 CPU 學壞之後,突然丟進超出範圍的 x,產生 branch misprediction,但卻已經因為 out-of-order execution 而讓 CPU 執行過 y = ... 這段指令,進而導致 cache 的內容改變。
Suppose register R1 contains a secret value. If the speculatively executed memory read of array1[R1] is a cache hit, then nothing will go on the memory bus and the read from [R2] will initiate quickly. If the read of array1[R1] is a cache miss, then the second read may take longer, resulting in different timing for the victim thread.
所以相同道理,利用乘法器被佔用的 timing attack 也可以產生攻擊:
if (false but mispredicts as true)
multiply R1, R2
multiply R3, R4
In addition, of the three user-mode serializing instructions listed by Intel, only cpuid can be used in normal code, and it destroys many registers. The mfence and lfence (but not sfence) instructions also appear to work, with the added benefit that they do not destroy register contents. Their behavior with respect to speculative execution is not defined, however, so they may not work in all CPUs or system configurations.
However, we may manipulate its generation to control speculative execution while modifying the visible, on-stack value to direct how the branch is actually retired.
B3 is not yet complete. We still need to finish porting B3 to ARM64. B3 passes all tests, but we haven’t finished optimizing performance on ARM. Once all platforms that used the FTL switch to B3, we plan to remove LLVM support from the FTL JIT.
Android NDK 做為效能的加速手段而使用到 C 或是 C++,所以會使用對應的 compiler suite:
The NDK is a toolset that allows you to implement parts of your app using native-code languages such as C and C++. Typically, good use cases for the NDK are CPU-intensive applications such as game engines, signal processing, and physics simulation.
用統計方法去「猜測」這些 local function 與 local variable 應該叫什麼名字,讓人比較好理解。官方對準確度的說法是超過 60%:
In our experiments, we found JSNice to be effective for deobfuscating minified code. On average, more than 60% of the identifiers are recovered to the same name as before the minification process.
接下來會想辦法提供 UI 讓使用者可以選擇另外的名字:
Further, as JSNice computes multiple ranked suggestions, we provide a UI to navigate through these suggestions and select alternative identifier names.
The goal is to convert *their* C code (not all C code). They want generated code to be human-readable and maintainable. They want automatic conversion to handle 99+% of code.
第一波想要用機器轉換過去,而且要轉出可維護的程式碼。可以馬上想到的事情是,如果這件事情成功,代表現有軟體的 C code 也有機會轉移?