Home » Posts tagged "overhead"

不同性質的應用程式對 KPTI (Meltdown 修正) 的效能影響

NetflixBrendan Gregg 整理了他測試 KPTI 對效能的影響:「KPTI/KAISER Meltdown Initial Performance Regressions」。

與其他人只是概括的測試,他主要是想要針對可量測的數字對應出可能的 overhead,這樣一來還沒上 patch 的人就可以利用這些量測數字猜測可能的效能衝擊。


To understand the KPTI overhead, there are at least five factors at play. In summary:

  • Syscall rate: there are overheads relative to the syscall rate, although high rates are needed for this to be noticable. At 50k syscalls/sec per CPU the overhead may be 2%, and climbs as the syscall rate increases. At my employer (Netflix), high rates are unusual in cloud, with some exceptions (databases).
  • Context switches: these add overheads similar to the syscall rate, and I think the context switch rate can simply be added to the syscall rate for the following estimations.
  • Page fault rate: adds a little more overhead as well, for high rates.
  • Working set size (hot data): more than 10 Mbytes will cost additional overhead due to TLB flushing. This can turn a 1% overhead (syscall cycles alone) into a 7% overhead. This overhead can be reduced by A) pcid, available in Linux 4.14, and B) Huge pages.
  • Cache access pattern: the overheads are exacerbated by certain access patterns that switch from caching well to caching a little less well. Worst case, this can add an additional 10% overhead, taking (say) the 7% overhead to 17%.

重點在於給了量測的方式,以第一個 Syscall rate 來說好了,他用 sudo perf stat -e raw_syscalls:sys_enter -a -I 1000 測試而得到程式的 syscall 數量,然後得到下面的表格,其中 X 軸是每秒千次呼叫數,Y 軸是效能損失:

用這樣的方式提供給整個組織 (i.e. Netflix) 內評估衝擊。

印 "#" 比印 "B" 來的快的問題

這篇是兩年前在 StackOverflow 上的問題:「Why is printing “B” dramatically slower than printing “#”?」。

問問題的人這段程式跑了 8.52 秒:

Random r = new Random();
for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j++) {
        if(r.nextInt(4) == 0) {
        } else {


而把上面的 # 換成 B 就變成 259.152 秒。

答案是與 word-wrapping 有關:

Pure speculation is that you're using a terminal that attempts to do word-wrapping rather than character-wrapping, and treats B as a word character but # as a non-word character. So when it reaches the end of a line and searches for a place to break the line, it sees a # almost immediately and happily breaks there; whereas with the B, it has to keep searching for longer, and may have more text to wrap (which may be expensive on some terminals, e.g., outputting backspaces, then outputting spaces to overwrite the letters being wrapped).

But that's pure speculation.

這真是細節 XDDD