Home » Computer » Archive by category "Security" (Page 2)

Fortnite 看起來沒上 Auto Scaling?(或是沒正確設好?)

Fortnite 遊戲的伺服器放在 AWS 上,看起來這波 Meltdown 的安全更新 (KPTI) 造成非常大的 overhead:

不過看起來出了問題:

We wanted to provide a bit more context for the most recent login issues and service instability. All of our cloud services are affected by updates required to mitigate the Meltdown vulnerability. We heavily rely on cloud services to run our back-end and we may experience further service issues due to ongoing updates.

最有可能的是把 AWS 當作一般的 VPS 在用,另外一種可能是有部份內部服務沒有 scale,造成上了 KPTI 後 overhead 增加,就卡住了...

讀書時間:Meltdown 的攻擊方式

Meltdown 的論文可以在「Meltdown (PDF)」這邊看到。這個漏洞在 Intel 的 CPU 上影響最大,而在 AMD 是不受影響的。其他平台有零星的消息,不過不像 Intel 是這十五年來所有的 CPU 都中獎... (從 Pentium 4 以及之後的所有 CPU)

Meltdown 是基於這些前提,而達到記憶體任意位置的 memory dump:

  • 支援 µOP 方式的 out-of-order execution 以及當失敗時的 rollback 機制。
  • 因為 cache 機制造成的 side channel information leak。
  • 在 out-of-order execution 時對記憶體存取的 permission check 失效。

out-of-order execution 在大學時的計算機組織應該都會提到,不過我印象中當時只講「在確認不相干的指令才會有 out-of-order」。而現代 CPU 做的更深入,包括了兩個部份:

  • 第一個是 µOP 方式,將每個 assembly 拆成更細的 micro-operation,後面的 out-of-order execution 是對 µOP 做。
  • 第二個是可以先執行下去,如果發現搞錯了再 rollback。

像是下面的 access() 理論上不應該被執行到,但現代的 out-of-order execution 會讓 CPU 有機會先跑後面的指令,最後發現不該被執行到後,再將 register 與 memory 的資料 rollback 回來:

而 Meltdown 把後面不應該執行到 code 放上這段程式碼 (這是 Intel syntax assembly):

其中 mov al, byte [rcx] 應該要做記憶體檢查,確認使用者是否有權限存取那個位置。但這邊因為連記憶體檢查也拆成 µOP 平行跑,而產生 race condition:

Meltdown is some form of race condition between the fetch of a memory address and the corresponding permission check for this address.

而這導致後面這段不該被執行到的程式碼會先讀到資料放進 al register 裡。然後再去存取某個記憶體位置造成某塊記憶體位置被讀到 cache 裡。

造成 cache 內的資料改變後,就可以透過 FLUSH+RELOAD 技巧 (side channel) 而得知這段程式碼讀了哪一塊資料 (參考之前寫的「Meltdown 與 Spectre 都有用到的 FLUSH+RELOAD」),於是就能夠推出 al 的值...

而 Meltdown 在 mov al, byte [rcx] 這邊之所以可以成立,另外一個需要突破的地方是 [rcx]。這邊 [rcx] 存取時就算沒有權限檢查,在 virtual address 轉成 physical address 時應該會遇到問題?

原因是 LinuxOS X 上有 direct-physical map 的機制,會把整塊 physical memory 對應到 virtual memory 的固定位置上,這些位置不會再發給 user space 使用,所以是通的:

On Linux and OS X, this is done via a direct-physical map, i.e., the entire physical memory is directly mapped to a pre-defined virtual address (cf. Figure 2).

而在 Windows 上則是比較複雜,但大部分的 physical memory 都有對應到 kernel address space,而每個 process 裡面也都還是有完整的 kernel address space (只是受到權限控制),所以 Meltdown 的攻擊仍然有效:

Instead of a direct-physical map, Windows maintains a multiple so-called paged pools, non-paged pools, and the system cache. These pools are virtual memory regions in the kernel address space mapping physical pages to virtual addresses which are either required to remain in the memory (non-paged pool) or can be removed from the memory because a copy is already stored on the disk (paged pool). The system cache further contains mappings of all file-backed pages. Combined, these memory pools will typically map a large fraction of the physical memory into the kernel address space of every process.

這也是 workaround patch「Kernel page-table isolation」的原理 (看名字大概就知道方向了),藉由將 kernel 與 user 的區塊拆開來打掉 Meltdown 的攻擊途徑。

而 AMD 的硬體則是因為 mov al, byte [rcx] 這邊權限的檢查並沒有放進 out-of-order execution,所以就避開了 Meltdown 攻擊中很重要的一環。

Meltdown 與 Spectre 都有用到的 FLUSH+RELOAD

MeltdownSpectre 攻擊裡都有用到的 FLUSH+RELOAD 技巧。這個技巧是出自於 2013 年的「Flush+Reload: a High Resolution, Low Noise, L3 Cache Side-Channel Attack」。當時還因此對 GnuPG 發了一個 CVE-2013-4242

FLUSH+RELOAD 是希望透過 shared memory & cache 得到 side channel information,藉此突破安全機制。

論文裡面提到兩個攻擊模式,一種是在同一個 OS 裡面 (same-OS),另外一種是在同一台機器,但是是兩個不同的 VM (cross-VM)。攻擊的前提是要拿到與 GnuPG process 相同的 shared memory。兩個環境的作法都是透過 mmap() GnuPG 的執行檔以取得 shared memory。

在 same-OS 的情況下會使用同一個 process:

To achieve sharing, the spy mmaps the victim’s executable file into the spy’s virtual address space. As the Linux loader maps executable files into the process when executing them, the spy and the victim share the memory image of the mapped file.

在 cross-VM 的情況下會因為 hypervisor 會 dedup 而產生 shared memory:

For the cross-VM scenario we used two different hypervisors: VMware ESXi 5.1 on the HP machine and Centos 6.5 with KVM on the Dell machine. In each hypervisor we created two virtual machines, one for the victim and the other for the spy. The virtual machines run CentOS 6.5 Linux. In this scenario, the spy mmaps a copy of the victim’s executable file. Sharing is achieved through the page de-duplication mechanisms of the hypervisors.

接下來就能夠利用 cache 表演了。基本原理是「存取某一塊記憶體內容,然後計算花了多久取得,就能知道這次存取是從 L1、L2、L3 還是記憶體取得」。所以 FLUSH+RELOAD 就設計了三個步驟:

  • During the first phase, the monitored memory line is flushed from the cache hierarchy.
  • The spy, then, waits to allow the victim time to access the memory line before the third phase.
  • In the third phase, the spy reloads the memory line, measuring the time to load it.

先 flush 掉要觀察的記憶體位置 (用 clflush),然後等待一小段時間,接著掃記憶體區塊,透過時間得知有哪些被存取過 (就會比較快)。這邊跟 cache 架構有關,你不能想要偷看超過 cache 大小的量 (這樣會被 purge 出去),所以通常是盯著關鍵的部份就好。

接著是要搞 GnuPG,先看他在使用 RSA private key 計算的程式碼:

而依照這段程式碼挑好位置觀察後,就開始攻擊收資訊。隨著時間變化就可以看到這樣的資訊:

然後可以觀察出執行的順序:

於是就能夠依照執行順序推敲出 RSA key 了,而實際測試的成果是這樣,在一次的 decrypt 或是 sign 就把 RSA key 還原的差不多了 (96.7%):

We demonstrate the efficacy of the FLUSH+RELOAD attack by using it to extract the private encryption keys from a victim program running GnuPG 1.4.13. We tested the attack both between two unrelated processes in a single operating system and between processes running in separate virtual machines. On average, the attack is able to recover 96.7% of the bits of the secret key by observing a single signature or decryption round.

知道了這個方法後,看 Meltdown 或是 Spectre 才會知道他們用 FLUSH+RELOAD 的原因... (因為在 Meltdown 與 Spectre 裡面就只有帶過去)

Intel CEO 做的真不錯 XDDD

在發生爆發前一個月把自家 Intel 的股票賣到最低限度 XDDD:「Intel was aware of the chip vulnerability when its CEO sold off $24 million in company stock」,引用的新聞是「Intel's CEO Just Sold a Lot of Stock」:

On Nov. 29, Brian Krzanich, the CEO of chip giant Intel (NASDAQ:INTC), reported several transactions in Intel stock in a Form 4 filing with the SEC.

所以十一月底的時候賣掉... 只保留 CEO 最低限額 250 張:

Those two transactions left Krzanich with exactly 250,000 shares -- the bare minimum that he's required to hold as CEO.

來看看獲利會不會被追回 XDDD

Linus (又) 不爽了... XD

看得出來 Linus 對於 Intel 的行為很不爽:「Re: Avoid speculative indirect calls in kernel」。

Please talk to management. Because I really see exactly two possibibilities:

 - Intel never intends to fix anything

OR

 - these workarounds should have a way to disable them.

Which of the two is it?

那個 possibibilities 應該是 typo,但不知道為什麼看起來很有味道 XDDD

Spectre 與 Meltdown 兩套 CPU 的安全漏洞

The Register 發表了「Kernel-memory-leaking Intel processor design flaw forces Linux, Windows redesign」這篇文章,算是頗完整的說明了這次的安全漏洞 (以 IT 新聞媒體標準來看),引用了蠻多資料並且試著說明問題。

而這也使得整個事情迅速發展與擴散超出本來的預期,使得 GoogleProject Zero 提前公開發表了 Spectre 與 Meltdown 這兩套 CPU 安全漏洞。文章非常的長,描述的也比 The Register 那篇還完整:「Reading privileged memory with a side-channel」。

在 Google Project Zero 的文章裡面,把這些漏洞分成三類,剛好依據 CVE 編號分開描述:

  • Variant 1: bounds check bypass (CVE-2017-5753)
  • Variant 2: branch target injection (CVE-2017-5715)
  • Variant 3: rogue data cache load (CVE-2017-5754)

前兩個被稱作 Spectre,由 Google Project Zero、Cyberus Technology 以及 Graz University of Technology 三個團隊獨立發現並且回報原廠。後面這個稱作 Meltdown,由 Google Project Zero 與另外一個團隊獨立發現並且回報原廠。

這兩套 CPU 的安全漏洞都有「官網」,網址不一樣但內容一樣:spectreattack.commeltdownattack.com

影響範圍包括 IntelAMD 以及 ARM,其中 AMD 因為架構不一樣,只有在特定的情況下會中獎 (在使用者自己打開 eBPF JIT 後才會中):

(提到 Variant 1 的情況) If the kernel's BPF JIT is enabled (non-default configuration), it also works on the AMD PRO CPU.

這次的洞主要試著透過 side channel 資訊讀取記憶體內容 (會有一些條件限制),而痛點在於修正 Meltdown 的方式會有極大的 CPU 效能損失,在 Linux 上對 Meltdown 的修正的資訊可以參考「KAISER: hiding the kernel from user space」這篇,裡面提到:

KAISER will affect performance for anything that does system calls or interrupts: everything. Just the new instructions (CR3 manipulation) add a few hundred cycles to a syscall or interrupt. Most workloads that we have run show single-digit regressions. 5% is a good round number for what is typical. The worst we have seen is a roughly 30% regression on a loopback networking test that did a ton of syscalls and context switches.

KAISER 後來改名為 KPTI,查資料的時候可以注意一下。

不過上面提到的是實體機器,在 VM 裡面可以預期會有更多 syscall 與 context switch,於是 Phoronix 測試後發現在 VM 裡效能的損失比實體機器大很多 (還是跟應用有關,主要看應用會產生多少 syscall 與 context switch):「VM Performance Showing Mixed Impact With Linux 4.15 KPTI Patches」。

With these VM results so far it's still a far cry from the "30%" performance hit that's been hyped up by some of the Windows publications, etc. It's still highly dependent upon the particular workload and system how much performance may be potentially lost when enabling page table isolation within the kernel.

這對各家 cloud service 不是什麼好消息,如果效能損失這麼大,不太可能直接硬上 KPTI patch... 尤其是 VPS,對於平常就會 oversubscription 的前提下,KPTI 不像是可行的方案。

可以看到各 VPS 都已經發 PR 公告了 (先發個 PR 稿說我們有在注意,但都還沒有提出解法):「CPU Vulnerabilities: Meltdown & Spectre (Linode)」、「A Message About Intel Security Findings (DigitalOcean)」、「Intel CPU Vulnerability Alert (Vultr)」。

現在可以預期會有更多人投入研究,要怎麼樣用比較少的 performance penalty 來抵抗這兩套漏洞,現在也只能先等了...

2017 年 CA/Browser Forum 在台北辦的見面會議的會議記錄出爐了...

2017 年 CA/Browser Forum 在台北舉辦的見面會議,會議記錄總算是出爐了:「2017-10-04 Minutes of Face-to-Face Meeting 42 in Taipei - CAB Forum」。

由於是辦在台北,所以台灣很多單位都有出席,像是中央警察大學 (1)、中華電信 (11)、日盛聯合會計師事務所 (1)、TWCA (3):

Attendance: Peter Bowen (Amazon); Geoff Keating and Curt Spann (Apple); Jeremy Shen (Central Police University); Franck Leroy (Certinomis / Docapost); Wayne Chan and Sing-man Ho (Certizen Limited); Wen-Cheng Wang, Bon-Yeh Lin, Wen-Chun Yang, Jenhao Ou, Wei-Hao Tung, Chiu-Yun Chuang, Chung-Chin Hsiao, Chin-Fu Huang, Li-Chun Chen, Pin-Jung Chiang, and Wen-Hui Tsai (Chunghwa Telecom); Alex Wight and JP Hamilton (Cisco), Robin Alden (Comodo), Gord Beal (CPA Canada), Ben Wilson and Jeremy Rowley (DigiCert), Arno Fiedler and Enrico Entschew (D-TRUST); Kirk Hall (Entrust Datacard); Ou Jingan, Zhang Yongqiang, and Xiu Lei (GDCA); Atsushi Inaba and Giichi Ishii (GlobalSign); Wayne Thayer (GoDaddy); Devon O’Brien (Google); David Hsiu (KPMG); Mike Reilly (Microsoft); Gervase Markham and Aaron Wu (Mozilla); Hoang Trung La (National Electronic Authentication Center (NEAC) of Vietnam); Tadahiko Ito (Secom Trust Systems); Leo Grove and Fotis Loukos (SSL.com); Brian Hsiung (Sunrise CPA Firm); Steve Medin (Symantec); Frank Corday and Tim Hollebeek (Trustwave); Robin Lin, David Chen, and Huang Fu Yen (TWCA); and Don Sheehy and Jeff Ward (WebTrust).

開頭有提到會議記錄 delay 的情況:

Preliminary Note: The CA/Browser Forum was delayed in completing the minutes for its last Face-to-Face meeting Oct. 4-5, 2017 in Taipei, and the proposed final Minutes were only sent by the Chair to the Members on December 13, 2017 for their review. There was not enough time for Members to review the draft before the next teleconference of December 14, and the teleconference of December 28 was cancelled due to the holidays. The next Forum teleconference is scheduled for January 11, 2018.

會議記錄很長,主要是有不少主題被拿到見面會議上討論,另外有一半的篇幅是在說明各家 root program policy 的變化。

下次的見面會議會在三月,然後會由 Amazon 辦在東岸:

Peter confirmed the next F2F meeting will be hosted by Amazon on March 6-8, 2018 at its Herndon, Virginia location. More information will be provided in the coming months.

也是拿來掃 PHP 程式碼的 PHPStan...

PHPStan 也是 PHP 的靜態分析工具,官方給的 slogan 是「PHP Static Analysis Tool - discover bugs in your code without running it!」。然後官方給了一個 GIF,直接看就大概知道在幹什麼了:

Phan 類似,也是要 PHP 7+ 才能跑,不過實際測試發現不像 Phan 需要 php-ast

PHPStan requires PHP ^gt;= 7.0. You have to run it in environment with PHP 7.x but the actual code does not have to use PHP 7.x features. (Code written for PHP 5.6 and earlier can run on 7.x mostly unmodified.)

PHPStan works best with modern object-oriented code. The more strongly-typed your code is, the more information you give PHPStan to work with.

Properly annotated and typehinted code (class properties, function and method arguments, return types) helps not only static analysis tools but also other people that work with the code to understand it.

拿上一篇「用 Phan 檢查 PHP 程式的正確性」的例子測試,也可以抓到類似的問題:

vendor/bin/phpstan analyse -l 7 src/
 1/1 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%

 ------ --------------------------------------------------------------
  Line   src/Foo.php
 ------ --------------------------------------------------------------
  13     Method Gslin\Foo::g() should return string but returns null.
 ------ --------------------------------------------------------------


 [ERROR] Found 1 error

這樣總算把積壓在 tab 上關於 PHP 工具都寫完了,之後要用才有地方可以翻... XD

用 Phan 檢查 PHP 程式的正確性

Phan 這套也是拿來檢查 PHP 程式用的,也是儘量避免丟出 false alarm。不過 Phan 只能用在 PHP 7+ 環境,原因是使用 php-ast,另外有一些額外建議要裝的套件:

This version (branch) of Phan depends on PHP 7.1.x with the php-ast extension (0.1.5 or newer, uses AST version 50) and supports PHP version 7.1+ syntax. Installation instructions for php-ast can be found here. For PHP 7.0.x use the 0.8 branch. Having PHP's pcntl extension installed is strongly recommended (not available on Windows), in order to support using parallel processes for analysis (or to support daemon mode).

最新版還只能跑在 PHP 7.2 上面,用的時候要注意一下 XD (我在測試時,require-dev 指定 0.11.0,結果被說只有 PHP 7.1 不符合 dependency,後來放 * 讓他去抓適合的版本)

像是這樣的程式碼:

class Foo
{
    /**
     * @param string $p
     * @return string
     */    function g($p) {
        if (!$p) {
            return null;
        }
        return $p;
    }
}

就會產生出對應的警告訊息:

src/Foo.php:13 PhanTypeMismatchReturn Returning type null but g() is declared to return string

也是掛進 CI 裡面的好東西...

用 Psalm 掃出 PHP 有問題的程式碼

Psalm 的 slogan 是「A static analysis tool for PHP」,由 Vimeo 發展並開放出來的軟體:「vimeo/psalm」。

目前是 v0.3.71,所以需要 PHP 5.6 以上才能跑:

  • v0.3.x supports checking PHP 5.4 - 7.1 code, and requires PHP 5.6+ to run.
  • v0.2.x supports checking PHP 5.4 - 7.0 code and requires PHP 5.4+ to run.

Psalm 主要的目標是找出哪邊「已經發生錯誤」,而不像其他幾套的目標是「預防」,這樣可以避免過高的 false alarm...

Archives