US Government Bans NVIDIA from Exporting High-End GPUs to China and Russia

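I saw "US Government Bans Export of Nvidia A100 and H100 GPUs to China and Russia (sec.gov)" on the Hacker News front page. NVIDIA filed a Form 8-K stating that the US government now restricts the export of the A100, the H100, and any higher-end (faster) cards and products to China (including Hong Kong) and Russia: "nvda-20220826.htm".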

First, it states that the A100, the H100, and the A100X (Ampere) are now controlled:

On August 26, 2022, the U.S. government, or USG, informed NVIDIA Corporation, or the Company, that the USG has imposed a new license requirement, effective immediately, for any future export to China (including Hong Kong) and Russia of the Company’s A100 and forthcoming H100 integrated circuits. DGX or any other systems which incorporate A100 or H100 integrated circuits and the A100X are also covered by the new license requirement.

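Next is the part covering new products: cards that match or exceed the A100's performance are also barred from export unless a license is obtained: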

The license requirement also includes any future NVIDIA integrated circuit achieving both peak performance and chip-to-chip I/O performance equal to or greater than thresholds that are roughly equivalent to the A100, as well as any system that includes those circuits.

It then mentions the military-related concerns:

A license is required to export technology to support or develop covered products. The USG indicated that the new license requirement will address the risk that the covered products may be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia. The Company does not sell products to customers in Russia.

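I've seen some reports saying AMD received a similar restriction (it is a major GPU vendor too, after all), but I didn't find anything about it under "SEC Filings"...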

Cloudflare Tries ARM Servers Again

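Back in 2018 I wrote a post about Cloudflare's progress in trying out ARM servers: 「Cloudflare 用 ARM 當伺服器的進展...」 (Cloudflare's progress on using ARM for servers). There wasn't much public news after that, until a few days ago when I saw "ARMs Race: Ampere Altra takes on the AWS Graviton2" and learned why: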

By the time we completed porting our software stack to be compatible with ARM, Qualcomm decided to exit the server business.

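So they had pretty much finished testing and had ported Cloudflare's own software stack over, but then Qualcomm decided to pull out, leaving them with no machines to run it on...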

This return to ARM brings to mind the recent Apple M1, which showed everyone what ARM can look like on desktops and laptops...

This time Cloudflare chose the Ampere Altra, a platform based on Neoverse N1. The other well-known chip built on Neoverse N1 is the AWS Graviton2, so Cloudflare compared the two:

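The Ampere Altra has 25% more cores (80 vs. the Graviton2's 64) and a 20% higher clock speed (3.0 GHz vs. 2.5 GHz). The benchmark results vary from test to test, with differences ranging anywhere from 10% to 40%.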
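As a rough sanity check on those percentages, here is a minimal Python sketch. The "cores x clock" ratio is my own naive illustration, not a metric from Cloudflare's post. It ignores memory bandwidth, cache behavior, and sustained clocks, so treat it as a back-of-the-envelope number rather than a prediction of any benchmark.

```python
def pct_increase(base, new):
    """Relative increase of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# Core counts and clock speeds as quoted in the text above.
graviton2 = {"cores": 64, "ghz": 2.5}
altra = {"cores": 80, "ghz": 3.0}

core_gain = pct_increase(graviton2["cores"], altra["cores"])
clock_gain = pct_increase(graviton2["ghz"], altra["ghz"])

# Naive aggregate: (cores * clock) of one chip divided by the other.
# This is an assumption-laden illustration, not a real throughput model.
naive_ratio = (altra["cores"] * altra["ghz"]) / (graviton2["cores"] * graviton2["ghz"])

print(f"cores: +{core_gain:.1f}%")        # cores: +25.0%
print(f"clock: +{clock_gain:.1f}%")       # clock: +20.0%
print(f"naive cores x clock: {naive_ratio:.2f}x")  # naive cores x clock: 1.50x
```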

What stands out, though, is that the Brotli level 9 result is notably poor, even though levels 8 and 10 look normal.

According to Cloudflare, they don't actually use Brotli level 7 and above, but since the test surfaced the problem, they still spent time tracking down the root cause:

Although we do not use Brotli level 7 and above when performing dynamic compression, we decided to investigate further.

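Tracing the problem back, they found it was related to page faults and pipeline backend stalls. It can be worked around, though (per the quote below, by increasing the page size), and once it is, the Altra reaches roughly the same level as the Graviton2: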

By analyzing our dataset further, we found the common underlying cause appeared to be the high number of page faults incurred at level 9. Ampere has demonstrated that by increasing the page size from 4K to 64K bytes, we can alleviate the bottleneck and bring the Ampere Altra at parity with the AWS Graviton2. We plan to experiment with large page sizes in the future as we continue to evaluate Altra.
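To get a feel for why the page size matters here, a minimal Python sketch of the arithmetic follows. The 16 MiB working set is a hypothetical number I picked for illustration, not something from Cloudflare's data, and the model assumes each page is touched for the first time exactly once, with no prefetching or fault-around by the kernel.

```python
import mmap


def pages_needed(nbytes, page_size):
    """Number of pages required to cover `nbytes` (ceiling division)."""
    return -(-nbytes // page_size)


# Hypothetical working set size, chosen only for illustration.
working_set = 16 * 1024 * 1024  # 16 MiB

# Under the one-fault-per-page assumption, the page count approximates
# the number of first-touch page faults.
for page_size in (4 * 1024, 64 * 1024):
    count = pages_needed(working_set, page_size)
    print(f"{page_size // 1024:>2} KiB pages: {count} pages to touch")

# Page size of the machine running this script, for comparison.
print("page size on this machine:", mmap.PAGESIZE)
```

Under these assumptions, going from 4K to 64K pages cuts the number of pages to touch by a factor of 16 (4096 down to 256 for this working set), which lines up with the direction of the fix described in the quote.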

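Overall things look fairly positive so far. If supply stays stable, there seems to be a good chance they'll switch over? After all, the power savings from ARM platforms are substantial, and now that the M1 has been such striking PR for ARM, the case should be easier to explain...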