Why the Apple M1 Is Fast and Power-Efficient

I saw an explanation on Hacker News Daily of why the Apple M1 is so fast and so power-efficient; treat it as one point of view:

It can be read on Thread Reader: "Thread by @ErrataRob on Thread Reader App – Thread Reader App".

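It looks like Apple took x86 emulation into account at the planning stage and implemented the matching mode directly in the memory architecture, which greatly reduces the problem Microsoft ran into back then on the Surface: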

3/ The biggest hurdle was "memory-ordering", the order in which two CPUs see modifications in memory by each other. It's the biggest problem affecting Microsoft's emulation of x86 on their Arm-based "Surface" laptops.

4/ So Apple simply cheated. They added Intel's memory-ordering to their CPU. When running translated x86 code, they switch the mode of the CPU to conform to Intel's memory ordering.
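To make the "memory-ordering" point a bit more concrete, here is a rough sketch of the classic message-passing litmus test. It is a toy model, not how the hardware actually works: each thread's operations may be reordered according to a simplified rule set, and the reordered threads are then interleaved. The model names "tso" (x86-like, where only a later load may pass an earlier store) and "weak" (Arm-like without barriers, where any two accesses to different locations may be reordered) and all function names are made up for illustration.

```python
from itertools import permutations

# Message passing: the writer publishes data, then raises a flag.
# Each op is (kind, location, value-or-register).
WRITER = (("store", "data", 1), ("store", "flag", 1))
READER = (("load", "flag", "r1"), ("load", "data", "r2"))


def may_swap(earlier, later, model):
    """May `later` take effect before `earlier` (program order) in this toy model?"""
    if earlier[1] == later[1]:
        return False  # same location: never reordered
    if model == "weak":
        return True  # Arm-like, no barriers: any independent pair may swap
    # "tso" (x86-like): only a later load may pass an earlier store
    return earlier[0] == "store" and later[0] == "load"


def allowed_orders(thread, model):
    """Yield every order in which one thread's ops may take effect."""
    for perm in permutations(thread):
        if all(
            thread.index(x) < thread.index(y) or may_swap(y, x, model)
            for i, x in enumerate(perm)
            for y in perm[i + 1:]
        ):
            yield perm


def interleavings(a, b):
    """Yield every interleaving of two op sequences, keeping each one's order."""
    if not a or not b:
        yield tuple(a) + tuple(b)
        return
    for rest in interleavings(a[1:], b):
        yield (a[0],) + rest
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + rest


def outcomes(model):
    """Return the set of (r1, r2) values the reader can observe."""
    results = set()
    for w in allowed_orders(WRITER, model):
        for r in allowed_orders(READER, model):
            for trace in interleavings(w, r):
                mem = {"data": 0, "flag": 0}
                regs = {}
                for kind, loc, arg in trace:
                    if kind == "store":
                        mem[loc] = arg
                    else:
                        regs[arg] = mem[loc]
                results.add((regs["r1"], regs["r2"]))
    return results


# (1, 0) means "saw the flag but not the data", which x86 code assumes cannot happen.
print("tso :", sorted(outcomes("tso")))
print("weak:", sorted(outcomes("weak")))
```

In this toy model the outcome (1, 0) only shows up under "weak". Translated x86 code relies on it never happening, so an emulator on a weakly ordered CPU would have to insert barriers everywhere, or, as the thread describes, the CPU can be switched into a mode that follows Intel's ordering.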

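Another interesting part of the design is that the Apple M1 has two kinds of cores with different architectures, one optimized for performance and the other for efficiency: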

13/ Apple's strategy is to use two processors: one designed to run fast above 3 GHz, and the other to run slow below 2 GHz. Apple calls this their "performance" and "efficiency" processors. Each optimized to be their best at their goal.

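The Wikipedia article also covers the differences between the two kinds of cores, such as the L1 instruction cache size (128 KB on the efficiency cores versus 192 KB on the performance cores) and the power consumption: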

The M1 has four high-performance "Firestorm" and four energy-efficient "Icestorm" cores, providing a configuration similar to ARM big.LITTLE and Intel's Lakefield processors. This combination allows power-use optimizations not possible with Apple–Intel architecture devices. Apple claims the energy-efficient cores use one tenth the power of the high-performance ones. The high-performance cores have 192 KB of instruction cache and 128 KB of data cache and share a 12 MB L2 cache; the energy-efficient cores have a 128 KB instruction cache, 64 KB data cache, and a shared 4 MB L2 cache. The Icestorm "E cluster" has a frequency of 0.6–2.064 GHz and a maximum power consumption of 1.3 W. The Firestorm "P cluster" has a frequency of 0.6–3.204 GHz and a maximum power consumption of 13.8 W.
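As a quick sanity check on the "one tenth the power" claim, the cluster figures quoted above can be divided directly. This is only back-of-the-envelope arithmetic on the quoted maximums (which are per cluster, while Apple's claim is about the cores), and the names below are made up for illustration.

```python
# Maximum power figures from the Wikipedia excerpt quoted above.
P_CLUSTER_MAX_W = 13.8  # Firestorm "P cluster"
E_CLUSTER_MAX_W = 1.3   # Icestorm "E cluster"


def power_ratio(p_watts, e_watts):
    """How many times more power the P cluster draws at its maximum."""
    return p_watts / e_watts


print(f"P/E max power ratio: {power_ratio(P_CLUSTER_MAX_W, E_CLUSTER_MAX_W):.1f}x")
```

The ratio comes out a little above 10, which is at least consistent with the "one tenth" figure.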

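Add the other architectural improvements (such as instructions aimed at JavaScript, the larger L1, and the use of TSMC's latest process), and the cumulative result is one that grinds the Intel version into the ground...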

Amazon EC2 Adds T2 Instances

Amazon EC2 has added the new T2 instances: "New Low Cost EC2 Instances with Burstable Performance".

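The T2 family comes in three sizes: t2.micro (1GB), t2.small (2GB), and t2.medium (4GB). Going by the us-east-1 price, t2.micro costs only slightly more than t1.micro (USD$0.013/hour versus USD$0.012/hour), but it has a good deal more memory (1GB versus 640MB).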

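They also introduced CPU Credits as a way of accounting for CPU time, and an instance can accumulate up to 24 hours' worth of CPU Credits. It makes me wonder: does the fact that AWS can offer this mean they already have some kind of live migration, like VMware's vMotion? With 1GB of RAM on a 10Gbps link, copying the contents of RAM really does take less than a second...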
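The "less than a second" estimate is easy to check. The sketch below is just the raw arithmetic, assuming the link is fully used and ignoring protocol overhead and pages that get dirtied during the copy; the function name is made up for illustration.

```python
def transfer_seconds(ram_bytes, link_bits_per_second):
    """Ideal time to push ram_bytes over a link, with no overhead at all."""
    return ram_bytes * 8 / link_bits_per_second


GIB = 2**30             # 1 GiB of RAM, as on a t2.micro
LINK_10GBPS = 10 * 10**9

seconds = transfer_seconds(1 * GIB, LINK_10GBPS)
print(f"1 GiB over 10 Gbps: {seconds:.2f} s")  # about 0.86 s
```

A real live migration would need more than one pass over memory, so this is a lower bound rather than an actual migration time.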

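The CPU Credits mechanism attacks the problem from a somewhat different direction than auto scaling, but it is a pretty decent approach too... The two should work well as a combo :p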
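CPU Credits behave roughly like a token bucket, which can be sketched as below. The numbers are assumptions that are not stated in this post: a t2.micro is taken to earn 6 credits per hour, one credit is taken to be one minute of a full vCPU, and the cap of 144 is simply 24 hours of earnings. The function is a made-up, hour-granularity simulation, not an AWS API.

```python
def simulate(hourly_cpu_minutes, earn_per_hour=6.0, max_balance=144.0, start=0.0):
    """Simulate a CPU credit bucket hour by hour.

    hourly_cpu_minutes: full-speed CPU minutes the workload wants in each hour.
    Returns a list of (minutes_granted, balance_after) per hour.
    Simplification: credits are earned (and capped) at the start of each hour.
    """
    balance = start
    history = []
    for wanted in hourly_cpu_minutes:
        balance = min(balance + earn_per_hour, max_balance)  # earn, capped at 24h worth
        used = min(wanted, balance)                          # cannot spend more than we have
        balance -= used
        history.append((used, balance))
    return history


# Idle for 30 hours (the balance stops growing at 144), then burst for a full hour.
print(simulate([0] * 30 + [60])[-1])
# A constantly busy instance only ever gets its hourly earnings.
print(simulate([60] * 3))
```

Under these assumptions, an idle instance banks enough credits for more than two hours of full-speed CPU, while a constantly busy one is held to 6 minutes per hour, which is why this complements auto scaling instead of replacing it.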

Another notable bit is the note at the end of the announcement about m1.small and m1.medium. t2.{small,medium} are positioned as successors (or one of the successors?) to m1.{small,medium}:

  • t1.micro to t2.micro
  • m1.small to t2.small
  • m1.medium to t2.medium

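m3.medium was previously regarded as the successor to m1.medium. Although these are all General Purpose instances, it looks like AWS plans to split the category into several types to cover different kinds of workloads.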

Micro Instances Can Now Be Launched in AWS VPC...

Until now, the smallest machine you could launch in an AWS VPC (Virtual Private Cloud) was m1.small, and AWS has finally announced that the smaller t1.micro can be launched in a VPC: "Amazon VPC now Supports Micro Instances" and "Launch EC2 Micro Instances in a Virtual Private Cloud".

Which makes it a handy fit for a VPC NAT server...