Mac (M1/M2) 上的 Asahi Linux 支援 OpenGL ES 3.1

在「The first conformant M1 GPU driver」這邊看到 Mac (M1 系列與 M2 系列) 上的 Asahi Linux 支援 OpenGL ES 3.1 了。

文章裡面有提到,目前 macOS 上沒有業界標準介面可以用:

Unlike ours, the manufacturer’s M1 drivers are unfortunately not conformant for any standard graphics API, whether Vulkan or OpenGL or OpenGL ES. That means that there is no guarantee that applications using the standards will work on your M1/M2 (if you’re not running Linux).

也許有機會會看到有人 backport 回 macOS 上?

Mac 會自己改變 Desktop 位置的問題

以前好像沒遇過,換了 M1 以後才注意到 desktop 位置位自己被改變,覺得很阿雜... 找了資料才發現是個 "feature":「How to prevent Mac from changing the order of Desktops/Spaces」。

關掉就好了,網路上的資料最早出現在 2018 年左右,大概是那個時候被加進去的?

Apple M1 上跑 Linux 的 GPU driver 會動了

Hacker News Daily 上看到「Native Linux GPU Driver for Apple M1 (twitter.com/linaasahi)」這個,講 Apple M1 上有 Linux GPU driver 可以用了,原文是 Twitter 上的推:

不過從 YouTube 的影片上可以看到就只是「會動」,還有不少 rendering bug,可以看到有時候會破格,但畢竟是開始支援了,後續如果有修到穩定的話,直接的好處應該就是把瀏覽器的 rendering 丟給 GPU 處理,就像 macOS 或是 Windows 上的情況。

幾個其他 Teams 的替代方案 (但還是連到 Teams 伺服器)

這邊講的替代方案不是換掉 Teams,而是找其他的方法連上 Teams 伺服器,畢竟用 Teams 的人大多都沒得選...

在「Teams is killing my Mac every day (microsoft.com)」這邊看到的一些資料可以嘗試,裡面有很多抱怨 Teams 的問題,但還是有些人有給出一些 workaround。

大家主要遇到的問題除了 CPU 吃很兇以外,另外就是記憶體這塊。

一種方法是是用 Edge 瀏覽器的 extension 來跑,我本來想看看 Linux 上的 Brave 能不能裝,但沒有看到對應的安裝連結,大概是 Edge 限定:

If you don't want to use the Microsoft Teams app (which uses a lot of resources), you can:

1. Install the Microsoft Edge Web browser on your Mac

2. Log into https://teams.microsoft.com

3. Click ... > Apps > Install this site as an app

This will create an Edge app for Teams that uses almost no resources but has feature parity with the regular Microsoft Teams app.

We tell all of our students to do this, and it has solved all Microsoft Teams performance issues on student Macs (both Intel and Apple Silicon).

另外有人提到其實官方是有放 M1 的 preview 版本的,雖然不是正式版,但總是比 Intel 版本會好一些:

If you're running an Apple Silicon Mac you can get an early build of Teams osx-arm64 from the exploration build link listed here.[0]

I've been running a daily build for a few weeks and it's noticeably better than the Intel build on an M1 Pro. It launches in half the time and feels far more responsive (probably due to not needing to use the Rosetta JIT for Electron). That said it's still a daily "exploration" build so YMMV.

[0] https://raw.githubusercontent.com/ItzLevvie/MicrosoftTeams-msinternal/master/defconfig

據說會少吃一點點記憶體,就真的大概一點點:

Can confirm it is snappier on a M1 Macbook Pro and using *less* RAM, maybe about 10% less.

但據說這個 preview 版本在自我更新時會跳到 Intel 版本,還要再找一下 workaround 關掉自動更新:

How do you prevent it from automatically updating to the Intel version? I keep downloading the preview builds and they keep getting updated.

後面還有看到有人說他直接實體隔離,把這些肥滋滋的 app 跑在另外一台 Mac 上,然後透過 Universal Control 使用,大多數的情況下都夠用,真的有需要分享畫面時再跑在自己機器上,用完就可以關掉:

Thanks for the tip. I'll give this a try!

For work, I have to run Microsoft Teams, Slack, and Discord. Of those 3, Slack surprisingly uses the least amount of memory (~700 MB), and Teams uses the most (~1.5 GB). I dusted off an old Intel Mac (literally) and interact with it using Universal Control. It only runs those 3 chat apps + mail. It's turned out to be a great way to offload resource hogs and as an added benefit, it minimizes distractions. I'll occasionally glance at the dock to see if there are any notification badges, whereas on my main Mac, I'd feel compelled to deal with notifications immediately.

When I have to share my screen or focus on a conversation, I'll fire up one of those 3 apps on my main (M1) Mac and quit it when I'm done.

Universal Control still feels rough around the edges, but it has saved me from ditching my Macbook Air and shelling out for an M1 Macbook Pro. Sometimes there are issues with reconnecting to the Intel Mac, but it seems to resolve itself if I wait a bit or turn off/on wifi.

大家都在找方法 XDDD

Amazon EC2 有 Mac (M1) 機種可以租用了

2020 年年底的時候 AWS 推出用 Mac mini 配合搭建出 Mac (Intel) 機種:「Amazon EC2 推出 Mac Instance」,當初有計畫在 2021 年推出 M1 的版本:

Apple M1 Chip – EC2 Mac instances with the Apple M1 chip are already in the works, and planned for 2021.

不過就沒什麼意外的 delay 了,這次則是推出了 M1 的版本:「New – Amazon EC2 M1 Mac Instances」。

依照說明看起來還是 Mac mini,掛上 AWS Nitro System

EC2 Mac instances are dedicated Mac mini computers attached through Thunderbolt to the AWS Nitro System, which lets the Mac mini appear and behave like another EC2 instance.

然後跟 Intel 版本一樣,因為是掛進 Dedicated Hosts 的計價方式,雖然是以秒計費,但還是設定最低 24 小時的租用時間限制:

Amazon EC2 Mac instances are available as Dedicated Hosts through both On Demand and Savings Plans pricing models. The Dedicated Host is the unit of billing for EC2 Mac instances. Billing is per second, with a 24-hour minimum allocation period for the Dedicated Host to comply with the Apple macOS Software License Agreement. At the end of the 24-hour minimum allocation period, the host can be released at any time with no further commitment.

Intel 版本代號是 mac1,只有一種機種 mac1.metal,M1 版本代號是 mac2,也只有一種機種 mac2.metal

以最經典的美東一區 us-east-1 來看,mac1.metal 的 on-demand 價錢是 US$1.083/hour,mac2.metal 則是 US$0.65/hour,差不多是 60% 的價錢,便宜不少,大概是反應在硬體攤提與電費成本上了。

另外目前大家用 M1 的經驗來看,Rostta 2 未必會比原生的機器慢多少,雖然 mac1.metal 是 12 cores,mac2.metal 是 8 core,但以雲上面一定要用 Mac 跑的應用來說,馬上想的到的還是綁在 Apple 環境裡 CI 類的應用?

目前看起來主要的問題還是 24 小時的最小計費單位讓彈性低不少...

在非 4K 的螢幕上跑 HiDPI

前幾天看到 BetterDummy 這個專案,作者在 M1 上面外接 24" 1440p 的螢幕,但沒辦法啟用 HiDPI,於是就寫了一個軟體來解:

M1 macs tend to have issues with custom resolutions. Notoriously they don't allow sub-4K resolution displays to have HiDPI ("Retina") resolutions even though (for example) a 24" QHD 1440p display would greatly benefit from having an 1920x1080 HiDPI "Retina" mode.

在這之前的解法都有些麻煩,一種是買個 dummy dongle 去騙 macOS,另外是用 mirror 的方式使用:

To fix this issue, some resort to buying a 4K HDMI dummy dongle to fool macOS into thinking that a 4K display is connected and then mirror the contents of this dummy display to their actual monitor in order to have HiDPI resolutions available. Others use the built in screens of their MacBooks to mirror to the external display. These approaches have obvious drawbacks and cannot solve all problems.

作者提供的軟體可以先建立 Dummy Monitor,然後再透過 mirror 掛到實際螢幕上:

不確定用起來如何,但如果之後有需要的話好像可以測試看看...

居然在安全性漏洞的 PoC 上面看到拿 Bad Apple!! 當作範例

人在日本的資安專家 Hector Martin 找到了 Apple M1 的安全漏洞,可以不用透過 macOS Big Sur 提供的界面,直接透過 M1 的漏洞跨使用者權限傳輸資料,這可以用在突破 sandbox 的限制。而也如同目前的流行,他取了一個好記的名字:「M1RACLES: M1ssing Register Access Controls Leak EL0 State」,對應的 CVECVE-2021-30747

先講比較特別的點,PoC 的影片放在 YouTube 上,作者拿 Bad Apple!! 當作示範,這很明顯是個雙關的點:

這應該是當年的影繪版本,看了好懷念啊... 當年看到的時候有種「浪費才能」的感覺,但不得不說是個經典。

Hacker News 上有討論可以翻翻:「M1racles: An Apple M1 covert channel vulnerability (m1racles.com)」。

依照作者的說明,Apple A14 因為架構類似,也有類似的問題,不過作者沒有 iPhone,沒辦法實際測試:

Are other Apple CPUs affected?

Maybe, but I don't have an iPhone or a DTK to test it. Feel free to report back if you try it. The A14 has been confirmed as also affected, which is expected, as it is a close relative of the M1.

另外作者覺得這個安全漏洞在 macOS 上還好,主要是你系統都已經被打穿可以操控 s3_5_c15_c10_1 register 了,應該會有更好的方式可以用:

So you're telling me I shouldn't worry?

Yes.

What, really?

Really, nobody's going to actually find a nefarious use for this flaw in practical circumstances. Besides, there are already a million side channels you can use for cooperative cross-process communication (e.g. cache stuff), on every system. Covert channels can't leak data from uncooperative apps or systems.

Actually, that one's worth repeating: Covert channels are completely useless unless your system is already compromised.

比較明顯的問題應該是 iOS 這邊的 privacy issue,不過 iOS 上的 app store 有基本的保護機制:(不過想到作者可以故意寫成 RCE 漏洞...)

What about iOS?

iOS is affected, like all other OSes. There are unique privacy implications to this vulnerability on iOS, as it could be used to bypass some of its stricter privacy protections. For example, keyboard apps are not allowed to access the internet, for privacy reasons. A malicious keyboard app could use this vulnerability to send text that the user types to another malicious app, which could then send it to the internet.

However, since iOS apps distributed through the App Store are not allowed to build code at runtime (JIT), Apple can automatically scan them at submission time and reliably detect any attempts to exploit this vulnerability using static analysis (which they already use). We do not have further information on whether Apple is planning to deploy these checks (or whether they have already done so), but they are aware of the potential issue and it would be reasonable to expect they will. It is even possible that the existing automated analysis already rejects any attempts to use system registers directly.

Google 釋出網頁版的 Spectre 攻擊 PoC,包括 Apple M1 在內

在大約三年前 (2018 年年初) 的時候,在讀完 Spectre 之後寫下了一些記錄:「讀書時間:Spectre 的攻擊方式」,結果在 Bruce Schneier 這邊看到消息,Google 前幾天把把 PoC 放出來了:「Exploiting Spectre Over the Internet」,在 Hacker News 上也有討論:「A Spectre proof-of-concept for a Spectre-proof web (googleblog.com)」。

首先是這個攻擊方法在目前的瀏覽器都還有用,而且包括 Apple M1 上都可以跑:

The demonstration website can leak data at a speed of 1kB/s when running on Chrome 88 on an Intel Skylake CPU. Note that the code will likely require minor modifications to apply to other CPUs or browser versions; however, in our tests the attack was successful on several other processors, including the Apple M1 ARM CPU, without any major changes.

即使目前的瀏覽器都已經把 performance.now() 改為 1ms 的精度,也還是可以達到 60 bytes/sec 的速度:

While experimenting, we also developed other PoCs with different properties. Some examples include:

  • A PoC which can leak 8kB/s of data at a cost of reduced stability using performance.now() as a timer with 5μs precision.
  • A PoC which leaks data at 60B/s using timers with a precision of 1ms or worse.

比較苦的消息是 Google 已經確認在軟體層沒辦法解乾淨,目前在瀏覽器上只能靠各種 isolation 降低風險,像是將不同站台跑在不同的 process 裡面:

In 2019, the team responsible for V8, Chrome’s JavaScript engine, published a blog post and whitepaper concluding that such attacks can’t be reliably mitigated at the software level. Instead, robust solutions to these issues require security boundaries in applications such as web browsers to be aligned with low-level primitives, for example process-based isolation.

Apple M1 也中這件事情讓人比較意外一點,看起來是當初開發的時候沒評估?目前傳言的 M1x 與 M2 不知道會怎樣...

Cloudflare 再次嘗試 ARM 伺服器

2018 年的時候寫過一篇 Cloudflare 在嘗試 ARM 伺服器的進展:「Cloudflare 用 ARM 當伺服器的進展...」,後來就沒有太多公開的消息,直到這幾天看到「ARMs Race: Ampere Altra takes on the AWS Graviton2」才看到原因:

By the time we completed porting our software stack to be compatible with ARM, Qualcomm decided to exit the server business.

所以是都測差不多,也都把 Cloudflare 自家的軟體搬上去了,但 Qualcomm 也決定收手,沒機器可以用...

這次再次踏入 ARM 領域讓人想到前陣子 AppleM1,讓大家看到 ARM 踏入桌機與筆電領域可以是什麼樣貌...

這次 Cloudflare 選擇了 Ampere Altra,這是基於 Neoverse N1 的平台,而這個平台的另外一個知名公司就是 AWSGraviton2,所以就拿來比較:

可以看到 Ampere Altra 的核心數多了 25% (64 vs. 80),運作頻率多了 20% (2.5Ghz vs. 3.0Ghz)。測試的結果也都有高有低,落在 10%~40% 都有。

不過其中比較特別的是 Brotli - 9 的測試特別差 (而且是 8 與 10 都正常的情況下):

依照 Cloudflare 的說法,他們其實不會用到 Brotli - 7 以及更高的等級,不過畢竟有測出來,還是花了時間找一下根本原因:

Although we do not use Brotli level 7 and above when performing dynamic compression, we decided to investigate further.

反追問題後發現跟 Page Faults 以及 Pipeline Backend Stalls 有關,不過是可以改寫避開,在避開後可以達到跟 Graviton2 類似的水準:

By analyzing our dataset further, we found the common underlying cause appeared to be the high number of page faults incurred at level 9. Ampere has demonstrated that by increasing the page size from 4K to 64K bytes, we can alleviate the bottleneck and bring the Ampere Altra at parity with the AWS Graviton2. We plan to experiment with large page sizes in the future as we continue to evaluate Altra.

但目前看起來應該都還算正向,看起來供貨如果穩定的話,應該有機會換過去?畢竟 ARM 平台可以省下來的電力太多了,現在因為 M1 對 ARM 的公關效果太驚人的關係,解釋起來會更輕鬆...

Apple M1 的效能與省電原因

Hacker News Daily 上看到 Apple M1 為什麼這麼快又省電的解釋,可以當作一種看法:

可以在 Thread reader 上面讀:「Thread by @ErrataRob on Thread Reader App – Thread Reader App」。

看起來 Apple 在規劃的時候就有考慮 x86 模擬問題,所以在記憶體架構上直接實做了對應的模式,大幅降低了當年 MicrosoftSurface 上遇到的問題:

3/ The biggest hurdle was "memory-ordering", the order in which two CPUs see modifications in memory by each other. It's the biggest problem affecting Microsoft's emulation of x86 on their Arm-based "Surface" laptops.

4/ So Apple simply cheated. They added Intel's memory-ordering to their CPU. When running translated x86 code, they switch the mode of the CPU to conform to Intel's memory ordering.

另外一個比較有趣的架構是,Apple M1 上面的兩個 core 有不同的架構,一顆對效能最佳化,另外一顆對效率最佳化:

13/ Apple's strategy is to use two processors: one designed to run fast above 3 GHz, and the other to run slow below 2 GHz. Apple calls this their "performance" and "efficiency" processors. Each optimized to be their best at their goal.

在 wikipedia 上的介紹也有提到這兩個 core 的不同,像是 L1 cache 的差異 (128KB 與 192KB),以及功耗的差異:

The M1 has four high-performance "Firestorm" and four energy-efficient "Icestorm" cores, providing a configuration similar to ARM big.LITTLE and Intel's Lakefield processors. This combination allows power-use optimizations not possible with Apple–Intel architecture devices. Apple claims the energy-efficient cores use one tenth the power of the high-performance ones. The high-performance cores have 192 KB of instruction cache and 128 KB of data cache and share a 12 MB L2 cache; the energy-efficient cores have a 128 KB instruction cache, 64 KB data cache, and a shared 4 MB L2 cache. The Icestorm "E cluster" has a frequency of 0.6–2.064 GHz and a maximum power consumption of 1.3 W. The Firestorm "P cluster" has a frequency of 0.6–3.204 GHz and a maximum power consumption of 13.8 W.

再加上其他架構上的改善 (像是針對 JavaScript 的指令集、L1 的提昇,以及用 TSMC 最新製程),累積起來就變成把 Intel 版本壓在地上磨蹭的結果了...