Google 提出的 GoogleSQL (Pipe 版本的 SQL 改良)

看到「SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL」這個研究投稿,PDF 檔案在 1004848.pdf 這邊。

Google 提出了 GoogleSQL 改善本來 SQL 的可讀性問題,另外也對 SQL optimizer 更有幫助。

直接拿 PDF 裡面的例子來說明,把本來是這樣的 SQL:

SELECT c_count, COUNT(*) AS custdist
FROM
  ( SELECT c_custkey, COUNT(o_orderkey) c_count
    FROM customer
    LEFT OUTER JOIN orders ON c_custkey = o_custkey
         AND o_comment NOT LIKE '%unusual%packages%'
    GROUP BY c_custkey
  ) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC, c_count DESC;

變成這樣的:

FROM customer
|> LEFT OUTER JOIN orders ON c_custkey = o_custkey
AND o_comment NOT LIKE '%unusual%packages%'
|> AGGREGATE COUNT(o_orderkey) c_count
GROUP BY c_custkey
|> AGGREGATE COUNT(*) AS custdist
GROUP BY c_count
|> ORDER BY custdist DESC, c_count DESC;

可以看到這個語法除了變好讀以外,也指示了 SQL optimizer 怎麼去過濾與組合資料。

依照 Google 的說明,GoogleSQL 已經在許多 Google 的系統上實作了:

We’ve implemented pipe syntax in GoogleSQL, the SQL dialect and implementation shared across all1 SQL systems at Google including F1, BigQuery, Spanner and Procella, and the open-source release, ZetaSQL. GoogleSQL is a shared, reusable component, enabling many systems to share the same SQL dialect. This shared component allowed implementing pipe syntax in one place and then enabling it across many products.

裡面提到的 ZetaSQL 可以在 pipe-syntax.md 這邊看到。

我記得其他家也有類似的東西,PRQL 也是在解決類似的問題,不過語法走向不太一樣。

可以看看怎麼收斂...

找出並聯絡 AirPods 的失主

從「Did you lose your AirPods? (alexyancey.com)」這邊看到的,原文是「Did you lose your AirPods?」。

如同標題提到的,作者朋友撿到 AirPods,接上 iPhone 後能得到的資訊除了序號以外只有失主的電話末四碼了,不知道怎麼聯絡失主。

接下來作者開始想辦法,先假設失主是在同一個州,這樣電話的區碼 (前三碼) 就會一樣了,中間三碼的排列組合就只剩下 1000 組 (作者寫 999):

I started with the assumption that the owner lived near me in the Portland metropolitan area. With that, I restricted the search to our local area code*. Sure, they could be from out of town, but hey, let's give it a shot.

接著是透過公開的資料庫查有哪些電話號碼是有效,而且屬於行動網路的,這樣就降到 232 組:

Next, I narrowed it down by central office code (commonly called prefix) (those three digits after the area code). Most of Portland’s are assigned, but only 26% to wireless carriers. Also, 000-199 are reserved codes that aren't available for telcos. I lied earlier, sorry.

再接著,既然是 Apple 生態系的,作者決定用 iMessage Lookup API 去掃,這樣就剩下 84 組:

It's a safe bet that the owner has an iPhone with iMessage turned on. We can use this assumption to narrow down the list further by filtering out any non-iMessage phone numbers. I ran a check using this API. (The API is probably used for shady stuff, but my intentions were pure.)

最後是透過 MacBook 直接打 iMessage 出去問,就不用花錢透過簡訊聯絡:

With the list whittled down, I avoided Twilio entirely by using a script on my MacBook to send iMessages in bulk. Now we wait.

最後真的找到失主並且順利還給對方了:

回頭看裡面用到的兩個小技巧 (公開資料庫的查詢與 iMessage 的查詢),有蠻濃厚的 OSINT 味道,還蠻有趣的...

Android 規劃 16KB Page Size

Hacker News 上看到 Android 在規劃 16KB Page Size 的消息:「Adding 16 kb page size to Android (googleblog.com)」,原文是「Adding 16 KB Page Size to Android」。

Android 目前主力是 4KB Page Size,但因為 ARM 有支援 16KB Page Size,所以測試後發現雖然會多用 9% 記憶體,但效能會增加不少:

Most CPUs today support a 4 KB page size and so the Android OS and applications have historically been built and optimized to run with a 4 KB page size. ARM CPUs support the larger 16 KB page size. When Android uses this larger page size, we observe an overall performance boost of 5-10% while using ~9% additional memory.

雖然說 overall performance 有提升,但在「Benefits and performance gains」這邊可以看到目前列出來的項目都是特定的行為 (像是 app 啟動,camera 啟動,或是開機時間),這點可能要再看看其他的評測。

16KB Page Size 應該是個取捨,畢竟手機不像伺服器隨便就是上百 GB RAM 甚至到 TB RAM,跑一些吃記憶體的應用用 Huge Page 拉到 MB 等級的 Page Size 比較划算...

Telegram 的 CEO Pavel Durov 在法國機場被收押

好幾個地方都有報導 Telegram 的 CEO Pavel Durov 在法國的機場被收押的事情,引 Reuters 的好了:「Telegram messaging app CEO Durov arrested in France」。

目前傳言的理由是缺乏管理,允許犯罪行為:

TF1 and BFM both said the investigation was focused on a lack of moderators on Telegram, and that police considered that this situation allowed criminal activity to go on undeterred on the messaging app.

算是剛發生的新聞,接下來幾天應該會有後續的消息...

關閉 Firefox 的廣告蒐集功能

前陣子有提到 Mozilla 併購了廣告公司後,就在 Firefox 裡面實作廣告投放的蒐集功能:「Firefox 128 實作廣告投放蒐集功能 (Ad measurement)」,然後看到「Make Firefox Private Again (make-firefox-private-again.com)」這篇,網站是 make-firefox-private-again.com

其中的設定就是把 dom.private-attribution.submission.enabled 設為 false,透過 GUI 介面改記得也是類似的效果。

但 Mozilla 已經變成廣告公司了,後續應該只會有更多動作;而廣告公司熟悉的 PR 方法應該都會被拿出來用,至於對 open source community 有沒有用就再看看了...

網站名字的梗... 應該就是 Donald TrumpMake America Great Again 口號。

OpenBSD 程式碼的忒修斯之船

在「We have reached OpenBSD of Theseus (marc.info)」這邊看到的有趣主題,連結到的是 OpenBSD 的 commit log:「CVS: cvs.openbsd.org: src」。

The greek quiz is so obscure that it is ridiculous -- noone can play this. Replace it with a new quiz about galley (ship) parts. This commit changes the *LAST UNMODIFIED ORIGINAL FILE* (meaning revision 1.1.1.1) from the original import that created OpenBSD on Oct 18, 1995. With this commit, we have completed an amusing mission of replacing the final parts of the original OpenBSD.

We have reached OpenBSD of Theseus.

這邊提到的 Theseus 是指「忒修斯之船 (Ship of Theseus)」,引用中文維基百科的說明比較好理解:

如果忒修斯的船上的木頭逐漸被替換,直到所有的木頭都不是原來的木頭,那這艘船還是原來的那艘船嗎?

回到 OpenBSD 來說,這次的這個修改據說是最後一個沒有被動過的檔案,所以在改完以後才會說「We have reached OpenBSD of Theseus.」,是個有趣的里程碑。

AWS KMS 支援 ECDH

看到「Announcing AWS KMS Elliptic Curve Diffie-Hellman (ECDH) support」這篇的介紹,AWS KMS 支援 ECDH 了。

AWS 的文件「DeriveSharedSecret」這邊可以看到就是在不將 private key 暴露出來的情況下得到 ECDH 產生的 shared secret:

The private key in your KMS key pair never leaves AWS KMS unencrypted. DeriveSharedSecret returns the raw shared secret.

翻了一下其他兩個雲的 Cloud HSM 類服務,好像沒有看到 ECDH 的,不過如果是實際硬體 HSM 的話,Azure Dedicated HSM 似乎有支援,可以在 FAQ 這邊看到:

Dedicated HSM service provisions Thales Luna 7 HSM appliances.

Cryptography (ECDSA, ECDH, Ed25519, ECIES) with named, user-defined, and Brainpool curves, KCDSA

AWS KMS 畢竟是軟體基底的,要支援什麼演算法可以直接加...

Microsoft Authenticator 的長年 bug

在「Flaw has Microsoft Authenticator overwriting MFA accounts, locking users out (csoonline.com)」這邊看到的,原文在「Design flaw has Microsoft Authenticator overwriting MFA accounts, locking users out」,在講 Microsoft Authenticator (Android 版iOS 版) 這個支援 TOTP 的 MFA 程式的長年 bug... (對一般人比較好理解的,這是六位數字的動態密碼 app)

會造成無法登入的 bug 是因為透過 QR code scan 加入新的帳號時,會蓋掉既有的帳號資料,所以產生的 QR code 就無法在舊的帳號/網站上面使用了:

That’s because, due to an issue involving which fields it uses, Microsoft Authenticator often overwrites accounts when a user adds a new account via QR scan — the most common method of doing so.

原因是因為 username 相同就會蓋掉,而大多數人在不同的地方都會用同樣的 username (像是我的 gslin):

The core of the problem? Microsoft Authenticator will overwrite an account with the same username. Given the prominent use of email addresses for usernames, most users’ apps share the same username. Google Authenticator and just about every other authenticator app add the name of the issuer — such as a bank or a car company — to avoid this issue. Microsoft only uses the username.

然後 workaround 是不要用 Microsoft Authenticator,或是不要用 QR code scan:

There are multiple workarounds. The easiest is for companies to use any other authentication app. Not using the QR code scan feature — and manually entering the code — will also sidestep the issue, which doesn’t appear to arise when the authenticated accounts belong to Microsoft.

然後這個問題可以找到 2020 年開始有人抱怨,但作者測試看起來 2016 年的版本就已經是這樣了:

CSO Online found complaints of this problem dating back to 2020, but it appears to have been in place since Microsoft Authenticator was released in June 2016. (For historical context, Google was the first Authenticator app, having been launched in 2010.)

然後 Microsoft 確認有這樣的行為,但不認為是 bug 而是 feature (怎麼梗圖突然從腦袋裡冒出來...):

Microsoft confirmed the issue but said it was a feature not a bug, and that it was the fault of users or companies that use the app for authentication.

然後專欄作者找了其他專家測試其他的 app,可以發現只有 Microsoft Authenticator 的處理是 override 然後炸掉:

By the way, I’ve tested this behavior in 14 other authenticator apps so far. None of them exhibit the same collision behavior that Microsoft Authenticator does,” he added. “I gave up at 14 because at that point, it’s obvious Microsoft are the ones who are doing things poorly here.

大概是大家都懶得吵了,反正可以用 Google Authenticator 或是其他 TOTP app...

Homebrew 上的降版與鎖版

標題是在找到「vim-indent-guides doesn't work on Neovim 0.10.x (on macOS)」這個問題時用到的方法,Homebrew 在新版不支援直接透過 command line 降版 (網路上可以找到不少 Homebrew 舊版的處理方法,但已經不能用了),就變得很麻煩...

這邊以 Neovim 為例子,可以找到對應的 formulae 在「neovim — Homebrew Formulae」這邊,裡面有 Formula code 的連結指到特定的 commit id 上,我這邊則是指到 master branch 上:「neovim.rb」。

接下來透過 History 頁面可以看到這個檔案被修改的記錄:「History for Formula/n/neovim.rb - Homebrew/homebrew-core」,可以看出來 commit 後會有機器人 BrewTestBot 編 binary package 然後更新文件內的 hash 資訊 (官方稱作 Bottles)。

接下來就是要挑一個版本點進去,我這邊舉例抓的是 0.9.5 的「neovim: update 0.9.5 bottle.」這個,然後從三個點的 icon 上點開,裡面的 View file 點下去後就會出現單一檔案的畫面,裡面有 Raw 可以點,就可以抓到像是 https://raw.githubusercontent.com/Homebrew/homebrew-core/63aa44faba5b5274a1a7579510cd5a570a2cca5f/Formula/n/neovim.rb 這樣的連結了。

這邊的檔案名稱不能換,但為了測試方便,我的方法是把不同 Neovim 版本的檔案都抓下來改檔名成 neovim-0.10.1.rbneovim-0.10.0.rb 以及 neovim-0.9.5.rb,後續再用 soft link 的方式 ln -fs neovim-0.10.1.rb neovim.rb 讓 Homebrew 讀到 neovim.rb 的檔案名稱,像是這樣:

wget -O neovim-0.9.5.rb https://github.com/Homebrew/homebrew-core/blob/63aa44faba5b5274a1a7579510cd5a570a2cca5f/Formula/n/neovim.rb
wget -O neovim-0.10.0.rb https://raw.githubusercontent.com/Homebrew/homebrew-core/5990ac73d16c4544d85fc67dad4405dda1e74fd6/Formula/n/neovim.rb
wget -O neovim-0.10.1.rb https://github.com/Homebrew/homebrew-core/blob/a75fe3a7ab97f97dc9999e7390184869be73227c/Formula/n/neovim.rb
ln -sf neovim-0.9.5.rb neovim.rb

Wget 或是 cURL 抓下來後就可以要 Homebrew 安裝了,在安裝前要記得先 uninstall 現有的版本:

brew uninstall neovim
brew install -s neovim.rb

以 Neovim 來說,跑完後用 nvim --version 就可以看有沒有裝到對的版本,其他軟體應該也會有對應的版本資訊可以看。

鎖定版本以及解除鎖定比較簡單,用 pinunpin 處理就可以了:

brew pin neovim
brew unpin neovim