NVIDIA Open-Sources Its Linux GPU Kernel Driver

NVIDIA announced that it is open-sourcing its Linux GPU kernel driver: "NVIDIA Releases Open-Source GPU Kernel Modules".

From the wording, this looks like it was driven mainly by the datacenter side: in this open-source release, datacenter GPUs get production-level support, while GeForce and Workstation GPUs are flagged as alpha quality:

Which GPUs are supported by Open GPU Kernel Modules?

Open kernel modules support all Ampere and Turing GPUs. Datacenter GPUs are supported for production, and support for GeForce and Workstation GPUs is alpha quality. Please refer to the Datacenter, NVIDIA RTX, and GeForce product tables for more details (Turing and above have compute capability of 7.5 or greater).

Meanwhile, the user-mode driver stays closed source:

Will the source for user-mode drivers such as CUDA be published?

These changes are for the kernel modules; while the user-mode components are untouched. So the user-mode will remain closed source and published with pre-built binaries in the driver and the CUDA toolkit.

For nouveau this means there are pieces that can be mined out of the open-source driver, but whether that gets it to the same performance level as the proprietary driver is another question...

Some Details of Last Year's OVH Data Center Fire Have Been Revealed

In March last year I covered OVH's data center fire in "OVH 法國機房 SBG2 火災全毀" (OVH's SBG2 data center in France destroyed by fire); some of the circumstances have finally come to light.

On Hacker News I saw "OVHcloud fire report: SBG2 data center had wooden ceilings, no extinguisher, and no power cut-out", a report from the beginning of this month that mentions several details. The corresponding discussion is at "OVHcloud fire: SBG2 data center had no extinguisher, no power cut-out (datacenterdynamics.com)".

One detail is that it took firefighters three hours to cut the power, because of very strong electrical arcs outside the power room:

According to the report firefighters on the scene found electrical arcs more than one meter long flashing around the door to the power room, and it took three hours to cut off the power supply because there was no universal cut-off.

The design of the power room itself was also bad for fire containment (a wooden ceiling?), and the electrical ducts were not insulated:

The power room had a wooden ceiling designed to withstand fire for one hour, and the electrical ducts were not insulated.

On top of that, the energy-saving design included multiple channels to draw outside air through the data center for heat exchange, which in turn made the fire much harder to put out:

Once the fire escaped from the power room, it grew rapidly. The report says that "the two interior courtyards acted as fire chimneys". JDN claims the spread of fire may have been accelerated by the site's free cooling design, which is designed to encourage the flow of outside air through the building to cool the servers.

Because of the ongoing litigation, OVH is basically declining to comment on all of this...

Cloudflare Opens a PoP in Naha

Cloudflare announced a batch of new PoPs: "New cities on the Cloudflare global network: March 2022 edition".

Japan got two in this round:

  • Fukuoka, Japan
  • Naha, Japan

Fukuoka is not much of a surprise: Tokyo and Osaka already have PoPs, and distance-wise a Fukuoka PoP picks up the Kyushu population; traffic there should also be high enough to make it worth the investment.

Naha in Okinawa is the real surprise. If I had to guess, maybe it was simply negotiated together with the Fukuoka deal?

Firefox Downloads from Mozilla's Website Carry a Tracking Tag

The day before yesterday I saw the report "Each Firefox download has a unique identifier" and submitted it to Hacker News: "Each Firefox download has a unique identifier (ghacks.net)".

In short, Mozilla embeds a download token in the Firefox binary, which can later be used to track users: "[meta] Support download token".

According to the report, every downloaded binary carries a different token.

"Attached file dltoken_data_review.md — Details" answers more of the specifics, such as the token being tied to Google Analytics:

5) List all proposed measurements and indicate the category of data collection for each measurement, using the [Firefox data collection categories](https://wiki.mozilla.org/Firefox/Data_Collection) found on the Mozilla wiki.   

Measurement Description: A download token that uniquely corresponds to a Google Analytics ID
Data Collection Category: Category 4 "Highly sensitive or clearly identifiable personal data"
Tracking Bug #: Bug 1677497

I couldn't reproduce it myself (my downloads all got routed to CloudFront), but Yuliya in the comment section reproduced it through Tor:

I have tried some TOR exit nodes:

Name: Firefox Setup 98.0.1_germany.exe
Size: 55528896 bytes (52 MiB)
SHA256: 2d8164d547d8a0b02f2677c05e21a027dc625c0c1375fd34667b7d039746d400
SHA1: 71302acbee6895b84cf0dfae99050926f2db59ef

Name: Firefox Setup 98.0.1_austria.exe
Size: 55528896 bytes (52 MiB)
SHA256: a139a45dd5737ab981068ca2596b7fdfde15e5d4bc8541e0a2f07a65defd3e4e
SHA1: 28630a0aababa162ca9e7cbca51e50b76b9c3cff

I have labeled the file for the corresponding country of the exit node.
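
A minimal way to reproduce the check yourself is to download the same installer twice and compare hashes, like Yuliya did above. A quick sketch in Python; note that the download URL and its parameters are my assumption of Mozilla's bouncer-style endpoint, not something taken from the report:

import hashlib
import urllib.request

# Assumed bouncer-style URL for the Windows stub installer; the
# product/os/lang parameters are illustrative and may need adjusting.
URL = "https://download.mozilla.org/?product=firefox-stub&os=win&lang=en-US"

def sha256_of(url):
    # Download url and return the SHA256 hex digest of the body.
    with urllib.request.urlopen(url) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

# If each download embeds a unique token, the two digests will differ
# even though the installer version is identical.
print(sha256_of(URL))
print(sha256_of(URL))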

If you are not willing to switch to a Chromium-based option, the alternative that keeps coming up in the discussions is LibreWolf. I installed it yesterday and it seems fine; worth testing...

Some Discussion of the Amazon EFS Performance Improvements

The previous post, "Amazon EFS 的效能提昇", covered the Amazon EFS performance improvements; on Hacker News, a PMT (Product-Manager-Technical) from the Amazon EFS team showed up to answer questions: "Amazon Elastic File System Update – Sub-Millisecond Read Latency (amazon.com)". Searching for geertj should turn up his replies...

For example, even though the article was published by Jeff Barr, it still had to be approved by the legal team first:

(PMT on the EFS team).

Yes, the wordings are carefully formulated as they have to be signed off by the AWS legal team for obvious reasons. With that said, this update was driven by profiling real applications and addressing the most common operations, so the benefits are real. For example, a simple WordPress "hello world" is now about 2x as fast as before.

The performance improvement itself was achieved through a cache layer:

I'm the PMT for this project in the EFS team. The "flip the switch" part was indeed one of the harder parts to get right. Happy to share some limited details. The performance improvement builds on a distributed consistent cache. You can enable such a cache in multiple steps. First you deploy the software across the entire stack that supports the caching protocol but it's disabled by configuration. Then you turn it for the multiple components that are involved in the right order. Another thing that was hard to get right was to ensure that there are no performance regressions due to the consistency protocol.

And there is a cache in every AZ:

The caches are local to each AZ so you get the low latency in each AZ, the other details are different. Unfortunately I can't share additional details at this moment, but we are looking to do a technical update on EFS at some point soon, maybe at a similar venue!

It also sounds like most of the win comes from the metadata cache:

NFS workloads are typically metadata heavy and highly correlated in time, so you can achieve very high hit rates. I can't share any specific numbers unfortunately.

Many of the specific numbers still can't be disclosed, but just knowing it is cache-based is enough to roughly imagine how the rest is put together...
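
The "deploy everywhere disabled, then flip it on" rollout described above is a common pattern in general. As a generic illustration only (not EFS's actual implementation), a config-gated read-through cache might look like this:

import time

class ConfigGatedCache:
    # Read-through cache that ships "dark": the code path is deployed
    # everywhere, but only serves cached reads once the flag is flipped.
    def __init__(self, backend, ttl_seconds=1.0):
        self.backend = backend   # authoritative store behind the cache
        self.ttl = ttl_seconds
        self.enabled = False     # deployed disabled, flipped by configuration
        self._store = {}         # key -> (value, expiry)

    def get(self, key):
        if self.enabled:
            hit = self._store.get(key)
            if hit is not None and hit[1] > time.monotonic():
                return hit[0]    # fast path: local (per-AZ style) cached read
        value = self.backend[key]  # slow path: authoritative read
        if self.enabled:
            self._store[key] = (value, time.monotonic() + self.ttl)
        return value

backend = {"/srv/file": "metadata"}
cache = ConfigGatedCache(backend)
print(cache.get("/srv/file"))  # cache still disabled: served from backend
cache.enabled = True           # "flip the switch" once everything is deployed
print(cache.get("/srv/file"))  # now populates and serves the cache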

Japan Map Tiles from the Geospatial Information Authority of Japan (国土地理院)

While looking something up, I found that Japan's 国土地理院 (Geospatial Information Authority of Japan) provides open map tiles for Japan: 「地理院タイル一覧」.

The tiles are served from cyberjapandata.gsi.go.jp, with Fastly's CDN in front:

;; ANSWER SECTION:
cyberjapandata.gsi.go.jp. 300   IN      CNAME   d.sni.global.fastly.net.
d.sni.global.fastly.net. 30     IN      A       199.232.46.133

As a test, I pulled up the tiles around 鶯谷園.

There is a huge variety of data sets, such as 「1928年頃」 (circa 1928), 「活断層図(都市圏活断層図)」 (active fault maps for urban areas), 「明治期の低湿地」 (Meiji-era lowland wetlands), and 「磁気図2010.0年値」 (geomagnetic charts, epoch 2010.0). For larger disasters there are also regional image sets, like 「平成25年7月17日からの大雨 山口地方「須佐地区」正射画像(2013年7月31日撮影)」 (orthophotos of the Susa district in Yamaguchi taken 2013-07-31, after the heavy rain that began 2013-07-17).

Also, since the URLs follow the form https://cyberjapandata.gsi.go.jp/xyz/std/{z}/{x}/{y}.png, libraries like Leaflet can consume them directly...
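
Since this is the standard XYZ ("slippy map") tile scheme, you can also compute the tile for a given longitude/latitude and fetch it directly. A small sketch; the sample coordinates are just an arbitrary point in Tokyo picked for illustration:

import math
import urllib.request

def lonlat_to_tile(lon, lat, zoom):
    # Standard Web Mercator slippy-map tile math.
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

# Arbitrary point in Tokyo, assumed for illustration.
x, y = lonlat_to_tile(139.78, 35.72, 15)
url = f"https://cyberjapandata.gsi.go.jp/xyz/std/15/{x}/{y}.png"
with urllib.request.urlopen(url) as resp:
    open("tile.png", "wb").write(resp.read())
print(url)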

Coordinate Order in Data: Longitude First, or Latitude First?

On Hacker News Daily I saw "lon lat lon lat", a fun post (by a rather resigned author) cataloging how various specs order longitude and latitude...

The key is the chart in the post summarizing each spec's choice.

Since in Chinese we say 「經緯度」 (literally "longitude-latitude"), the habit in Taiwan is longitude first, i.e. the (lon, lat) form, which happens to match the author's preference; but other parts of the world are used to the opposite order XD

The Hacker News discussion "Lon Lat Lon Lat (macwright.com)" shows that international standards go both ways, so there probably is no single right answer...

As an engineer, just treat this as a pitfall that's easy to hit: whenever you handle this kind of data, make it a habit to check which order the spec defines...
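
For example, GeoJSON (RFC 7946) mandates [lon, lat], while many map APIs take (lat, lon); a cheap range check catches many (though not all) accidental swaps. A tiny sketch:

# GeoJSON (RFC 7946) positions are [lon, lat]; many map APIs want (lat, lon).
taipei_101 = [121.5654, 25.0330]  # GeoJSON order: longitude first

lon, lat = taipei_101        # correct unpacking for GeoJSON
swapped_lat = taipei_101[0]  # reading lon as lat by mistake

def looks_like_lat(value):
    # Latitudes are bounded by [-90, 90]; longitudes go to ±180, so a
    # value outside this range is a strong hint the order was swapped.
    return -90.0 <= value <= 90.0

print(looks_like_lat(lat))          # True
print(looks_like_lat(swapped_lat))  # False -> order was probably swapped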

A German Regional Court Says Embedding Google Fonts Without Informing Users Violates the GDPR

I came across "German Court Rules Websites Embedding Google Fonts Violates GDPR". It's not a final ruling, but it's a start:

A regional court in the German city of Munich has ordered a website operator to pay €100 in damages for transferring a user's personal data — i.e., IP address — to Google via the search giant's Fonts library without the individual's consent.

Since the GDPR treats IP addresses as PII, it looks like any embedded 3rd-party service could be affected; I'll keep following to see how this develops...

Amazon EFS Gets a Replication Feature

Jeff Barr announced on the official blog that Amazon EFS now supports replication: "New – Replication for Amazon Elastic File System (EFS)".

The post shows the cross-region setup screen.

Once set up, the replica is a read-only filesystem.

There is also a fail-over mechanism: after failing over, the replica flips from read-only to read-write.

Note that the architecture is eventually consistent; changes are expected to land within a minute. That's to be expected, otherwise latency would be too high:

All replication traffic stays on the AWS global backbone, and most changes are replicated within a minute, with an overall Recovery Point Objective (RPO) of 15 minutes for most file systems.

Replication also doesn't consume I/O burst credits or count against provisioned throughput, which is a nice touch:

Replication does not consume any burst credits and it does not count against the provisioned throughput of the file system.

The replication feature itself is free; you only pay for the EFS storage used and the bandwidth replication generates:

You pay the usual storage fees for the original and replica file systems and any applicable cross-region or intra-region data transfer charges.
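
Driving this from code should look roughly like the boto3 sketch below; the filesystem ID and regions are placeholders, and this reflects my reading of the EFS CreateReplicationConfiguration API rather than anything in the announcement:

import boto3

efs = boto3.client("efs", region_name="us-east-1")

# Placeholder filesystem ID; replace with your own.
resp = efs.create_replication_configuration(
    SourceFileSystemId="fs-0123456789abcdef0",
    Destinations=[
        {"Region": "us-west-2"},  # EFS creates the read-only replica here
    ],
)
print(resp["Destinations"][0]["Status"])  # e.g. "ENABLING"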

curl's Command Line Tool Is Getting JSON Support Too

Daniel Stenberg announced on the curl mailing list that JSON support is coming: "JSON support". Usage is described in "JSON awareness in the curl tool", and there is also a Hacker News discussion worth skimming: "The time might come when we add some JSON specific command line options (curl.se)".

The discussion immediately brings up HTTPie, which is basically standard kit by now (and is packaged in every major distribution, so it installs straight from the package manager). HTTPie mainly helps with JSON on the POST input side (the Content-Type: application/json case); for HTTP response output, people generally still pipe through jq.

curl, though, went and defined its own way of specifying JSON content...
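
For comparison, what HTTPie automates on the input side is essentially what the json= parameter does in Python's requests; a quick sketch, using httpbin.org purely as an echo endpoint:

import requests

# httpbin echoes the request back, so we can inspect what was sent.
resp = requests.post(
    "https://httpbin.org/post",
    json={"name": "gslin", "active": True},  # serialized to JSON for us
)
# The Content-Type header was set to application/json automatically,
# which is the convenience HTTPie provides and curl is now adding.
print(resp.json()["headers"]["Content-Type"])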