Software – Page 4 – Gea-Suan Lin's BLOG

uv: A Python Packaging Alternative Written in Rust

Several places in the community have mentioned "uv: Python packaging in Rust". The opening of the article states up front that the goal is to be a drop-in replacement for pip:

TL;DR: uv is an extremely fast Python package installer and resolver, written in Rust, and designed as a drop-in replacement for pip and pip-tools workflows.

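Like "Ruff: A Python Linter Written in Rust", this is a project from Astral, whose pitch is using Rust to improve speed.

The first thing that comes to mind is the package resolver: given the dependency constraints declared by each package, it finds a combination of versions that satisfies all of them.

This is a pain point in the package systems of every language, and "Dependency hell is NP-complete" points out that the problem is NP-complete.

That is because 3SAT can be reduced in polynomial time to the package resolution problem (which makes it NP-hard), and a proposed solution can be verified in polynomial time (which puts it in NP), so it is NP-complete.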
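To make the two halves of that argument concrete, here is a minimal JavaScript sketch over a hypothetical toy package index. The package names, the integer versions, and the inclusive [min, max] ranges are all made up for illustration, and this is not how uv or pip represent packages. Checking one candidate assignment is cheap, while the naive search enumerates every combination of versions, a number that grows exponentially with the number of packages.

```javascript
// Hypothetical toy index: package -> version -> { dependency: [min, max] }.
// Versions are plain integers and ranges are inclusive, purely for illustration.
const index = {
  app:  { 1: { lib: [1, 2], util: [2, 3] } },
  lib:  { 1: { util: [1, 1] }, 2: { util: [2, 3] } },
  util: { 1: {}, 2: {}, 3: {} },
};

// Verification: check every constraint of every picked version once.
// This is polynomial in the size of the index.
function isValid(index, pick) {
  return Object.entries(pick).every(([pkg, ver]) =>
    Object.entries(index[pkg][ver]).every(
      ([dep, [min, max]]) => pick[dep] >= min && pick[dep] <= max
    )
  );
}

// Search: enumerate one version per package (a cartesian product).
// The number of candidates is the product of the version counts.
function allAssignments(index) {
  let partials = [{}];
  for (const [pkg, versions] of Object.entries(index)) {
    partials = partials.flatMap((partial) =>
      Object.keys(versions).map((v) => ({ ...partial, [pkg]: Number(v) }))
    );
  }
  return partials;
}

// Brute-force resolver: keep only the assignments that pass verification.
function resolve(index) {
  return allAssignments(index).filter((pick) => isValid(index, pick));
}

console.log(allAssignments(index).length); // number of candidates tried
console.log(resolve(index));               // assignments satisfying every constraint
```

Real resolvers avoid this blow-up in the common case with smarter search and pruning, but the worst case stays hard.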

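But judging from the description, the speedup is not only in this part; it also includes improvements to I/O-type operations.

Besides speed, uv also provides features that make testing easier. For example, when computing compatible versions, the default algorithm tries to install the latest versions, but you can tell it to prefer the oldest versions instead, which is quite useful for compatibility testing: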

But by passing --resolution=lowest, library authors can test their packages against the lowest-compatible version of their dependencies. (This is similar to Go's Minimal version selection.)
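The difference between the two strategies can be sketched as follows. This is a minimal JavaScript illustration, not uv's actual resolver: the `pickVersion` helper, the version list, and the half-open range are hypothetical, and only plain dotted numeric versions are handled.

```javascript
// Compare dotted numeric versions such as "4.2.9" (no pre-release tags handled).
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i += 1) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical helper: pick one version satisfying min <= v < maxExclusive.
function pickVersion(available, { min, maxExclusive }, strategy = 'highest') {
  const ok = available
    .filter((v) => compareVersions(v, min) >= 0 && compareVersions(v, maxExclusive) < 0)
    .sort(compareVersions);
  if (ok.length === 0) return undefined;
  return strategy === 'lowest' ? ok[0] : ok[ok.length - 1];
}

// Made-up version list for a dependency declared as ">=4.0,<5.0".
const available = ['3.2.25', '4.0.0', '4.2.9', '5.0.1'];
const range = { min: '4.0', maxExclusive: '5.0' };

console.log(pickVersion(available, range));           // default: newest match
console.log(pickVersion(available, range, 'lowest')); // oldest match
```

Testing against the oldest match is what catches a declared lower bound that the library no longer actually works with.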

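A tool like this is quite helpful. I remember that when writing Python projects, just pulling in Django plus a few more packages made the package resolver take a fair amount of time, so you can imagine the pain for medium and large projects...

Also, I just went back to look at ruff: it has grown from 500+ rules last April to 700+. After the attention it got at launch, it has probably picked up many of the rules the community commonly uses, so new projects may be able to jump in painlessly. When I tried it last year, I found that some common rules were not supported yet...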

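Performance Improvements in PHP 8.3 Compared to PHP 8.2

While looking things up, I stumbled on the fact that the performance improvement of PHP 8.3 over PHP 8.2 seems to be not small? So far I have seen it mentioned in these two places:

The benchmark numbers in the first article show that the larger and more complex the framework, the bigger the improvement: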

A clean WordPress install goes from 158 rps to 169 rps, an increase of about 7%.
WooCommerce goes from 49 rps to 58 rps, about 18.4%.
Laravel goes from 670 rps to 925 rps, an improvement of 38.1%.
Drupal goes from 941 rps to 1432 rps, an improvement of 52.2%.
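The percentages can be re-derived from the rps figures with a few lines of JavaScript. Only the quoted numbers are used; the `gainPercent` helper name is introduced here for the sketch.

```javascript
// Requests-per-second figures quoted above (PHP 8.2 -> PHP 8.3).
const results = [
  { name: 'WordPress',   before: 158, after: 169 },
  { name: 'WooCommerce', before: 49,  after: 58 },
  { name: 'Laravel',     before: 670, after: 925 },
  { name: 'Drupal',      before: 941, after: 1432 },
];

// Relative gain in percent: (after - before) / before * 100.
function gainPercent(before, after) {
  return ((after - before) / before) * 100;
}

for (const { name, before, after } of results) {
  console.log(`${name}: ${gainPercent(before, after).toFixed(1)}%`);
}
```

Rounded to one decimal place this gives 7.0%, 18.4%, 38.1%, and 52.2%, matching the figures quoted above.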

"Make your app faster with PHP 8.3" mentions that PHP 8.3 improved many performance-related items.

First, improvements to the JIT:

The Just-In-Time (JIT) compiler has been further optimized for better efficiency. The execution of scripts is faster and consumes less CPU time. This is especially beneficial for resource-intensive tasks.

Then improvements on the opcode side:

PHP has refined how it handles opcodes (the instructions in the PHP bytecode). Version 8.3 uses more efficient ways to interpret and execute these opcodes. This reduces the execution time of scripts.

The GC mechanism was improved as well:

PHP 8.3 enhances the garbage collection mechanism, which is responsible for freeing memory occupied by unused objects. This results in more efficient memory usage and can significantly improve performance for memory-intensive applications.

Improvements to arrays:

Other improvements include optimizations for handling arrays and an enhanced type system.

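Complex applications can easily benefit from all of these at once, which adds up to a fairly large improvement...

A Bookmarklet That Expands All GitHub Comments

I saw that Eric Meyer made a bookmarklet that expands GitHub comments: "Bookmarklet: Load All GitHub Comments".

Reading through his code, after a bit of manual formatting, the logic turns out to be quite simple: find the matching buttons and simulate a click event on them: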

javascript:function start() {
  let buttons = document.querySelectorAll('button');
  let loaders = [];
  for (let i = 0; i < buttons.length; i += 1) {
    if (buttons[i].textContent.trim() == 'Load more%E2%80%A6') {
      loaders.push(buttons[i]);
      buttons[i].dispatchEvent(new MouseEvent('click', {
        view: window,
        bubbles: false
      }))
    }
  }
  if (loaders.length > 0) {
    setTimeout(start, 5000)
  }
}
setTimeout(start, 500);
void(20240130);

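Unexpectedly, you can see some idioms that are probably the author's long-standing habits (after all, I had already heard the name Eric Meyer twenty years ago?). These days a for loop used for iteration would probably use the of syntax, and then there is the difference between let and const...

At least it uses querySelectorAll() rather than getElementsByTagName(), otherwise it would feel even more like archaeology...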
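For comparison, here is roughly what the same logic might look like with for...of and const. This is my own sketch rather than Eric Meyer's code: the `findLoaders` and `expandAll` names and the `doc` and `delayMs` parameters are hypothetical, and `button.click()` is swapped in for the hand-built MouseEvent (unlike the original event, a click fired this way bubbles).

```javascript
// Label of the buttons to press; '\u2026' is the ellipsis character that the
// bookmarklet spells as %E2%80%A6 inside its javascript: URL.
const LABEL = 'Load more\u2026';

// Pure helper: keep only the buttons whose trimmed text matches the label.
function findLoaders(buttons) {
  return Array.from(buttons).filter((b) => b.textContent.trim() === LABEL);
}

// Click every matching button, then check again later until none are left.
// `doc` and `delayMs` are parameters introduced for this sketch.
function expandAll(doc = document, delayMs = 5000) {
  const loaders = findLoaders(doc.querySelectorAll('button'));
  for (const button of loaders) {
    button.click(); // simpler stand-in for dispatching a MouseEvent by hand
  }
  if (loaders.length > 0) {
    setTimeout(() => expandAll(doc, delayMs), delayMs);
  }
  return loaders.length;
}

// In a browser console on a GitHub issue page: expandAll();
```

Splitting out `findLoaders` keeps the text matching separate from the DOM access, so it can be tried on plain objects outside a browser.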

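I tested it on "Quadratic time internal base conversions #90716" and it works fine, but I do not need this that often, so I probably will not put it on my bookmark bar...

The author mentions that he considered writing it as a userscript, but it looks like he could not be bothered XD

Ubuntu's Phased Updates

On Ubuntu 22.04, you often run apt upgrade and the system tells you that some packages will not be upgraded: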

$ sudo apt upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  python3-distupgrade ubuntu-release-upgrader-core ubuntu-release-upgrader-gtk
0 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.

In the past, if you were sure you wanted them installed, you would go for dist-upgrade (full-upgrade in apt), but here that does not budge either:

$ sudo apt full-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  python3-distupgrade ubuntu-release-upgrader-core ubuntu-release-upgrader-gtk
0 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.

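This may be PhasedUpdates at work. The design is that in the final rollout stage, an update is not pushed to 100% of machines at once.

apt policy shows that the current percentage is 20% (this output is from after I had already upgraded):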

$ apt policy python3-distupgrade
python3-distupgrade:
  Installed: 1:22.04.18
  Candidate: 1:22.04.18
  Version table:
 *** 1:22.04.18 500 (phased 20%)
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main i386 Packages
        100 /var/lib/dpkg/status
     1:22.04.10 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main i386 Packages

For LTS users, APT support for this feature starts with Ubuntu 22.04. Before that, only the desktop Update Manager supported it, so you would rarely run into it:

Up to Focal (20.04), Update Manager is the only package manager that supports phased updates (reference). Any other update mechanism installs all updates regardless of the Phased-Update-Percentage.

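(Incidentally, the wiki page has a "User stories" section that describes the purpose of this feature in user story format.)

So most people can ignore it, and those who are willing to help with testing can override the Phased Updates percentage through configuration...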
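The idea of rolling an update out to only part of the fleet can be sketched like this. It is a minimal JavaScript illustration, assuming each machine derives a stable number from 0 to 99 out of its machine ID plus the package name and version, and takes the update only when that number falls below the published percentage. The FNV-1a hash, the `bucketFor` and `isInPhase` helpers, and the sample machine IDs are stand-ins chosen for this sketch, not APT's actual implementation.

```javascript
// 32-bit FNV-1a string hash, used here only as a stand-in for whatever
// deterministic function a real client uses.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Stable bucket in [0, 100) for one machine and one package version.
function bucketFor(machineId, pkg, version) {
  return fnv1a(`${pkg}-${version}-${machineId}`) % 100;
}

// A machine takes the update only if its bucket is below the phased percentage.
function isInPhase(machineId, pkg, version, percent) {
  return bucketFor(machineId, pkg, version) < percent;
}

// Made-up machine IDs; the package and version come from the apt policy output.
const machines = Array.from({ length: 1000 }, (_, i) => `machine-${i}`);
const eligible = machines.filter((id) =>
  isInPhase(id, 'python3-distupgrade', '1:22.04.18', 20)
);
console.log(`${eligible.length} of ${machines.length} machines would upgrade at 20%`);
```

Because the bucket depends only on stable inputs, a machine gets the same answer on every run, and raising the percentage only ever adds machines. That would explain why the same packages stay "kept back" until the phase reaches that machine.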

The nginx Fork: freenginx

On Hacker News I saw the news that Maxim Dounin decided to fork off to freenginx: "Freenginx: Core Nginx developer announces fork (nginx.org)". The original post is on the mailing list, "announcing freenginx.org", and it gives the reason for the fork:

Unfortunately, some new non-technical management at F5 recently decided that they know better how to run open source projects. In particular, they decided to interfere with security policy nginx uses for years, ignoring both the policy and developers’ position.

There is more on the freenginx mailing list, in the 2024-February/000007.html post:

The most recent "security advisory" was released despite the fact that the particular bug in the experimental HTTP/3 code is expected to be fixed as a normal bug as per the existing security policy, and all the developers, including me, agree on this.

And, while the particular action isn't exactly very bad, the approach in general is quite problematic.

The security advisory mentioned here is "[nginx-announce] nginx security advisory (CVE-2024-24989, CVE-2024-24990)", and it appears to concern a feature that is not enabled by default:

Two security issues were identified in nginx HTTP/3 implementation,
which might allow an attacker that uses a specially crafted QUIC session
to cause a worker process crash (CVE-2024-24989, CVE-2024-24990) or
might have potential other impact (CVE-2024-24990).

The issues affect nginx compiled with the ngx_http_v3_module (not
compiled by default) if the "quic" option of the "listen" directive
is used in a configuration file.

The issue affects nginx 1.25.0 - 1.25.3.
The issue is fixed in nginx 1.25.4.

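There is some information on the current makeup of the nginx team in id=39373804. nginx currently seems to have just three core devs (Insights/Contributors appears to show only two, because the GitHub mirror looks like it is synced from Mercurial, and Sergey Kandaurov has no GitHub account):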

Worth noting that there are only two active "core" devs, Maxim Dounin (the OP) and Roman Arutyunyan. Maxim is the biggest contributor that is still active. Maxim and Roman account for basically 99% of current development.

So this is a pretty impactful fork. It's not like one of 8 core devs or something. This is 50% of the team.

Edit: Just noticed Sergey Kandaurov isn't listed on GitHub "contributors" because he doesn't have a GitHub account (my bad). So it's more like 33% of the team. Previous releases have been tagged by Maxim, but the latest (today's 1.25.4) was tagged by Sergey

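Right now this is a one-sided account, so it is worth letting things play out a little longer... to see whether F5 responds, and what F5 says (if it does).

ZLUDA: Running CUDA Programs Directly on AMD (Formerly Intel) GPUs

I previously wrote about "ZLUDA: running CUDA programs directly on Intel integrated graphics", and then the story took a big turn: AMD went and sponsored the project, and it now supports AMD GPUs instead: "AMD Quietly Funded A Drop-In CUDA Implementation Built On ROCm: It's Now Open-Source". The project is at vosen/ZLUDA on GitHub, and the AMD GPU support landed in one huge commit, 1b9ba2b2333746c5e2b05a2bf24fa6ec3828dcdf, with this commit log: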

Nobody expects the Red Team

Too many changes to list, but broadly:
* Remove Intel GPU support from the compiler
* Add AMD GPU support to the compiler
* Remove Intel GPU host code
* Add AMD GPU host code
* More device instructions. From 40 to 68
* More host functions. From 48 to 184
* Add proof of concept implementation of OptiX framework
* Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
* Improve ZLUDA launcher for Windows

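The twists in this story, and what came after, are honestly a bit hard to describe... The author was working at Intel at first; after some work on it, Intel decided there was no future in it. Then AMD got in touch and sponsored the project, and eventually also decided there was no future in it. Under the contract with AMD, if AMD saw no future in it, the author could release it as open source: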

Why is this project suddenly back after 3 years? What happened to Intel GPU support?

In 2021 I was contacted by Intel about the development od ZLUDA. I was an Intel employee at the time. While we were building a case for ZLUDA internally, I was asked for a far-reaching discretion: not to advertise the fact that Intel was evaluating ZLUDA and definitely not to make any commits to the public ZLUDA repo. After some deliberation, Intel decided that there is no business case for running CUDA applications on Intel GPUs.

Shortly thereafter I got in contact with AMD and in early 2022 I have left Intel and signed a ZLUDA development contract with AMD. Once again I was asked for a far-reaching discretion: not to advertise the fact that AMD is evaluating ZLUDA and definitely not to make any commits to the public ZLUDA repo. After two years of development and some deliberation, AMD decided that there is no business case for running CUDA applications on AMD GPUs.

One of the terms of my contract with AMD was that if AMD did not find it fit for further development, I could release it. Which brings us to today.

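This is actually easy to understand. CUDA is Nvidia's ecosystem after all; unless you overtake them and then define a pile of your own proprietary features (the way Microsoft played it with IE back in the day), you are just propping up someone else's platform.

Phoronix got the software for testing a few days before it was open sourced, and after those few days of testing rated it "quite good":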

Andrzej Janik reached out and provided access to the new ZLUDA implementation for AMD ROCm to allow me to test it out and benchmark it in advance of today's planned public announcement. I've been testing it out for a few days and it's been a positive experience: CUDA-enabled software indeed running atop ROCm and without any changes. Even proprietary renderers and the like working with this "CUDA on Radeon" implementation.

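Also, because some benchmark software reports results back to a server, which would have leaked information during testing, ZLUDA deliberately reported the device name as "Graphics Device"; with this open source release it will be changed back to the real name: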

In my screenshots and for the past two years of development the exposed device name for Radeon GPUs via CUDA has just been "Graphics Device" rather than the actual AMD Radeon graphics adapter with ROCm. The reason for this has been due to CUDA benchmarks auto-reporting results and other software that may have automated telemetry, to avoid leaking the fact of Radeon GPU use under CUDA, it's been set to the generic "Graphics Device" string. I'm told as part of today's open-sourcing of this ZLUDA on Radeon code that the change will be in place to expose the actual Radeon graphics card string rather than the generic "Graphics Device" concealer.

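The author's own tests show large differences across test items, but by the author's calculation, overall performance is about the same as the OpenCL version.

Phoronix, for its part, ran comparisons against Nvidia... using Blender, which supports both Nvidia and AMD cards. The results are startling: going through ZLUDA's translation is faster than the natively supported path, so the state of that optimization looks like something worth discussing again (this is the BMW27 test; the Classroom test shows the same thing).

Even so, CUDA on AMD GPUs probably still will not take off. The vendor will push for frameworks to support its hardware natively, and most developers build on top of frameworks; few write everything from scratch...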

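Slow Upload Speed in Windows Inside VirtualBox

My computer has two network cards, with the two cables going to the HiNet line I set up myself and to the community network (which also goes out through HiNet, but that is another story).

My desktop's default route goes through my own HiNet line, but I want the VM to use the community network, so I bridged it to that network card and let it get a private IP via DHCP from the router.

I had never noticed until a few days ago, when sending photos over Line was very slow (it had happened before, but I kept forgetting to track it down). After spending some time on it, I found that uploads from the Windows 10 guest in the VM were slow, which showed up in the Speedtest results.

Conclusion first: after cross-testing many combinations, I found that the fix was to change Large Send Offload (IPv4) (that is, LSO) in the network adapter settings from Enabled to Disabled.

Back to the troubleshooting: I first tested with a laptop and with the host and saw no problem, so it looked like something inside the VM, but I was not sure what, since the connection was not actually dropping...

Since download speed was normal and only upload was stuck, my first thought was something MTU-related, so I looked up the command to lower it to 1400 and tested again; same result...

Then I switched the VM's networking to NAT, and upload speed was normal when I tested again...

Next I wanted to try a different network adapter type, but got stuck because I could not find a driver.

I was about to pull out tcpdump, but figured I would first look through the network adapter settings in Windows 10, where I spotted LSO... so I tried turning it off first (call it experience from FreeBSD and Linux?).

As soon as it was off, things were normal. I toggled it two more times to confirm that this setting is what matters, so I am confident the workaround is effective...

After tracking it down myself, I also found a similar problem and the same workaround in "Virtualbox 7.0.12 slow upload speed in any Guest OS".

More than a decade on, LSO is still...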

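The KVM Backend Version of VirtualBox

I saw the discussion at "VirtualBox KVM Public Release (cyberus-technology.de)"; the original post is "VirtualBox KVM public release", and the project is at cyberus-technology/virtualbox-kvm on GitHub.

This addresses a problem you often hit with VirtualBox on Linux: while VirtualBox is in use, you cannot use KVM-based tools such as qemu-kvm at the same time.

It looks like a direct, large modification of VirtualBox rather than an extension or plugin, though the documentation says existing guest OSes can be used as-is.

There is no pre-compiled binary, so you have to build it yourself, and the current version has to be built with the GCC 11 in Ubuntu 22.04; installing the newer GCC 12 causes problems: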

Newer GCC versions (>= 12) might cause build issues.

Also, the main tested platform is currently Intel; AMD "works" but has not been tested in detail:

Currently, Intel x86_64 is the only supported host platform.
AMD will most likely work too but is considered experimental at the moment.

On newer Intel platforms, some things in the Linux kernel need adjusting via boot parameters:

Starting with Intel Tiger Lake (11th Gen Core processors) or newer, split lock detection must be turned off in the host system. This can be achieved using the Linux kernel command line parameter split_lock_detect=off or using the split_lock_mitigate sysctl.

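Seeing --disable-hardening among the build options, hmmm... I will hold off and keep watching for now?

jQuery 4.0.0 Beta (Huh, Really)

I saw jQuery's announcement "jQuery 4.0.0 BETA!" and a feeling of "huh, really" came up...

Note that this version drops IE10 and earlier but still supports IE11; the current plan is to remove that only in jQuery 5.0: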

jQuery 4.0 drops support for IE 10 and older. Some may be asking why we didn’t remove support for IE 11. We plan to removes support in stages, and the next step will happen in jQuery 5.0.

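Another fairly big change is that the source has moved to ES modules, which should make development much easier for the jQuery team (many modern tools can be brought in):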

jQuery source migrated to ES modules

It was a special day when the jQuery source on the main branch was migrated from AMD to ES modules. The jQuery source has always been published with jQuery releases on npm and GitHub, but could not be imported directly as modules without RequireJS, which was jQuery’s build tool of choice. We have since switched to Rollup for packaging jQuery. And we also run tests on the ES modules before packaging them.

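This has the feel of renovating a historic monument...

Jetpack's Backup Feature Stopped Working

Since 1/13, Jetpack has kept emailing me that the backup failed.

I logged into WordPress.com, switched the language to English, and dug in; the error message was: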

We could not back up your site because it appears to be offline
Backup failed

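Searching with that as the keyword turns up "13.0 and backups?". I followed the method there and downgraded Jetpack to 12.9.3, and when I checked the next day the backup was working again...

I will pin the version at 12.9.3 for now and test newer versions when they come out...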