Ruby 3.3 的速度再次提升

在「Ruby 3.3's YJIT Runs Shopify's Production Code 15% Faster」這篇提到了 Ruby 3.3 的速度再次提升的消息。

Shopify 上面的測試,3.3.0-preview2 的速度已經比 3.2.2 (兩者都有開啟 YJIT) 快了 13%,而且 p50/p90/p99 都有對應的改善:

不過有提到一些要注意的點,像是記憶體的用量又會再更高 (本來開 YJIT 的時候就已經有增加了),如果是對記憶體比較敏感的環境,會需要注意這點:

Since Ruby 3.3.0-preview2 YJIT generates more code than Ruby 3.2.2 YJIT, this can result in YJIT having a higher memory overlead. We put a lot of effort into making metadata more space-efficient, but it still uses more memory than Ruby 3.2.2 YJIT. We’re looking into skipping compilation of paths that are less frequently executed.

但 server 端應該是還好 (記憶體給多一點),整體是個可以期待的方向...

Ruby 再引入另外一套 JIT 實做:RJIT

Hacker News Daily 上看到「RJIT #7448」這個,Ruby 上一套新的 JIT 實做。

這次的 RJIT 取代掉先前的 MJIT:

This PR replaces the current implementation of MJIT with a new JIT called "RJIT"

有些特點,其中一個是 RJIT 在 buildtime 與 runtime 都不需要 compiler,這是因為 RJIT 直接用 Ruby 實做:

RJIT uses a pure-Ruby assembler to generate native code

  • MJIT requires a C compiler at runtime. YJIT requires a Rust compiler at build time. RJIT doesn't require them.
  • This means that RJIT's warmup could be slower than YJIT, but it's still much faster than MJIT's.

另外值得注意的是,RJIT 的作者 k0kubunYJIT 的作者 Maxime Chevalier-Boisvert 都是 Shopify 的員工,可以看出 Shopify 對於 Ruby 效能的痛?決定直接自己養人改善效能。

回到 RJIT 這邊跑的測試,可以看到他是用 YJIT 的測試套件測,這也就不會太奇怪了。

跟這次取代掉的 MJIT 相比,RJIT 在 Headline 這包測試都 OK,在 Other 這包則是有來有回,而在 Micro 這包則是有不少項目輸掉 (相比於前兩者):

這樣整體看起來算是有進步,下一版 Ruby 更新應該就會有了。

Ruby 3.2.0 把 YJIT 列為穩定功能了

去年有寫過 RubyYJIT 帶來的效能提昇:「YJIT 帶給 Ruby 大量的效能提昇」。

在這次的 Ruby 3.2.0 發布就把 YJIT 列為穩定功能了:「Ruby 3.2.0」。

  • YJIT is no longer experimental
    • Has been tested on production workloads for over a year and proven to be quite stable.

另外就是支援的平台,看起來是多了 arm64 這邊的支援,所以馬上列表就多了一堆新機器:

  • YJIT now supports both x86-64 and arm64/aarch64 CPUs on Linux, MacOS, BSD and other UNIX platforms.
    • This release brings support for Apple M1/M2, AWS Graviton, Raspberry Pi 4 and more.

另外是每個程式語言幾乎都會遇到的 regexp 類的問題,這次 Ruby 3.2.0 利用 Memoization 的方式降低某些 regexp 的消耗:

# This match takes 10 sec. in Ruby 3.1, and 0.003 sec. in Ruby 3.2
/^a*b?a*$/ =~ "a" * 50000 + "x"

而另外一組 regexp 也可以看出類似的效果:

用一些記憶體空間換取效能,降低被 DoS 的一些機會。另外一方面,引入了 regexp timeout 的 workaround,緩解真的被打的時候的資源消耗上限:

The optimization above cannot be applied to some kind of regular expressions, such as those including advanced features (e.g., back-references or look-around), or with a huge fixed number of repetitions. As a fallback measure, a timeout feature for Regexp matches is also introduced.

YJIT 帶給 Ruby 大量的效能提昇

Hacker News 首頁上看到的消息,由 Shopify 贊助的 YJIT 被 Ruby 官方接受了:「Merge YJIT: an in-process JIT compiler (」。

YJIT currently provides average speedups of 23% over the CRuby interpreter on realistic benchmarks, and near-instant warm-up time.

實做 YJIT 的 Maxime Chevalier-Boisvert 在他自己的 blog 上有提到這次的實做:「YJIT: Building a New JIT Compiler Inside CRuby」,裡面選擇的方法是他的 PhD 論文:「Simple and Effective Type Check Removal through Lazy Basic Block Versioning」。


That being said, according to our benchmarks, we’ve been able to achieve speedups over the CRuby interpreter of 7% on railsbench, 19% on liquid template rendering, and 19% on activerecord.

Currently, only about 50% of instructions in railsbench are executed by YJIT, and the rest run in the interpreter, meaning that there is still a lot we can do to improve upon our current results.

本來的 MJIT 看起來會慢慢淡出...