In more concrete numbers, testing with Geekbench 6 shows that splitting into four emulated NUMA nodes can uplift the single core score of the benchmark by around 6%, and the multi-core by around 18%.
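As a quick sanity check, here is a minimal Python sketch for counting how many NUMA nodes the kernel actually exposes, for example after booting with Linux's NUMA-emulation parameter (numa=fake=4, which requires NUMA emulation support in the kernel). The sysfs path is the standard Linux layout; nothing here is specific to the benchmark above.

```python
# Sketch: count the NUMA nodes the kernel exposes, e.g. after booting
# with the Linux NUMA-emulation boot parameter (numa=fake=4, which
# needs NUMA emulation support compiled into the kernel).
from pathlib import Path

def numa_nodes() -> list[int]:
    """Return the NUMA node IDs visible in sysfs (empty if none)."""
    base = Path("/sys/devices/system/node")
    return sorted(
        int(p.name.removeprefix("node"))
        for p in base.glob("node[0-9]*")
        if p.is_dir()
    )

if __name__ == "__main__":
    nodes = numa_nodes()
    print(f"{len(nodes)} NUMA node(s): {nodes}")
```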
It looks like anything that relies on a C implementation (mostly for speed, or because it needs low-level syscalls) will need some attention.
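On the pure-Python side, it is at least easy to check whether the running interpreter is a free-threaded build and whether the GIL is actually off (CPython 3.13 re-enables it when a C extension without free-threading support gets imported). A minimal sketch, assuming CPython 3.13; note that sys._is_gil_enabled() is a private API and may change:

```python
# Sketch: detect whether this interpreter is a free-threaded (no-GIL)
# build and whether the GIL is actually disabled right now. Importing a
# C extension that has not declared free-threading support can cause
# CPython 3.13 to re-enable the GIL at runtime.
import sys
import sysconfig

def gil_status() -> str:
    if not sysconfig.get_config_var("Py_GIL_DISABLED"):
        return "standard build (GIL always on)"
    # _is_gil_enabled() is a private CPython 3.13 API.
    if getattr(sys, "_is_gil_enabled", lambda: True)():
        return "free-threaded build, but the GIL was re-enabled"
    return "free-threaded build, GIL disabled"

print(gil_status())
```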
It is also worth mentioning that the GIL-free build is currently much slower than standard Python 3 for single-threaded workloads and will need further work, so even once the first version stabilizes it may not be economically worthwhile yet; see what is mentioned in id=40949628:
Right now there is a significant single-threaded performance cost. Somewhere from 30-50%. Part of what my colleague Ken Jin and others are working on is getting back some of that lost performance by applying some optimizations. Expect single-threaded performance to improve for Python 3.14 next year.
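For a rough feel of that gap on a particular machine, one could time the same CPU-bound script under both the standard and the free-threaded interpreter (the latter is typically installed as python3.13t). This is only a minimal sketch with an arbitrary workload, not a proper benchmark:

```python
# Sketch: a trivial CPU-bound timing loop. Running the same script under
# a standard build (python3.13) and a free-threaded build (python3.13t)
# gives a rough feel for the single-threaded overhead mentioned above;
# the workload and iteration count are arbitrary.
import time

def work(n: int = 5_000_000) -> int:
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()
work()
print(f"elapsed: {time.perf_counter() - start:.3f}s")
```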
The author says they don't believe that a lighter version has been shown to reduce engagement.
I, on the other hand, fully believe that.
The recommended lite-youtube-embed project page has a demo of both lite and regular players [0], and the lite version takes noticeably longer to start playing the video.
Every additional millisecond of load time will reduce engagement, and here the difference is more on the order of hundreds of milliseconds or more.
That still irks me. The real problem is not tinygram prevention. It's ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful. Unfortunately by the time I found out about delayed ACKs, I had changed jobs, was out of networking, and doing a product for Autodesk on non-networked PCs.
I then skimmed through John Nagle's Hacker News account, and he seems quite active, commenting on things fairly often...
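As an aside, the usual application-level escape hatch for the interaction Nagle describes is to disable his algorithm on latency-sensitive connections via the standard TCP_NODELAY socket option; a minimal Python sketch:

```python
# Sketch: the common application-side workaround for the Nagle /
# delayed-ACK interaction described above is to disable Nagle's
# algorithm on latency-sensitive sockets with TCP_NODELAY.
import socket

def connect_no_nagle(host: str, port: int) -> socket.socket:
    sock = socket.create_connection((host, port))
    # Disable Nagle's algorithm: send small writes immediately instead
    # of waiting to coalesce them (at the cost of more small packets).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

# Usage: sock = connect_no_nagle("example.com", 80)
```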
To this day, my most public contribution to reddit is that I wrote the code to put the title of the post in the URL. That was done specifically for SEO purposes.
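For illustration, "put the title of the post in the URL" essentially means generating a URL-safe slug from the title. The helper below is a hypothetical sketch of that idea, not Reddit's actual code:

```python
# Sketch: turn a post title into a URL-safe slug. Illustrative only,
# not Reddit's actual implementation.
import re

def slugify(title: str, max_len: int = 50) -> str:
    slug = title.lower()
    slug = re.sub(r"[^a-z0-9]+", "_", slug).strip("_")
    return slug[:max_len] or "untitled"

# e.g. /r/programming/comments/abc123/<slug>/
print(slugify("Ask HN: What's your favorite TCP tuning trick?"))
```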
This was also handled on the Google Webmasters side (now called Google Search Console) specifically for Reddit: the crawl rate was forced to "Custom" so that the Reddit folks couldn't change it:
It was pretty much the only SEO optimization we ever did (along with a few DOM changes), because shortly after that, Google basically dedicated engineering effort specifically to crawling reddit. So much so that we lost the "crawl rate" button in our SEO admin page on Google, it was just set to "Custom".
I had to stand up a fleet of app servers and caches and databases, and change the load balancers so that Google basically had their own infrastructure (although we would shunt all crawlers there). Crawler traffic was very different than regular traffic -- it looked at pages more than two days old, something humans rarely did at the time. It would blow out every cache (memory, database, disk, etc.). So we put them on their own infra to keep them from killing the rest of the site.
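As a rough illustration of the "shunt all crawlers to their own infrastructure" idea, routing can key off the User-Agent. The pool names and patterns below are made up, and in practice this usually lives in the load balancer itself rather than in application code:

```python
# Sketch: pick a backend pool based on the User-Agent so crawler
# traffic lands on separate infrastructure. Pool names and patterns
# are made up for illustration.
import re

CRAWLER_UA = re.compile(r"Googlebot|bingbot|Baiduspider|YandexBot", re.I)

def pick_backend_pool(user_agent: str) -> str:
    return "crawler-pool" if CRAWLER_UA.search(user_agent) else "web-pool"

print(pick_backend_pool("Mozilla/5.0 (compatible; Googlebot/2.1)"))
print(pick_backend_pool("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```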
In both our GRPC and DGS Framework services, GC pauses are a significant source of tail latencies.
That’s particularly true of our GRPC clients and servers, where request cancellations due to timeouts interact with reliability features such as retries, hedging and fallbacks.
ZGC has a fixed overhead of 3% of the heap size, requiring more native memory than G1. Except in a couple of cases, there's been no need to lower the maximum heap size to allow for more headroom, and those were services with greater than average native memory needs.
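To put that fixed overhead into concrete numbers, a quick back-of-the-envelope calculation (the heap sizes here are arbitrary examples, not Netflix's configurations):

```python
# Sketch: what "a fixed overhead of 3% of the heap size" means in
# native-memory terms for a few example heap sizes.
for heap_gib in (8, 32, 64):
    extra = heap_gib * 0.03
    print(f"{heap_gib} GiB heap -> ~{extra:.1f} GiB extra native memory")
```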
The fourth chart shows the difference from Huge Pages; note that the Y axis of this chart does not start at 0:
You can see that after enabling Huge Pages, with RPS (requests per second) unchanged, CPU usage does drop, from roughly 50% to around 45%. The time span of this chart is a bit short, though; a longer window would have been better... But since they chose to highlight it, I'll assume the trend does hold inside Netflix and someone was just a bit lazy when grabbing the screenshot?