AMD Ryzen Threadripper 3990X 在 Windows 上的效能

John Carmack 注意到在 AMD Ryzen Threadripper 3990X 上因為 Windows 的 group limit 限制而造成效能問題:

但這點可以透過打散到兩個 group 改善 (workaround) 而提昇速度:

然後順便看了一下目前 CPU Benchmark 網站上對於高階 CPU 的跑分數據「PassMark - CPU Mark High End CPUs)」,可以看到 AMD 最近真是香噴噴的,用 3950X (16C/32T,105W) 殺 Intel 目前最高分的 W-3275M (28C/56T,205W),然後那個價差:

Intel 的 14nm 牙膏繼續擠...

EC2 要從 Instance 數量限制改成 vCPU 數量限制

這算是 AWS 的保護機制,在 Amazon EC2 上能開的機器數量都是有限制的。

打算要用新的 vCPU 數量限制取代舊的 Instance 數量限制:「Using new vCPU-based On-Demand Instance limits with Amazon EC2」,然後現在可以先加入:「vCPU-based On-Demand Instance Limits are Now Available in Amazon EC2」。

這次改善的問題是,以往 m5.largem5.xlarge 是兩個不同的限制,所以用起來會比較卡,現在則改成用 vCPU 來管理。

這次的架構是改成,一般性的機器會有一個 vCPU 數量限制,其他不同特性的各自有自己的 vCPU 數量限制:

In addition to now measuring usage in number of vCPUs, there will only be five different On-Demand Instance limits—one limit that governs the usage of standard instance families such as A, C, D, H, I, M, R, T, and Z, and one limit per accelerated instance family for FPGA (F), graphic-intensive (G), general purpose GPU (P), and special memory optimized (X) instances.

9/24 可以先手動加入,會拿你現在的量會換算過去,然後 10/24 會全部都轉過去:

During a transition period from September 24, 2019, through October 24, 2019, you can opt in to receive vCPU-based instance limits. When you opt in, EC2 automatically computes your new limits, giving you access to launch at least the same number of instances (if not more) than you do currently. Beginning October 24, 2019, all accounts will switch to vCPU-based instance limits, and the current count-based instance limits will no longer be supported. Although the switchover will not impact your ability to launch EC2 instances, you should familiarize yourself with the new On-Demand Instance limits experience and opt into vCPU limits at a time of your choosing.

應該是會方便一些...

Stripe 遇到 AWS 上 DNS Resolver 的限制

當量夠大就會遇到各種限制...

這次 Stripe 在描述 trouble shooting 的過程:「The secret life of DNS packets: investigating complex networks」。

其中一個頗有趣的架構是他們在每台主機上都有跑 Unbound,然後導去中央的 DNS Resolver,再決定導去 Consul 或是 AWS 的 DNS Resolver:

Unbound runs locally on every host as well as on the DNS servers.

然後他們發現偶而會有大量的 SERVFAIL

接下來就是各種找問題的過程 (像是用 tcpdump 看情況,然後用 iptables 統計一些數字),最後發現是卡在 AWS 的 DNS Resolver 在 60 秒內只回應了 61,385 packets,換算差不多是 1,023 packets/sec,這數字看起來就很雷:

During one of the 60-second collection periods the DNS server sent 257,430 packets to the VPC resolver. The VPC resolver replied back with only 61,385 packets, which averages to 1,023 packets per second. We realized we may be hitting the AWS limit for how much traffic can be sent to a VPC resolver, which is 1,024 packets per second per interface. Our next step was to establish better visibility in our cluster to validate our hypothesis.

在官方文件「Using DNS with Your VPC」這邊看到對應的說明:

Each Amazon EC2 instance limits the number of packets that can be sent to the Amazon-provided DNS server to a maximum of 1024 packets per second per network interface. This limit cannot be increased. The number of DNS queries per second supported by the Amazon-provided DNS server varies by the type of query, the size of response, and the protocol in use. For more information and recommendations for a scalable DNS architecture, see the Hybrid Cloud DNS Solutions for Amazon VPC whitepaper.

iptables 看到的量則是:

找到問題後,後面就是要找方法解決了... 他們給了一個只能算是不會有什麼副作用的 workaround,不過也的確想不到太好的解法。

因為是查詢 10.0.0.0/8 網段反解產生大量的查詢,所以就在各 server 上的 Unbound 上指定這個網段直接問 AWS 的 DNS Resolver,不需要往中央的 DNS Resolver 問,這樣在這個場景就不會遇到 1024 packets/sec 問題了 XDDD

修改 Firefox 的 Pop Up 限制

這是在 Bazqux 使用時遇到的問題,我在 Bazqux 這邊看到想看的文章,會習慣用 c 鍵打開新的 tab,等下再一起看。在 Chrome 上面需要安裝 extension 才能開到背景,在 Firefox 上可以透過修改 about:config 裡的browser.tabs.loadDivertedInBackground,讓他預設開到背景 (雖然變成所有的行為都會到開到背景 tab,但我偏好這樣...)。

但在 Bazqux 用 c 鍵把連結打開到 tab 時,有時發現會跳出遇到開啟上限:

這其實很不方便... 所以故意連打測試發現是 20 個,找了一下資料發現是 dom.popup_maximum 這個值,改成 -1 就好了。這算是某種安全機制吧...

先繼續用看看 :o

Stripe 將 Redis 單機版轉到 Cluster 版本上降低了錯誤率

在「Scaling a High-traffic Rate Limiting Stack With Redis Cluster」這邊提到了 StripeRedis 單機版轉移到 10 個節點的 cluster 版本,然後錯誤率大幅下降:

Stripe’s rate limiters are built on top of Redis, and until recently, they ran on a single very hot instance of Redis. The server had followers in place for failover, but at any given time, one node was handling every operation.

We eventually solved it by migrating to a 10-node Redis Cluster.

另外也可以看出來,在轉移到 cluster 版本後有不少要注意的,像是因為 sharding 而需要調整平衡性。另外是 cluster 模式下寫入的 confirmation 跟一般預期的不太一樣,不過這對於 rate limit 的應用還好,可以接受某種程度的掉資料...

Amazon RDS 支援更大的硬碟空間與更多的 IOPS

Amazon RDS 的升級:「Amazon RDS Now Supports Database Storage Size up to 16TB and Faster Scaling for MySQL, MariaDB, Oracle, and PostgreSQL Engines」。

空間上限從 6TB 變成 16TB,而且可以無痛升。另外 IOPS 上限從 30K 變成 40K:

Starting today, you can create Amazon RDS database instances for MySQL, MariaDB, Oracle, and PostgreSQL database engines with up to 16TB of storage. Existing database instances can also be scaled up to 16TB storage without any downtime.

The new storage limit is an increase from 6TB and is supported for Provisioned IOPS and General Purpose SSD storage types. You can also provision up to 40,000 IOPS for Provisioned IOPS storage volumes, an increase from 30,000 IOPS.

不過隔壁的 Amazon Aurora 還是大很多啊 (64TB),而且實際上不用管劃多大,他會自己長大:

Q: What are the minimum and maximum storage limits of an Amazon Aurora database?

The minimum storage is 10GB. Based on your database usage, your Amazon Aurora storage will automatically grow, up to 64 TB, in 10GB increments with no impact to database performance. There is no need to provision storage in advance.

用 4.5+ 的 Linux Kernel 限制 I/O 速度

在「Using cgroups to limit I/O」這邊看到作者試著用 cgroups 限制 I/O 速度。

作者前面花了不少篇幅解釋 cgroups v1 無法正確限制 I/O 速度,後面就在講 cgroups v2 怎麼做:

So, in order to limit I/O when this I/O may hit the writeback kernel cache, we need to use both memory and io controllers in the cgroups v2!

這會需要 4.5+ 的 kernel,可能會需要手動更新,或是直接使用比較新的 distribution:

Since kernel 4.5, the cgroups v2 implementation was marked non-experimental.

然後照抄就可以了 (不過這邊的指定都需要 root,作者用 $ 表示 shell 有點怪):

# mount -t cgroup2 nodev /cgroup2
# mkdir /cgroup2/cg2
# echo "+io" > /cgroup2/cgroup.subtree_control
# echo "8:0 wbps=1048576" > io.max
# echo $$ > /cgroup2/cg2/cgroup.procs

然後就可以跑 dd 測試速度了,同時間也可以跑 iostat 看。

Twitter 打算放寬到 280 字...

Twitter 打算放寬 140 字限制:「Giving you more characters to express yourself」。

不過不包括日文、中文與韓文 XD

We want every person around the world to easily express themselves on Twitter, so we're doing something new: we're going to try out a longer limit, 280 characters, in languages impacted by cramming (which is all except Japanese, Chinese, and Korean).

然後也拿日文與英文當範例:

然後做了比較:

GitHub Apps (前身 GitHub Integrations) 的 Rate Limiting 變得更彈性

GitHub 宣佈了把 GitHub Integrations 改名為 GitHub Apps,另外 Rate Limiting 變得更彈性:「GitHub Apps (formerly Integrations) General Release」。

All GitHub Apps start with a rate limit of 5000 requests per hour. To facilitate growth we have added a dynamic rate limit for installations: organization installations with more than 20 users receive another 50 requests per hour for each user. Installations that have more than 20 repositories receive another 50 requests per hour for each repository.

所以對於超過 20 個使用者以及超過 20 個 repository 的 organization 都會增加 Rate Limiting。每個單位增加 50 requests/hour 不算太多,不過想一下好像也不少... (尤其對大一點的團體來說)