Linux Kernel 4.20 修正了一卡車 Intel CPU bug,然後效能掉光了...

看到「Bisected: The Unfortunate Reason Linux 4.20 Is Running Slower」這篇測試了目前還在 RC 的 4.20.0,可以看到 AMD 的效能沒有太大影響,但 Intel i9 的效能掉了很嚴重:

從說明可以看到有測出 30%~50%:

This ranged from Rodinia scientific OpenMP tests taking 30% longer to Java-based DaCapo tests taking up to ~50% more time to complete to code compilation tests taking measurably longer to lower PostgreSQL database server performance to longer Blender3D rendering times.

另外在其他 Intel CPU 上測試也發現不是只有 i9 有影響,低階的機器也是:

Those affected systems weren't high-end HEDT boxes but included a low-end Core i3 7100 as well as a Xeon E5 v3 and Core i7 systems.

透過 bisect 有找到是哪個 commit 造成的:

That change is "STIBP" for cross-hyperthread Spectre mitigation on Intel processors. STIBP is the Single Thread Indirect Branch Predictors (STIBP) allows for preventing cross-hyperthread control of decisions that are made by indirect branch predictors.

但這又是屬於 security patch,不太能關... 加上自從 MeltdownSpectre 後,讓安全研究人員發現了全新的天地,之後應該只會愈來愈慘 :o

Amazon Elasticsearch 支援 I3 instance (i.e. 1.5 PB Disk) 了

Amazon Elasticsearch 支援 I3 instance 了:「Run Petabyte-Scale Clusters on Amazon Elasticsearch Service Using I3 instances」。

Amazon Elasticsearch Service now supports I3 instances, allowing you to store up to 1.5 petabytes of data in a single Elasticsearch cluster for large log analytics workloads.

i3.16xlarge 單台是 15.2 TB 的硬碟空間,100 台就會是 1.5 PB,不知道跑起來會多慢 XDDD

Amazon Elasticsearch Service – Amazon Web Services (AWS) | FAQs 這邊還沒修正 XD:

You can request a service limit increase up to 100 instances per domain by creating a case with the AWS Support Center. With 100 instances, you can allocate about 150 TB of EBS storage to a single domain.

Percona 分析在 AWS 上跑 Percona XtraDB Cluster 的效能 (I/O bound)

Percona 的人分析了在 Amazon EC2 上跑 Percona XtraDB Cluster (PXC) 效能 (I/O bound):「Best Practices for Percona XtraDB Cluster on AWS」。

先看他們做出來的圖:

直接跳到結論的地方。如果資料可以掉,用 i3 本地 storage 的效能是最好的,如果要資料不能掉,用 EBS 的 Provisioned IOPS SSD (io1) 的效能會比 General Purpose (gp2) 好很多。

另外 instance type 的選擇上,避免用 {i3,r4}.large,因為測試出來發現 {i3,r4}.xlarge 的效能好不只一倍。

不過 Aurora 的 Multi-master 已經在 Preview 了啊,如果 Percona 的人拿到帳號的話,應該會有單位成本的效能比較可以看...

Amazon EC2 推出第一款 Bare Metal 的 Instance

Amazon EC2 直接租整台主機出來了:「Amazon EC2 Bare Metal Instances with Direct Access to Hardware」。

Bare Metal 怎麼翻譯比較好啊?雖然知道是拔掉虛擬化的主機... 裸奔機?

We knew that other customers also had interesting use cases for bare metal hardware and didn’t want to take the performance hit of nested virtualization. They wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.

反正這種機器就是要壓榨整台機器的效能,所以不會拿小台機器出來給大家玩。這次推出的是 i3 系列,叫做 i3.metal

Today we are launching a public preview the i3.metal instance, the first in a series of EC2 instances that offer the best of both worlds, allowing the operating system to run directly on the underlying hardware while still providing access to all of the benefits of the cloud. The instance gives you direct access to the processor and other hardware, and has the following specifications:

Processing – Two Intel Xeon E5-2686 v4 processors running at 2.3 GHz, with a total of 36 hyperthreaded cores (72 logical processors).
Memory – 512 GiB.
Storage – 15.2 terabytes of local, SSD-based NVMe storage.
Network – 25 Gbps of ENA-based enhanced networking.

走了十年總算走到這塊了... 不過應該花了不少時間解決各種安全性的問題,像是 network isolation 以及反刷韌體的問題 XD

Amazon EC2 推出 I3 系列機器

Amazon EC2 推出使用 NVMe SSD 的機器,I3 系列:「Now Available – I3 Instances for Demanding, I/O Intensive Applications」。

以東京區的價錢來看,r4.16xlarge 與 i3.16xlarge 都是 64 vCPU 與 488GB RAM。不一樣的地方只有兩個:

  • 第一個是 r4 只有 195 vCPU,而 i3 有 200 vCPU,快了一些。
  • 第二個是 i3 多了 8 個 1900 NVMe SSD。

但價錢卻只差一些 ($5.12/hr 與 $5.856/hr),如果速度可以善用 SSD 的話,跟 r4.* 比起來其實頗超值的...

Amazon EC2 的大量新資訊

這次 re:InventAmazon EC2 的更新真是有夠多的,總集篇被整理在這邊:「EC2 Instance Type Update – T2, R4, F1, Elastic GPUs, I3, C5」。

先是 F1 系列提供 FPGA 能力:「Developer Preview – EC2 Instances (F1) with Programmable Hardware」。

再來是 T2 系列提供更大台的機器,不過往上提供的 CPU 級距還是 1.5 倍 (費用是 2 倍),主要還是給量還不夠大的使用者使用,如果夠大的就應該換去 C 系列,加上 auto scaling 的方式降低成本:「New T2.Xlarge and T2.2Xlarge Instances」。

另外一個大賣點是 GPU 變成可以掛在各種機器上,雖然還沒推出:「In the Works – Amazon EC2 Elastic GPUs」:

Today, you have the ability to set up freshly created EBS volumes when you launch new instances. You’ll be able to do something similar with Elastic GPUs, specifying the desired size during the launch process, with the option to stop, modify, and then start a running instance in order to make a change.

最後是推出了 R4、I3 與 C5 系列,主要是在於硬體升級而更新。