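Brute-forcing I/O with SSDs

The post "Achieving 11M IOPS & 66 GB/s IO on a Single ThreadRipper Workstation" uses PCIe 4.0 on an AMD platform to brute-force 11M IOPS of 4K random reads, and it reaches 66GB/sec on large-block reads. That second figure should be about enough to saturate DDR4 memory bandwidth...

On the hardware side, the author built the system from 8*1TB + 2*500GB M.2 SSDs, plugged into adapter cards.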

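He doesn't seem to mention what the machine cost (though every part can be looked up). A rough tally says the storage is actually not that expensive: for the Samsung 980 Pro PCIe 4.0 M.2 SSDs, the 1TB model is USD$160 each and the 500GB model is USD$120 each, while the CPU and memory cost quite a bit more... The whole machine can probably be done for around USD$10000?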
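A quick tally of the SSD portion, as a minimal sketch. The counts and unit prices are the ones quoted in this post (not current prices), and the names `drives` and `storage_cost` are made up for illustration.

```python
# Drive counts and unit prices (USD) as quoted in the post.
drives = {
    "Samsung 980 Pro 1TB": (8, 160),
    "Samsung 980 Pro 500GB": (2, 120),
}

def storage_cost(drives):
    """Sum count * unit price over all drive models."""
    return sum(count * price for count, price in drives.values())

total = storage_cost(drives)
print(f"SSD total: USD${total}")  # 8*160 + 2*120 = 1520
```

So the drives come to about USD$1520, a small slice of the guessed USD$10000 budget.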

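Feels like it would be great for people who edit video? At least the read/write side should be very smooth...

Lots of news on Amazon EBS io2...

Another new thing from Amazon EBS is a set of improvements to io2:

The first two announcements can be read together. The main news is EBS Block Express, which brings a performance boost: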

Built on our new EBS Block Express architecture that takes advantage of some advanced communication protocols implemented as part of the AWS Nitro System, the volumes will give you up to 256K IOPS & 4000 MBps of throughput and a maximum volume size of 64 TiB, all with sub-millisecond, low-variance I/O latency. Throughput scales proportionally at 0.256 MB/second per provisioned IOPS, up to a maximum of 4000 MBps per volume. You can provision 1000 IOPS per GiB of storage, twice as many as before. The increased volume size & higher throughput means that you will no longer need to stripe multiple EBS volumes together, reducing complexity and management overhead.
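The scaling rules in that announcement can be sketched as two small functions. This is only an illustration of the quoted numbers (0.256 MB/s per provisioned IOPS up to 4000 MBps, 1000 IOPS per GiB up to 256K IOPS); the function names are hypothetical and nothing here calls an AWS API.

```python
# Limits quoted in the io2 Block Express preview announcement.
MB_PER_SEC_PER_IOPS = 0.256
MAX_THROUGHPUT_MBPS = 4000
IOPS_PER_GIB = 1000
MAX_IOPS = 256_000

def max_provisionable_iops(size_gib):
    """IOPS ceiling for a volume of the given size."""
    return min(size_gib * IOPS_PER_GIB, MAX_IOPS)

def throughput_mbps(provisioned_iops):
    """Throughput grows with provisioned IOPS until it hits the cap."""
    return min(provisioned_iops * MB_PER_SEC_PER_IOPS, MAX_THROUGHPUT_MBPS)

for iops in (1_000, 15_625, 256_000):
    print(iops, "IOPS ->", throughput_mbps(iops), "MB/s")
```

Under these numbers the throughput cap is already reached at 15,625 provisioned IOPS (4000 / 0.256), and a 256 GiB volume is large enough to provision the full 256K IOPS.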

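Since it's still in preview, anyone who wants to use it has to apply for access. Note that the supported regions are limited for now (unlike gp3, which launched in all regions this time), and it has to be paired with r5b instances: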

The preview is currently available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Tokyo), and Europe (Frankfurt) Regions. During the preview, we support the use of R5b instances, with support for other Nitro-powered instances in the works.

The third announcement is about a discount on io2 IOPS: the portion provisioned above 32K IOPS is 30% cheaper:

Now, with the new tiered pricing structure, the first 32,000 IOPS provisioned on a volume are charged at the current base rate ($0.065 per provisioned IOPS-mo) and the second tier between 32,001 and 64,000 is charged at a 30% lower rate ($0.046 per provisioned IOPS-mo).

For the preview version mentioned above (EBS Block Express), which can go beyond 64K IOPS, that portion is cheaper still, with another 30% discount stacked on top:

Furthermore, for customers who have even higher performance requirement than currently supported by a single io2 volume today, we are previewing io2 volumes that run on EBS Block Express, the next generation of our block storage architecture. io2 Block Express volumes can be provisioned to deliver peak IOPS of 256,000. For these volume, any IOPS provisioned over 64,000 IOPS will be charged at a further 30% lower rate than the second tier ($0.032 per provisioned IOP-mo for IOPS over 64,000). This lowers the effective rate to $0.038 per provisioned IOPS on a volume provisioned with 256,000 IOPS.
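The tiered pricing can be checked with a short calculation. This is a sketch using the three rates quoted in the announcements (which are themselves rounded; 30% off $0.065 is $0.0455), and `TIERS` and `monthly_iops_cost` are names invented for the example.

```python
# (upper bound of tier in IOPS, USD per provisioned IOPS-month), as quoted.
TIERS = [
    (32_000, 0.065),
    (64_000, 0.046),
    (256_000, 0.032),
]

def monthly_iops_cost(provisioned_iops):
    """Charge each slice of provisioned IOPS at its own tier's rate."""
    cost = 0.0
    lower = 0
    for upper, rate in TIERS:
        if provisioned_iops <= lower:
            break
        billable = min(provisioned_iops, upper) - lower
        cost += billable * rate
        lower = upper
    return cost

total = monthly_iops_cost(256_000)
print(f"total: ${total:.0f}/month, effective: ${total / 256_000:.3f} per IOPS")
```

For 256,000 IOPS this gives 2080 + 1472 + 6144 = USD$9696 per month, or about $0.038 per provisioned IOPS, which matches the effective rate in the quote.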

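This is for people chasing performance. For everyday use I'd still expect to be on gp2 or gp3 SSD...

Amazon EBS launches gp3

This year's AWS re:Invent has started again, but because of the pandemic it's mostly online this time... Here's a roundup of the Amazon EBS updates first.

First up is the new gp3 type, also SSD-backed: "New – Amazon EBS gp3 Volume Lets You Provision Performance Apart From Capacity".

The per-GB cost is 20% lower than gp2: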

Today I would like to tell you about gp3, a new type of SSD EBS volume that lets you provision performance independent of storage capacity, and offers a 20% lower price than existing gp2 volume types.

And you get 3000 IOPS and 125MB/sec out of the box; if you need more, you can pay for extra:

gp3 is designed to provide predictable 3,000 IOPS baseline performance and 125 MiB/s regardless of volume size. It is ideal for applications that require high performance at a low cost such as MySQL, Cassandra, virtual desktops and Hadoop analytics. Customers looking for higher performance can scale up to 16,000 IOPS and 1,000 MiB/s for an additional fee. The top performance of gp3 is 4 times faster than max throughput of gp2 volumes.

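But note from the table in "Amazon EBS volume types" that gp2's burstable throughput (250MB/sec) is higher than gp3's baseline (125MB/sec).

Because of that, EBS volumes that mostly see random access, like /data, can be switched over, but volumes that do a lot of sequential access may not be a good fit unless you pay for extra throughput.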

On the IOPS side, moving a gp2 volume under 1TB should be mostly fine, because gp2 gives 3 IOPS per GB, so every gp2 volume under 1TB has a baseline below 3000 IOPS.
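That comparison can be written down as a rough sketch. It assumes the gp2 rule quoted elsewhere on this page (3 IOPS per GiB, minimum 100, maximum 16,000) and the gp3 baseline of 3,000 IOPS, and it looks only at baseline IOPS, not at burst behaviour or throughput. The function names are hypothetical.

```python
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP3_BASELINE_IOPS = 3_000

def gp2_baseline_iops(size_gib):
    """gp2 baseline scales linearly with size, clamped to [100, 16000]."""
    return max(GP2_MIN_IOPS, min(size_gib * GP2_IOPS_PER_GIB, GP2_MAX_IOPS))

def gp3_covers_gp2_baseline(size_gib):
    """True if gp3's free baseline is at least the gp2 baseline at this size."""
    return gp2_baseline_iops(size_gib) <= GP3_BASELINE_IOPS

for size in (20, 500, 1000, 2000, 6000):
    print(size, "GiB:", gp2_baseline_iops(size), "IOPS,",
          "gp3 baseline is enough" if gp3_covers_gp2_baseline(size) else "needs extra gp3 IOPS")
```

Volumes above 1000 GiB would need additional provisioned IOPS on gp3 to keep the same baseline.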

For the migration, you can move a volume to gp3 directly in the AWS console:

If you’re currently using gp2, you can easily migrate your EBS volumes to gp3 using Amazon EBS Elastic Volumes, an existing feature of Amazon EBS. Elastic Volumes allows you to modify the volume type, IOPS, and throughput of your existing EBS volumes without interrupting your Amazon EC2 instances.

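Like this:

But according to the table in "Amazon EBS volume types", gp3 can be used as a boot volume, yet I couldn't change mine over XDDD

Update: I just noticed the documentation has been corrected, and it now looks like gp3 can't be used as a boot volume...

Not sure which side got it wrong. I'll check again in a few days XDDD

Amazon EBS launches the new io2 type

Provisioned IOPS io1, the Amazon EBS version with guaranteed I/O performance, has evolved into Provisioned IOPS io2: "New EBS Volume Type (io2) – 100x Higher Durability and 10x More IOPS/GiB".

I checked the prices for the US and Singapore regions, and they look unchanged, so this round is mainly about feature improvements. There are two notable changes.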

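The first is that the number of IOPS you can provision per GB has gone up: io1 allows 50 IOPS per GB, and io2 allows 500. The maximum is unchanged at 64k IOPS. This gives people chasing IOPS a lot more flexibility, but the cost difference works out to be minor. At the 64k IOPS maximum, the IOPS charge alone is USD$4160/month, so the minimum storage you have to rent, 1280 GB on io1 (USD$160/month) versus 128 GB on io2 (USD$16/month), is small change by comparison...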
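The arithmetic behind those numbers, as a sketch. The IOPS rate is the $0.065 per provisioned IOPS-month quoted elsewhere on this page, and the storage rate of USD$0.125 per GB-month is an assumption implied by the figures above (1280 GB for USD$160); neither is a current price list. The function names are made up.

```python
# Rates taken from, or implied by, the figures in this post (assumptions).
IOPS_RATE_USD = 0.065      # per provisioned IOPS-month
STORAGE_RATE_USD = 0.125   # per GB-month, implied by 1280 GB -> USD$160

def min_size_gb(target_iops, max_iops_per_gb):
    """Smallest volume (GB) that is allowed to carry target_iops."""
    return -(-target_iops // max_iops_per_gb)  # integer ceiling division

def monthly_breakdown(target_iops, max_iops_per_gb):
    """Return (size_gb, iops_cost, storage_cost) in USD per month."""
    size_gb = min_size_gb(target_iops, max_iops_per_gb)
    return size_gb, target_iops * IOPS_RATE_USD, size_gb * STORAGE_RATE_USD

for name, ratio in (("io1", 50), ("io2", 500)):
    size_gb, iops_cost, storage_cost = monthly_breakdown(64_000, ratio)
    print(f"{name}: {size_gb} GB, IOPS ${iops_cost:.0f}, storage ${storage_cost:.0f}")
```

Either way the IOPS charge dominates: the storage saving from io2's higher ratio is USD$144 against a USD$4160 IOPS bill.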

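The second is durability, which goes from 99.9% on io1 to 99.999% on io2. This is probably mostly thanks to SSD advances over the past decade. My guess is that the real-world durability of io1 has also risen a lot, and only the SLA was never raised...

I'll probably stick with gp2: cheap and generous, with weaker performance guarantees, but good enough for general-purpose applications.

AWS raises the minimum throughput of Amazon EFS

AWS announced a higher minimum throughput for Amazon EFS: "Amazon Elastic File System increases file system minimum throughput".

The numbers in the first paragraph are pretty much the whole story: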

Amazon Elastic File System (Amazon EFS) file systems using the default bursting throughput mode now have a minimum throughput of 1 MiB/s. All EFS bursting mode file systems (regardless of size) can drive 100 MiB/s of throughput, and file systems with more than 1TiB of Standard class storage can drive 100 MiB/s per TB when burst credits are available. This change increases the minimum throughput from 50KiB/s per GiB of Standard class storage to a fixed minimum of 1 MiB/s for file systems with less than 20 GiB of Standard class storage, when burst credits are exhausted.

The guaranteed minimum used to be 50KB/sec per GB, meaning you had to store 20GB to get 1MB/sec. Now anyone under 20GB gets a fixed 1MB/sec.
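A small sketch of the before and after, based on my reading of the announcement: the old minimum is 50 KiB/s per GiB stored, and the new minimum is that same value with a floor of 1 MiB/s (1024 KiB/s). The function names are hypothetical.

```python
KIB_PER_SEC_PER_GIB = 50
FLOOR_KIB_PER_SEC = 1024  # 1 MiB/s

def old_min_throughput_kib(size_gib):
    """Minimum throughput before the change, once burst credits run out."""
    return size_gib * KIB_PER_SEC_PER_GIB

def new_min_throughput_kib(size_gib):
    """Same rule, but never below the 1 MiB/s floor."""
    return max(FLOOR_KIB_PER_SEC, old_min_throughput_kib(size_gib))

for size in (1, 5, 20, 100):
    print(size, "GiB:", old_min_throughput_kib(size), "->",
          new_min_throughput_kib(size), "KiB/s")
```

A 1 GiB file system goes from 50 KiB/s to 1024 KiB/s, while anything well above 20 GiB is unaffected.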

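This makes things a bit easier for people just starting out, though the main appeal of EFS is still sharing across machines. For performance, a locally attached EBS volume is far better (because the OS can cache).

I once put /home on EFS on AWS, and because all I/O had to go over the network, compiling under pyenv was painfully slow. Later I found a day to move everything back to EBS, and it got much faster...

A story of buying IOPS with storage on AWS...

"A web performance issue" describes a performance problem in one of Mozilla's systems, along with the troubleshooting and the fix that followed.

The system runs on AWS, and after a series of checks they found that the EBS volume behind RDS had run out of IOPS: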

After reading a lot of documentation about Amazon’s RDS set-up I determined that slow downs in the database were related to IOPS spikes. Amazon gives you 3 IOPS per Gb and with a storage of 1 Terabyte we had 3,000 IOPS as our baseline. The graph below shows that at times we would get above that max baseline.

Everyone lands on roughly the same fix: Provisioned IOPS is too expensive, so you just grow the volume to get more IOPS (since General SSD gives 3 IOPS per GB):

To increase the IOPS baseline we could either increase the storage size or switch from General SSD to Provisioned IOPS storage. The cost of the different storage type was much higher so we decided to double our storage, thus, doubling our IOPS baseline. You can see in the graph below that we’re constantly above our previous baseline. This change helped Treeherder’s performance a lot.

Then set up an alert, so next time you can scale up ahead of time:

In order to prevent getting into such a state in the future, I also created a CloudWatch alert. We would get alerted if the combined IOPS is greater than 5,700 IOPS for 6 datapoints within 10 minutes.
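The "6 datapoints within 10 minutes" rule is an M-out-of-N check. Below is a plain-Python sketch of that logic, not the CloudWatch API; it assumes one datapoint per minute (so 10 datapoints cover 10 minutes), and `should_alarm` is a made-up name.

```python
def should_alarm(datapoints, threshold=5700, needed=6, window=10):
    """Alarm when at least `needed` of the last `window` datapoints
    are strictly greater than `threshold`.

    `datapoints` is a list of combined IOPS readings, oldest first,
    assumed to be one reading per minute.
    """
    recent = datapoints[-window:]
    breaching = sum(1 for value in recent if value > threshold)
    return breaching >= needed

print(should_alarm([6000] * 6 + [1000] * 4))  # six breaching datapoints
print(should_alarm([6000] * 5 + [1000] * 5))  # only five
```

With storage doubled from the 3,000 IOPS baseline in the quote, the new baseline would be 6,000 IOPS, so 5,700 is 95% of it.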

General SSD IOPS aren't 100% guaranteed, though. This is all the documentation says:

AWS designs gp2 volumes to deliver 90% of the provisioned performance 99% of the time.

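That should be enough in most cases...

Amazon EBS (gp2) gets a performance boost...

AWS announced a performance increase for Amazon EBS (gp2): "Amazon EBS Increases Performance of General Purpose SSD (gp2) Volumes".

The cap used to be 10k IOPS and is now 16k IOPS. Maximum throughput also goes from 160 MB/sec to 250 MB/sec: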

Today we are announcing a 60% improvement in performance of General Purpose SSD (gp2) Volumes from 10,000 IOPS to 16,000 IOPS and from 160 MB/s to 250 MB/s of throughput per volume.

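It presumably still follows the 3 IOPS per GB design, but this is good news for anyone stacking up performance on gp2, since a bigger volume can now yield more IOPS... (as in the post "Percona took AWS's advice and retested Percona XtraDB Cluster performance on gp2...")

Percona took AWS's advice and retested Percona XtraDB Cluster performance on gp2...

Late last year, Percona tested Percona XtraDB Cluster performance on AWS and gave some recommendations, particularly on which EBS type to use underneath. See my earlier post, "Percona's analysis of running Percona XtraDB Cluster on AWS (I/O bound)".

The recommendation back then was io1: more expensive, but with better performance.

Later, AWS engineers suggested another approach to Percona, one that gets similar performance out of gp2 at a much lower cost than io1: "Percona XtraDB Cluster on Amazon GP2 Volumes".

The approach relies on gp2 deriving the available IOPS from the volume size. The official documentation describes gp2 performance (IOPS) like this: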

General Purpose SSD (gp2) volumes offer cost-effective storage that is ideal for a broad range of workloads. These volumes deliver single-digit millisecond latencies and the ability to burst to 3,000 IOPS for extended periods of time. Between a minimum of 100 IOPS (at 33.33 GiB and below) and a maximum of 10,000 IOPS (at 3,334 GiB and above), baseline performance scales linearly at 3 IOPS per GiB of volume size. AWS designs gp2 volumes to deliver the provisioned performance 99% of the time. A gp2 volume can range in size from 1 GiB to 16 TiB.

Under that rule, 10000 IOPS requires a volume of 3.3TB or more, so the AWS engineers suggested that Percona simply size the volumes up and retest:

After publishing our material, Amazon engineers pointed that we should try GP2 volumes with the size allocated to provide 10000 IOPS. If we allocated volumes with size 3.3 TiB or more, we should achieve 10000 IOPS.
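The sizing is just the 3 IOPS per GiB rule run backwards. A minimal sketch, with a hypothetical function name:

```python
def min_gp2_size_gib(target_iops, iops_per_gib=3):
    """Smallest gp2 volume (GiB) whose baseline reaches target_iops."""
    return -(-target_iops // iops_per_gib)  # integer ceiling division

print(min_gp2_size_gib(10_000))  # 3334
```

That is the 3,334 GiB figure in the documentation quoted above. Applying the same rule to the later 16,000 IOPS cap gives 5,334 GiB.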

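First, the measured performance, which shows no big difference:

Next, the storage cost comparison, where gp2 comes to roughly half the price of the io1 setup:

As the documentation quoted above says, gp2 doesn't fully guarantee performance, though by the stated statistics it delivers 3 IOPS/GB most of the time. io1, on the other hand, guarantees performance, so you don't have to worry much about it being unstable. That difference shows up as a sizable gap in cost. Designing systems around it should help overall cost quite a bit... (though not necessarily latency, especially numbers like P99)

It's another option for everyone to consider...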

AWS raises the performance ceiling of Amazon EBS

AWS announced higher performance limits for Amazon EBS (this is about Provisioned IOPS SSD, codenamed io1): "Amazon EBS Improves Performance for io1 Volumes".

Per-volume IOPS goes from 20K to 32K, and throughput goes from 320MB/sec to 500MB/sec:

Today we are announcing an improvement in performance of Provisioned IOPS SSD (io1) Volumes from 20,000 IOPS to 32,000 IOPS and from 320 MB/s to 500 MB/s of throughput per volume.

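Probably driven by advances in technology XD

Amazon RDS supports more storage and more IOPS

An Amazon RDS upgrade: "Amazon RDS Now Supports Database Storage Size up to 16TB and Faster Scaling for MySQL, MariaDB, Oracle, and PostgreSQL Engines".

The storage limit goes from 6TB to 16TB, and existing instances can be scaled up without downtime. The IOPS limit also goes from 30K to 40K: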

Starting today, you can create Amazon RDS database instances for MySQL, MariaDB, Oracle, and PostgreSQL database engines with up to 16TB of storage. Existing database instances can also be scaled up to 16TB storage without any downtime.

The new storage limit is an increase from 6TB and is supported for Provisioned IOPS and General Purpose SSD storage types. You can also provision up to 40,000 IOPS for Provisioned IOPS storage volumes, an increase from 30,000 IOPS.

Still, Amazon Aurora next door is much bigger (64TB), and in practice you don't need to decide how much to allocate, because it grows by itself:

Q: What are the minimum and maximum storage limits of an Amazon Aurora database?

The minimum storage is 10GB. Based on your database usage, your Amazon Aurora storage will automatically grow, up to 64 TB, in 10GB increments with no impact to database performance. There is no need to provision storage in advance.