Amazon EFS raises prices and adds storage for even "colder" data: Amazon EFS Archive

This time Amazon EFS is adding one more storage class: "Optimize your storage costs for rarely-accessed files with Amazon EFS Archive".

IA was introduced earlier, probably in 2019 (see "IA Storage Class for Amazon EFS"). Archive is the new storage class: cheaper to store, but more expensive to access.

Going by us-east-1 prices, storing data in Archive costs half as much as IA:

Standard (GB-Month)	$0.30
Infrequent Access (GB-Month)	$0.016
Archive (GB-Month)	$0.008
Backup - Warm / Cold (GB-Month)	$0.05 / $0.01

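The read surcharge, on the other hand, is three times that of IA (Tiering here refers to the automated service that moves data between storage classes):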

All storage classes - Reads (per GB transferred)	$0.03
All storage classes - Writes (per GB transferred)	$0.06
Infrequent Access - Reads (incremental charge per GB transferred)	$0.01
Infrequent Access - Tiering (per GB transferred)*	$0.01
Archive - Reads (incremental charge per GB transferred)	$0.03
Archive - Tiering (per GB transferred)*	$0.03
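
As a rough sketch of what the two tables above imply, the following Python compares the monthly cost of keeping data in IA versus Archive for a given read volume. The prices are the us-east-1 figures quoted above; the function names are made up for illustration, and the Tiering charges are deliberately left out to keep the comparison simple.

```python
# us-east-1 prices quoted in the tables above (USD).
STORAGE_PER_GB_MONTH = {"ia": 0.016, "archive": 0.008}
BASE_READ_PER_GB = 0.03  # "All storage classes - Reads"
READ_SURCHARGE_PER_GB = {"ia": 0.01, "archive": 0.03}


def monthly_cost(storage_class, stored_gb, read_gb):
    """Storage plus read charges for one month; Tiering charges are ignored."""
    storage = STORAGE_PER_GB_MONTH[storage_class] * stored_gb
    reads = (BASE_READ_PER_GB + READ_SURCHARGE_PER_GB[storage_class]) * read_gb
    return storage + reads


def break_even_read_fraction():
    """Share of the stored data read per month at which IA and Archive cost the same."""
    storage_saving = STORAGE_PER_GB_MONTH["ia"] - STORAGE_PER_GB_MONTH["archive"]
    extra_read_cost = READ_SURCHARGE_PER_GB["archive"] - READ_SURCHARGE_PER_GB["ia"]
    return storage_saving / extra_read_cost


if __name__ == "__main__":
    for read_gb in (0, 100, 400, 1000):
        ia = monthly_cost("ia", 1000, read_gb)
        archive = monthly_cost("archive", 1000, read_gb)
        print(f"read {read_gb:>4} GB: IA ${ia:.2f}, Archive ${archive:.2f}")
    print(f"break-even: {break_even_read_fraction():.0%} of stored data read per month")
```

Under these simplified assumptions, Archive only comes out ahead if less than roughly 40% of the stored data is read back each month.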

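It basically feels like the tiering scheme of Amazon S3 is being carried over piece by piece.

Also note that "Regional (Multi-AZ) with Elastic Throughput" is a new pricing plan, under which I/O is billed even for the Standard storage class.

Under the old plan, "Regional (Multi-AZ) with legacy throughput modes", I/O on Standard costs nothing extra because it is already included, unless you buy Provisioned Throughput (guaranteed speed) outright: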

Standard (GB-Month)	$0.30
Infrequent Access (GB-Month)	$0.025
Backup - Warm / Cold (GB-Month)	$0.05 / $0.01
Provisioned Throughput (MB/s-Month)	$6.00
Infrequent Access - Reads (per GB transferred)	$0.01
Infrequent Access - Tiering (per GB transferred)*	$0.01
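
To make the difference between the two plans concrete, here is a minimal sketch that bills the same Standard-class workload under each of them. It only uses the us-east-1 prices from the tables above; the function names are hypothetical, and IA, Archive, backup, and Provisioned Throughput are ignored.

```python
STANDARD_PER_GB_MONTH = 0.30  # identical in both tables above
ELASTIC_READ_PER_GB = 0.03
ELASTIC_WRITE_PER_GB = 0.06


def legacy_bursting_bill(stored_gb):
    """Legacy throughput modes: Standard I/O is included in the storage price."""
    return STANDARD_PER_GB_MONTH * stored_gb


def elastic_bill(stored_gb, read_gb, written_gb):
    """Elastic Throughput: the same storage price plus per-GB transfer charges."""
    io = ELASTIC_READ_PER_GB * read_gb + ELASTIC_WRITE_PER_GB * written_gb
    return STANDARD_PER_GB_MONTH * stored_gb + io


if __name__ == "__main__":
    stored, read, written = 100, 500, 100  # GB, example workload
    print(f"legacy:  ${legacy_bursting_bill(stored):.2f}")
    print(f"elastic: ${elastic_bill(stored, read, written):.2f}")
```

With 100 GB stored, 500 GB read, and 100 GB written in a month, the storage part is $30 in both cases, and the Elastic Throughput plan adds $21 of transfer charges on top.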

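A look through the Internet Archive confirms that the pricing page as of 2023/11/26, a few days ago, still showed the old scheme, which means this change arrived with the current launch: "Amazon EFS Pricing".

None of the three most recent posts filed under the Amazon EFS category on the AWS blog mention this ("New – Announcing Amazon EFS Elastic Throughput", "Optimize your storage costs for rarely-accessed files with Amazon EFS Archive", and "Replication failback and increased IOPS are new for Amazon EFS"), so anyone planning to use it should watch out for it themselves?

Two new Amazon EFS launches this time: Elastic Throughput and lower latency

I have spotted two new things for Amazon EFS at this re:Invent so far. The first is "New – Announcing Amazon EFS Elastic Throughput", which introduces Elastic Throughput.

The traditional Bursting Throughput mode allocates speed in proportion to the storage you use: the baseline is 50 MiB/s per TiB, and it can burst up to 100 MiB/s per TiB: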

When burst credits are available, a file system can drive throughput up to 100 MiBps per TiB of storage, up to the Amazon EFS Region's limit, with a minimum of 100 MiBps. If no burst credits are available, a file system can drive up to 50 MiBps per TiB of storage, with a minimum of 1 MiBps.
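
The rule in the quote can be written down as a small function. This is only a sketch of the quoted numbers: the function name is made up, and the per-Region limit mentioned in the quote is not modelled.

```python
def bursting_throughput_mibps(stored_tib, has_burst_credits):
    """Throughput a Bursting-mode file system can drive, in MiB/s.

    With credits: 100 MiB/s per TiB, with a minimum of 100 MiB/s.
    Without credits: 50 MiB/s per TiB, with a minimum of 1 MiB/s.
    The regional upper limit is intentionally left out.
    """
    if has_burst_credits:
        return max(100.0, 100.0 * stored_tib)
    return max(1.0, 50.0 * stored_tib)


if __name__ == "__main__":
    for tib in (0.01, 1, 4):
        burst = bursting_throughput_mibps(tib, True)
        base = bursting_throughput_mibps(tib, False)
        print(f"{tib} TiB: burst {burst} MiB/s, baseline {base} MiB/s")
```

A small file system therefore swings between 100 MiB/s and as little as 1 MiB/s depending on whether it still has burst credits.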

Elastic Throughput, by contrast, is a high-performance mode that can deliver up to 3 GiB/s for reads and 1 GiB/s for writes:

Elastic Throughput allows you to drive throughput up to a limit of 3 GiB/s for read operations and 1 GiB/s for write operations per file system in all Regions.

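This of course comes at a price: Elastic Throughput is billed by the amount of data transferred, and at us-east-1 prices reads cost $0.03/GB and writes cost $0.06/GB.

A rough calculation suggests it suits applications that need very large, fast reads and writes over a short period. Jobs that are not time-sensitive (such as cron jobs) do not need Elastic Throughput... and it might be a decent choice for home directories?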
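
One way to redo that rough calculation is to compare Elastic Throughput against paying for Provisioned Throughput all month. The sketch below is a simplification under stated assumptions: the $6.00 per MB/s-Month figure is the us-east-1 Provisioned Throughput price from the legacy pricing table, any throughput already included with storage is ignored, and the function names are hypothetical.

```python
ELASTIC_READ_PER_GB = 0.03
ELASTIC_WRITE_PER_GB = 0.06
PROVISIONED_PER_MBPS_MONTH = 6.00  # assumption: legacy us-east-1 list price


def elastic_transfer_cost(read_gb, written_gb):
    """Monthly Elastic Throughput charge for the given transfer volume."""
    return ELASTIC_READ_PER_GB * read_gb + ELASTIC_WRITE_PER_GB * written_gb


def provisioned_cost(mb_per_sec):
    """Monthly charge for keeping this much throughput provisioned."""
    return PROVISIONED_PER_MBPS_MONTH * mb_per_sec


def cheaper_mode(read_gb, written_gb, needed_mb_per_sec):
    """Return which of the two modes is cheaper for this workload."""
    elastic = elastic_transfer_cost(read_gb, written_gb)
    provisioned = provisioned_cost(needed_mb_per_sec)
    return "elastic" if elastic < provisioned else "provisioned"


if __name__ == "__main__":
    # Short, very fast job: 1,000 GB read once, needing 1,000 MB/s while it runs.
    print(cheaper_mode(1000, 0, 1000))
    # Steady service: 50,000 GB read over the month at about 20 MB/s.
    print(cheaper_mode(50000, 0, 20))
```

The short, bursty job costs $30 in transfer charges against $6,000 of provisioned capacity, while the steady workload costs $1,500 against $120, which matches the intuition that Elastic Throughput fits brief spikes rather than constant traffic.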

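The second launch is free: a performance improvement for Amazon EFS that lowers latency, announced in "AWS announces lower latencies for Amazon Elastic File System".

First, read performance improves; judging from the description, the gain appears to come from an added cache layer: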

Amazon EFS now delivers up to 60% lower read operation latencies when working with frequently-accessed data and metadata.

Writes of small files have also been addressed:

In addition, EFS now delivers up to 40% lower write operation latencies when working with small files (<64 KB) and metadata.

However, these improvements only apply to newly created EFS file systems, and this round only covers us-east-1:

These enhancements are available automatically for all new EFS file systems using General Purpose mode in the US East (N. Virginia) Region, and will become available in the remaining AWS commercial regions over the coming weeks.

File lock limits on Amazon EFS

I saw this mentioned in "Amazon EFS now supports a larger number of concurrent file locks":

This Amazon EFS update increases the number of simultaneous file locks an NFS mount can acquire to 65,536 (from 8,192 previously), enabling Amazon EFS to be used for a broader set of applications that heavily leverage file locking (including message broker and distributed analytics applications).

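So an NFS mount previously had a cap of 8,192 simultaneous locks, and that has now been raised to 65,536.

The mention of message brokers, though, presumably means some older applications rely on NFS file locks to coordinate their work?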
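
For illustration only, this is the kind of advisory locking such an application might do on a shared mount, sketched with Python's standard `fcntl` module (Unix only). Whether any particular message broker works this way is a guess; the function name and the file used are hypothetical.

```python
import fcntl
import os


def append_with_lock(path, line):
    """Append one line while holding an exclusive advisory lock on the file."""
    with open(path, "a") as f:
        # Blocks until the lock is granted; each lock held like this
        # counts toward the per-mount limit discussed above.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the data out before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

Several writers on different machines could call `append_with_lock` on the same file under the mount point and take turns, as long as every writer uses the same locking call.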

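Some discussion of the Amazon EFS performance improvement

The previous post, "Amazon EFS performance improvement", covered the Amazon EFS performance improvement. On Hacker News, the PMT (Product-Manager-Technical) of the Amazon EFS team showed up to answer some questions: "Amazon Elastic File System Update – Sub-Millisecond Read Latency (amazon.com)". Searching for geertj should bring up his replies...

For instance, even though the post was published by Jeff Barr, it still had to be signed off by the legal team before it could go out: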

(PMT on the EFS team).

Yes, the wordings are carefully formulated as they have to be signed off by the AWS legal team for obvious reasons. With that said, this update was driven by profiling real applications and addressing the most common operations, so the benefits are real. For example, a simple WordPress "hello world" is now about 2x as fast as before.

Also, this performance improvement was achieved through a cache layer:

I'm the PMT for this project in the EFS team. The "flip the switch" part was indeed one of the harder parts to get right. Happy to share some limited details. The performance improvement builds on a distributed consistent cache. You can enable such a cache in multiple steps. First you deploy the software across the entire stack that supports the caching protocol but it's disabled by configuration. Then you turn it for the multiple components that are involved in the right order. Another thing that was hard to get right was to ensure that there are no performance regressions due to the consistency protocol.

And each AZ has its own cache:

The caches are local to each AZ so you get the low latency in each AZ, the other details are different. Unfortunately I can't share additional details at this moment, but we are looking to do a technical update on EFS at some point soon, maybe at a similar venue!

It also looks like most of the benefit comes from caching metadata:

NFS workloads are typically metadata heavy and highly correlated in time, so you can achieve very high hit rates. I can't share any specific numbers unfortunately.

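Many detailed numbers still cannot be disclosed, but knowing that it is done with a cache is already enough to roughly picture how it works behind the scenes...

Amazon EFS now offers replication

Jeff Barr announced on the official blog that Amazon EFS now offers replication: "New – Replication for Amazon Elastic File System (EFS)".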

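The post shows the configuration screen for cross-region replication.

Once created, the replica is a read-only file system.

A fail-over mechanism is also provided: after failing over, the replica changes from read-only to read-write.

Note, however, that the design is eventually consistent, with most changes expected to be replicated within a minute. That is to be expected, since otherwise latency would be too high: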

All replication traffic stays on the AWS global backbone, and most changes are replicated within a minute, with an overall Recovery Point Objective (RPO) of 15 minutes for most file systems.

Replication also does not count against the file system's burst credits or throughput, which is a somewhat unusual point:

Replication does not consume any burst credits and it does not count against the provisioned throughput of the file system.

The replication service itself carries no extra charge; you only pay for the storage EFS uses and the data transfer that replication generates:

You pay the usual storage fees for the original and replica file systems and any applicable cross-region or intra-region data transfer charges.

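Amazon EFS launches a One Zone version

Amazon EFS now offers a One Zone version, offering a lower price in exchange for lower reliability: "New – Lower Cost Storage Classes for Amazon Elastic File System".

The price is roughly 53% of the regular price (about 47% off), but note that accessing the file system from a different AZ incurs data transfer fees: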

Standard data transfer fees apply for inter-AZ or inter-region access to file systems.

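The only use I can think of right now is something like /net/tmp, where losing the data does not matter; given the reliability, I cannot think of other uses for the moment...

AWS Transfer Family can now be backed by EFS

Previously the backend of AWS Transfer Family could only be Amazon S3; now it has been announced that Amazon EFS can be used as well: "New – AWS Transfer Family support for Amazon Elastic File System".

Neither EFS nor S3 has a capacity limit, but EFS can be mounted directly on a system and used as an ordinary file system, so it is basically more convenient. The trade-off is a considerably higher storage cost per unit...

EFS support makes low-volume processing a good deal more convenient, namely workflows where the processed output is stored elsewhere and the uploaded files can then be deleted... If the uploaded files need to be kept, S3 is the better fit.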

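AWS raises the minimum throughput of Amazon EFS

AWS announced a higher minimum throughput for Amazon EFS: "Amazon Elastic File System increases file system minimum throughput".

The numbers in the first paragraph are pretty much the whole story: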

Amazon Elastic File System (Amazon EFS) file systems using the default bursting throughput mode now have a minimum throughput of 1 MiB/s. All EFS bursting mode file systems (regardless of size) can drive 100 MiB/s of throughput, and file systems with more than 1TiB of Standard class storage can drive 100 MiB/s per TB when burst credits are available. This change increases the minimum throughput from 50KiB/s per GiB of Standard class storage to a fixed minimum of 1 MiB/s for file systems with less than 20 GiB of Standard class storage, when burst credits are exhausted.

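The guaranteed minimum used to be 50 KiB/s per GiB, which means you had to store 20 GiB to get roughly 1 MiB/s (20 × 50 KiB/s = 1,000 KiB/s). Now file systems with less than 20 GiB are raised straight to a fixed 1 MiB/s.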
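
The before-and-after can be sketched in a few lines. The function names are made up, and the sketch only covers the minimum throughput once burst credits are exhausted, using the figures from the quoted announcement.

```python
KIB_PER_MIB = 1024


def old_min_throughput_kibps(standard_gib):
    """Before the change: 50 KiB/s per GiB of Standard storage, no floor."""
    return 50 * standard_gib


def new_min_throughput_kibps(standard_gib):
    """After the change: the same rate, but never below a fixed 1 MiB/s."""
    return max(1 * KIB_PER_MIB, 50 * standard_gib)


if __name__ == "__main__":
    for gib in (1, 5, 20, 40):
        old = old_min_throughput_kibps(gib)
        new = new_min_throughput_kibps(gib)
        print(f"{gib:>2} GiB: old {old} KiB/s, new {new} KiB/s")
```

A 1 GiB file system goes from 50 KiB/s to 1,024 KiB/s, while anything well above 20 GiB sees no difference.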

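This makes things a bit easier for users who are just getting started, but the main convenience of EFS is still sharing across machines; for performance, a locally attached EBS volume is still much better (because the OS can cache).

I once put /home on EFS on AWS, and because all I/O had to go over the network, compiling with pyenv was extremely slow. I later took a day to move everything back to EBS, and it got much faster...

Lambda can now mount EFS

AWS Lambda can now mount Amazon EFS: "New – A Shared File System for Your Lambda Functions".

It feels a bit like the early days when Amazon EC2 could only persist data to Amazon S3 and later gained EBS support: many programs can now work with the file system directly through built-in libraries, without pulling in an AWS-specific library to talk to Amazon S3.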
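
As a small illustration of that point, the handler below keeps state across invocations using nothing but the Python standard library. It is a sketch under assumptions: `/mnt/efs` is an assumed mount path, `EFS_MOUNT_PATH` is a hypothetical environment variable introduced here for testing, and there is no locking, so concurrent invocations could overwrite each other's count.

```python
import json
import os
from pathlib import Path


def _mount_path():
    # Hypothetical variable name; defaults to an assumed mount path.
    return Path(os.environ.get("EFS_MOUNT_PATH", "/mnt/efs"))


def handler(event, context):
    """Count invocations in a JSON file stored on the shared file system."""
    state_file = _mount_path() / "invocations.json"
    count = 0
    if state_file.exists():
        count = json.loads(state_file.read_text())["count"]
    count += 1
    state_file.write_text(json.dumps({"count": count}))
    return {"count": count}
```

The same code runs unchanged on a laptop by pointing `EFS_MOUNT_PATH` at a local directory, which is exactly the convenience of having an ordinary file system instead of an object-store API.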

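With a file system available, all sorts of hacks immediately come to mind, such as running serverless PHP on Lambda; we will probably see some very "creative" uses before long...

Amazon EFS integration on AWS in FreeBSD 12.2 (autofs)

Colin Percival mentioned that autofs in FreeBSD 12.2 integrates with Amazon EFS, which makes mounting more convenient: "Some new FreeBSD/EC2 features: EFS automount and ebsnvme-id".

To use it, configure autofs first, then enable it: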

# echo '/efs -efs' > /etc/auto_master 
# sysrc autofs_enable="YES"

After a reboot, simply accessing /efs/FSID mounts the EFS file system:

Having done this, any access to the path /efs/FSID (e.g., /efs/fs-01234567) will automatically and transparently mount that filesystem.

Combined with the existing support for EBS and ephemeral disks, the storage side now has pretty much everything it should:

Using the tool and some devd magic, FreeBSD now maintains a tree under /dev/aws/disk containing the symlinks of the forms

  • /dev/aws/disk/ebs/vol-0123456789abcdef
  • /dev/aws/disk/linuxname/sdh
  • /dev/aws/disk/ephemeral/SERIALNO