AWS 上用空間買 IOPS 的故事...

在「A web performance issue」這邊講到 Mozilla 的系統產生效能問題,後續的 trouble shooting 以及解決問題的方案。

這個系統跑在 AWS 上,在一連串確認後發現是 RDS 所使用的 EBS 的 IOPS 滿了:

After reading a lot of documentation about Amazon’s RDS set-up I determined that slow downs in the database were related to IOPS spikes. Amazon gives you 3 IOPS per Gb and with a storage of 1 Terabyte we had 3,000 IOPS as our baseline. The graph below shows that at times we would get above that max baseline.

然後大家對於解法都差不多,因為 Provisioned IOPS 太貴,所以直接加大空間換 IOPS 出來 (因為 General SSD 裡 1 GB 給 3 IOPS):

To increase the IOPS baseline we could either increase the storage size or switch from General SSD to Provisioned IOPS storage. The cost of the different storage type was much higher so we decided to double our storage, thus, doubling our IOPS baseline. You can see in the graph below that we’re constantly above our previous baseline. This change helped Treeherder’s performance a lot.

然後再設警告機制,下次就可以提前再拉昇:

In order to prevent getting into such a state in the future, I also created a CloudWatch alert. We would get alerted if the combined IOPS is greater than 5,700 IOPS for 6 datapoints within 10 minutes.

不過 General SSD 的 IOPS 是沒有 100% 保證的,只有這樣寫:

AWS designs gp2 volumes to deliver 90% of the provisioned performance 99% of the time.

大多數的情況應該是夠用啦...

Backblaze 開了歐洲區機房

Backblaze 開了歐洲機房,所以包括了一般性的 Computer BackupB2 Cloud Storage 都可以選擇要放哪邊了...

歐洲的點是放在荷蘭:

Big news: Our first European data center, in Amsterdam, is open and accepting customer data!

價錢也都跟美國的相同:

Whether you choose EU Central or US West, your pricing for our products will be unchanged:

對於在意資料放美國機房的問題應該有緩解一些...

Backblaze B2 的 Copy File API 終於開放

BackblazeB2 算是我還蠻愛用來丟一些東西的地方 (配合他們與 Cloudflare 合作的免費頻寬)。

先前 B2 一直沒有複製檔案的功能,如果要有同樣檔案,變成得自己再上傳一次,這對於網路沒有很快的使用者會很痛苦,現在總算是提供 API 可以直接複製了:「B2 Copy File is Now Public」。

這個功能主要的文件在「b2_copy_file」。

另外這次也推出了「b2_copy_part」,針對檔案的合併所提供的 API。

AWS 的 EBS 預設型態改為 GP2 (SSD)

AWS 宣佈 EBS 的預設型態從 Standard 變成 GP2:「EBS default volume type updated to GP2」。

包括 web console 與 API 的預設值都改成 GP2:

The AWS console defaults to GP2 in all regions. On July 29th the default EBS volume type was updated in thirteen regions from Standard to GP2. Now AWS API calls for volume, image, and instance creation also default to GP2 in all regions.

GP2 是 SSD,所以可以提供比較低的 latency,而另外一個用 GP2 的好處是 i/o 的費用已經含在內了 (Standard 會另外收取費用),對於成本估算會比較簡單一些,尤其是 i/o 量比較大的時候。

RDS 支援 Storage Auto Scaling

Amazon RDS 推出了 Storage Auto Scaling:「Amazon RDS now supports Storage Auto Scaling」。

看起來傳統 RDBMS 類的都支援 (也就是非 Aurora 的這些):

Starting today, Amazon RDS for MariaDB, Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for SQL Server and Amazon RDS for Oracle support RDS Storage Auto Scaling.

仔細看了一下新聞稿,裡面都只有提到 scale up,沒有提到 scale down,這個功能應該是只會提昇不會下降,所以要注意突然用很多空間,再砍掉後的問題:

RDS Storage Auto Scaling automatically scales storage capacity in response to growing database workloads, with zero downtime.

RDS Storage Auto Scaling continuously monitors actual storage consumption, and scales capacity up automatically when actual utilization approaches provisioned storage capacity.

除了香港外的所有商業區域都提供:

RDS Storage Auto Scaling is available in all commercial AWS regions except in Asia Pacific (Hong Kong) and AWS GovCloud.

Amazon EBS 的預設加密機制

EBS 有選項可以預設開加密了:「New – Opt-in to Default Encryption for New EBS Volumes」。

不算太意外的,這個選項要一區一區開:

Per-Region – As noted above, you can opt-in to default encryption on a region-by-region basis.

也不太意外的,無法完全公開共用:

AMI Sharing – As I noted above, we recently gave you the ability to share encrypted AMIs with other AWS accounts. However, you cannot share them publicly, and you should use a separate account to create community AMIs, Marketplace AMIs, and public snapshots. To learn more, read How to Share Encrypted AMIs Across Accounts to Launch Encrypted EC2 Instances.

然後有些服務有自己的 EBS 設定,這次不受影響。而有些底層其實是用 EC2 的服務 (然後開 EBS) 就會直接套用了:

Other AWS Services – AWS services such as Amazon Relational Database Service (RDS) and Amazon WorkSpaces that use EBS for storage perform their own encryption and key management and are not affected by this launch. Services such as Amazon EMR that create volumes within your account will automatically respect the encryption setting, and will use encrypted volumes if the always-encrypt feature is enabled.

用 Google Docs 惡搞的方式...

看到「UDS : Unlimited Drive Storage」這個專案,利用 Google Docs 存放資料。主要的原因是因為 Google Docs 不計入 Google Drive 所使用的空間:

Google Docs take up 0 bytes of quota in your Google Drive

用這個方法可以存放不少大檔案 (像是各種 ISO image),讓人想起當年 Love Machine 的玩法 (不知道的人可以參考「愛的機器 Love machine」這篇),切割檔案後傳到某些空間以提供下載?只是這邊是用 base64 放到 Google Docs 上...

base64 的資料會比原始資料大 33%,而 Google Docs 單篇的上限大約是 710KB:

Size of the encoded file is always larger than the original. Base64 encodes binary data to a ratio of about 4:3.

A single google doc can store about a million characters. This is around 710KB of base64 encoded data.

方法不是太新鮮,但是讓人頗懷念的... XD

Amazon S3 淘汰 Path-style 存取方式的新計畫

先前在「Amazon S3 要拿掉 Path-style 存取方式」提到 Amazon S3 淘汰 Path-style 存取方式的計畫,經過幾天後有改變了。

Jeff Barr 發表了一篇「Amazon S3 Path Deprecation Plan – The Rest of the Story」,裡面提到本來的計畫是 Path-style model 只支援到 2020/09/30,被大幅修改為只有在 2020/09/30 後建立的 bucket 才會禁止使用 Path-style:

In response to feedback on the original deprecation plan that we announced last week, we are making an important change. Here’s the executive summary:

Original Plan – Support for the path-style model ends on September 30, 2020.

Revised Plan – Support for the path-style model continues for buckets created on or before September 30, 2020. Buckets created after that date must be referenced using the virtual-hosted model.

這樣大幅降低本來會預期的衝擊,但 S3 團隊希望償還的技術債又得繼續下去了... 也許再過個幾年後才會再被提出來?

AWS 推出更便宜的儲存方案 Glacier Deep Archive

AWS 推出的這個方案價錢又更低了:「New Amazon S3 Storage Class – Glacier Deep Archive」。

在這之前在 us-east-1S3 最低的方案是 Glacier Storage,單價是 USD$0.004/GB (也就是 $4/TB)。

而這次推出的 Glacier Deep Archive Storage 在同一區則是直接到 USD$0.00099/GB ($0.99/TB),大約是 1/4 的價錢。

Glacier Deep Archive 在取得時 first byte 的保證時間是 12 小時,另外最低消費是 180 天:

Retrieval time within 12 hours

先前就有的 Glacier Storage 則是可以在取用時設定取得的 pattern (會影響 first byte 的時間),而最低消費是 90 天:

Configurable retrieval times, from minutes to hours

Pricing for each of these metrics is determined by the speed at which data is requested based on three options. "Expedited" queries <250 MB are typically returned in 1-5 minutes. "Standard" queries are typically returned in 3-5 hours. "Bulk" queries are typically returned in 5-12 hours.

多一個更便宜的選擇可以用。