IPv6 Excuse Bingo (IPv6 理由伯賓果?)

在「AWS IPv4 Estate Now Worth $4.5B (toonk.io)」這邊的討論意外的看到「IPv6 Excuse Bingo」這個網站...

這個 bingo 是動態的,每次 reload 都會有不同的版本出來,理由超多...

不過實際用 IPv6 network 後會發現各種鳥問題真的多,之前 (到現在) 最經典的就是 HECogent 的 IPv6 network 因為錢的問題談不攏而不通的問題,在維基百科上面就有提到從 2009 年開始就不通了:

There is a long-running dispute between the provider Cogent Communications and Hurricane Electric. Cogent has been refusing to peer settlement-free with Hurricane Electric since 2009.

所以夠大的服務如果要弄 IPv6 都得注意到這點,像是 CDN 或是 GeoIP-based load balancer 就不能把 Cogent 的用戶導到 HE 的位置上面,反之亦然。

不過話說 AWS 手上有 128M 個 IPv4 address,整個 IPv4 address 的空間也才 4.29B,也就是說光 AWS 手上就有大約 3% 的 IPv4 address 空間,如果扣掉不可用的區段的話就更高了...

在實體機上跑 K8s 的記錄

Hacker News 上翻到「Bare-Metal Kubernetes, Part I: Talos on Hetzner (datavirke.dk)」,看起來作者試著記錄下實體機器上跑 K8s 的各種事情,系列文章的第一篇在「Bare-metal Kubernetes, Part I: Talos on Hetzner」這邊。

跟朋友聊天的時候會聊到 K8s 就是個 mini AWS,裡面包括了雲端基本該有的 building blocks,對於有自己地端機房的人可以用純軟體的方式把這些東西組出來,不用去買特殊的硬體設備來架設 (像是 F5 的 load balancer)。

另外一種用 K8s 的情境是需要真正的跨環境 (像是 container 可以自由的切來切去),像是地端與雲端的整合,或是雙雲端系統的整合。

但畢竟是自己架設一套 mini AWS,複雜度當然就不是一般架設服務可以比擬的。作者用了八篇把各個角落都帶過一次,還寫了一篇炸掉的處理:「Bare-metal Kubernetes: First Incident」。

一般的情況下應該是不需要用到這個東西,但記錄一下。

EC2-Classic 完全退役

Amazon 家的老大 (CTO & VP) Werner Vogels 貼了關於 EC2-Classic 完全退役的文章:「Farewell EC2-Classic, it’s been swell」。

2021 年的時候 AWS 的 Jeff Barr 宣布了 EC2-Classic 的退役計畫:「EC2-Classic Networking is Retiring – Here’s How to Prepare」,我當時也整理了「AWS 宣佈 EC2-Classic 退役的計畫」這篇。

當時的時間表是期望在 2022/08/15 全部退役:

On August 15, 2022 we expect all migrations to be complete, with no remaining EC2-Classic resources present in any AWS account.

但後來還是晚了整整一年,到 2023/08/15 (剛好晚了一年) 才全部退役:

On August 15, 2023, we shut down the last instance of Classic.

而公告上面的更新則是在 2023/08/23 更新:

Update (August 23, 2023) – The retirement announced in this blog post is now complete. There are no more EC2 instances running with EC2-Classic networking.

因為真的太久沒用了,看了 Werner Vogels 的描述才能回想起來當時的架構,似乎是有一大鍋這件事情,靠 security group 拆開大家:

When we launched EC2 in 2006, it was one giant network of 10.2.0.0/8. All instances ran on a single, flat network shared with other customers. It exposed a handful of features, like security groups and Public IP addresses that were assigned when an instance was spun up.

順便提一下 Werner Vogels 文章的開頭提到了 AWS 很少將服務退役,即使是 2007 年推出的 Amazon SimpleDB 也還是繼續在跑,即使現在主推的是 DynamoDB

Retiring services isn’t something we do at AWS. It’s quite rare. Companies rely on our offerings – their businesses literally live on these services – and it’s something that we take seriously. For example SimpleDB is still around, even though DynamoDB is the “NoSQL” DB of choice for our customers.

用「List of AWS Services Available by Region」這頁查了一下,SimpleDB 的區域意外的還不少,在 us-east-1us-west-1us-west-2ap-southeast-1ap-southeast-2ap-northeast-1eu-west-1 以及 sa-east-1

不過話說這開頭是不是在偷臭隔壁棚 XDDD

Amazon SES 寄到 Gmail 受到阻擋的情況

我自己沒遇過,但是 Hacker News 上看到有人有遇到,所以記錄起來:「Tell HN: Gmail rate limiting emails from AWS SES」。

Amazon SES 預設是共用 IP pool,所以遇到這種情況不算太意外,但應該是暫時性的,不過發問的作者有提到後來的解法是花 US$25/mo 使用 Dedicated IP 解決 IP reputation 的問題 (在 id=37177533 這邊):

Thanks you all for comments. I have made a decision to subscribed to dedicated IPs (credits: @slau).

The differentiating factor between our current AWS SES plan and the competitors (mentioned in the comments) is having a dedicated IP. With our current volume, none of the competitors are anyway near AWS SES costs. So, moving to a dedicated IPs thats cost 25$ extra not only solves our issue, but also no change in code/infrastructure.

記得以前另外一個教訓是,寄信還是儘量用 IPv4 address 去寄,因為 IPv6 address 的 reputation 得養頗久... 不過這個也是很久前的事情了。

AWS 弄出了 AWS Dedicated Local Zones,很像 AWS Outposts...

AWS 推出了 AWS Dedicated Local Zones:「Announcing AWS Dedicated Local Zones」。

先講 AWS Outposts,他就是提供 AWS 自己的硬體,放到用戶的機房裡面,所以依照需求有不同大小的機器,甚至是整個機櫃:

AWS Outposts is a family of fully managed solutions delivering AWS infrastructure and services to virtually any on-premises or edge location for a truly consistent hybrid experience. Outposts solutions allow you to extend and run native AWS services on premises, and is available in a variety of form factors, from 1U and 2U Outposts servers to 42U Outposts racks, and multiple rack deployments.

在「What is AWS Outposts?」這邊有詳細列出有哪些服務可以跑在上面,可以看到主要就是基礎服務,以及一些吃 local 特性的服務。

另外在「How AWS Outposts works」這邊可以看出架構上會在同一個 VPC 裡面,但是不屬於同一個 AZ 下:

而這次推出的 Dedicated Local Zones 還是有些地方沒看懂跟 AWS Outposts 差在哪裡,看起來很像是重新包裝而已...

首先是首頁提到的,這邊有提到 AWS Nitro System,所以猜測這是 AWS 的硬體,而不是自己的硬體:

Build with AWS managed secure cloud infrastructure

Benefit from the same AWS security standards that apply to AWS Regions and AWS Local Zones and are delivered with the security of the AWS Nitro System to help ensure confidentiality and integrity of customer data.

另外在公告裡面提到的服務,跟 Outposts 有些差異:

AWS services, such as Amazon EC2, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Block Store (Amazon EBS), Elastic Load Balancing (ELB), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), and AWS Direct Connect are available in Dedicated Local Zones.

另外在「AWS Dedicated Local Zones FAQs」這邊則試著說明兩者差異,但就這些句子看起來,只是不同面向的東西:

Q: How are AWS Dedicated Local Zones different from AWS Outposts?

AWS Outposts is designed for workloads that need to remain on-premises due to latency requirements, where customers want those workloads to run seamlessly with their other workloads in AWS. AWS Outposts racks are fully managed and configurable compute and storage racks built with AWS-designed hardware that allow customers to run compute and storage on-premises, while seamlessly connecting to AWS’s broad array of services in the cloud.

AWS Dedicated Local Zones are designed to eliminate the operational overhead of managing on-premises infrastructure at scale. Some customers have long-term, complex cloud migration projects and need infrastructure that seamlessly scales to support their large-scale demand. Some of these customers represent the interests of a customer community and also need multi-tenancy features to efficiently coordinate across their stakeholders. Dedicated Local Zones enable these customers to reduce the administrative burden of managing their own infrastructure on-premises with scalable, resilient, and multitenant cloud infrastructure that is fully AWS-managed and built exclusively for their use.

另外回到首頁看使用單位,目前是 GovTech Singapore,看起來就是重新包裝?

另外一個猜測是在客戶的機器上面裝 AWS Nitro System,然後裝 AWS 的軟體?這就有點怪了,而且這樣相容性之類的問題也頗麻煩,也許要指定配合的機種?

等有機會遇到的時候再跟 AWS 的人問問看好了,目前也還用不到...

Backblaze 宣佈漲價

Backblaze 宣佈漲價:「Backblaze Product and Pricing Updates」。

其中 B2 Cloud Storage 這邊最主要的改變在 Storage 的部分,這次漲了 20%,從 $5/TB 變成 $6/TB:

Storage Price: Effective October 3, 2023, we are increasing the monthly pay-as-you-go storage rate from $5/TB to $6/TB. The price of B2 Reserve will not change.

頻寬的部分增加了一些 free quota,不過在意頻寬成本的人都會用 Cloudflare 之類的方式避開了,這個其實沒有什麼差... (因為 Backblaze 流出到 Cloudflare 的流量是不計費的)

Backblaze Computer Backup 的部分沒有什麼在碰,但看起來最主要的改變是從現有的 $7/mo 漲到 $9/mo,大約 28.57%:

Computer Backup Pricing: Effective October 3, new purchases and renewals will be $9/month, $99/year, and $189 for two-year subscription plans, and Forever Version History pricing will be $0.006/GB/month.

漲幅其實頗高的,但漲完後還是市場上比較低價的產品...

Amazon EBS 十五週年,以及一些數據

AWS 的 SVP James Hamilton 寫了一篇「Amazon Elastic Block Store at 15 Years」在講 Amazon EBS 的十五週年,裡面提到了一些數字。

目前的每天的 IOPS 是 100 trillion,如果攤平的話大約是 11.57 billion IOPS/sec,如果很單純以目前高階 NVMe 卡大約是 1M IOPS/sec 這個數量級來算的話,在沒有任何 redundancy 架構,需要的量大約是萬張?以 AWS 的量感覺好像是個合理的數字... 考慮到 IOPS 主力應該是 SSD 或 NVMe 類的應用,加上 redundancy 以及保留 burst 空間的架構,最少有個十萬張... 應該不算有問題。

I asked the EBS team to quantify customer usage in 2023, the 15th year of EBS. Focusing first on daily usage, EBS delivers more than 100 trillion input/output operations per day.

另外一個是傳輸量,每天有 13EB,攤平大約是 150.46TB/sec,如果用上面提到的十萬顆來攤的話大約是需要 1.5GB/sec 的速度,拿數量級來算應該是差不多。

Perhaps even more staggering is the fact that EBS transfers more than 13 exabytes of data for customers every day.

另外一個是百萬客戶 (也許是帳號) 每天會開出三億個 EBS storage,我猜這跟機器的起起落落有關,現在 EC2 開機主要都是要掛 EBS 的 boot disk 了:

Continuing to focus on daily usage, millions of customers use EBS daily, and these millions of customers create more than 390 million EBS storage volumes each day.

的確如同 James Hamilton 說的,EBS 現在已經變成一個蠻重要的基礎建設了,很多 AWS 上的服務都是架在他上面,像是 RDS 利用了 EBS 的 block replication 組出了 readonly repica,而非走傳統的 replication 路子。

HashiCorp 內 scale 的方法

去日本前在 Hacker News 上看到「Squeeze the hell out of the system you have」這篇,用作者的名字翻了一下 LinkedIn,看起來講的是 HashiCorpSRE 事情:「Dan Slimmon」。

看的時候可以注意一下,文章裡面的觀點未必要認同,大多是他自己的看法或是想法,但裡面提到很多發生的事情,可以知道 HashiCorp 內目前搞了什麼東西。

從 LinkedIn 的資料可以看到他從 2019 就加入 HashiCorp 了,所以文章一開頭這邊講的同事應該就是 HashiCorp 的同事:

About a year ago, I raised a red flag with colleagues and managers about Postgres performance.

往下看可以看到他們有遇到 PostgreSQL 的效能問題,然後每次都是以 scale up (加大機器) 的方式解決,考慮到 HashiCorp 的產品線,我會猜應該是 Terraform Cloud 這個產品線遇到的狀況。

然後在後面提到的解法則是提到了 codebase 是 Rails,他們花了三個月的時候不斷的重複 profiling + optimizing,包括 SQL 與 PostgreSQL 的設定:

Two engineers (me and my colleague Ted – but mostly Ted) spent about 3 months working primarily on database performance issues. There was no silver bullet. We used our telemetry to identify heavy queries, dug into the (Rails) codebase to understand where they were coming from, and optimized or eliminated them. We also tuned a lot of Postgres settings.

另外一組人則是弄了 read-only replication server,把 loading 拆出去:

Two more engineers cut a path through the codebase to run certain expensive read-only queries on a replica DB. This effort bore fruit around the same time as (1), when we offloaded our single most frequent query (a SELECT triggered by polling web clients).

這兩個方法大幅降低了資料庫的 peak loading,從 90% 降到 30%:

These two efforts together reduced the maximum weekly CPU usage on the database from 90% to 30%.

可以看到都還沒用到 sharding 的技巧,目前硬體的暴力程度可以撐很久 (而且看起來是在沒有投入太多資源在 DB-related tuning 上面),快撞到的時候也還可以先用 $$ 換效能,然後投入人力開始 profiling 找問題...

Amazon EC2 推出 m7a 系列的機種

上一篇完全讀錯段落了,重寫...

Amazon EC2 推出了新的 m7a 的機種:「New – Amazon EC2 M7a General Purpose Instances Powered by 4th Gen AMD EPYC Processors」。

號稱與 m6a 相比有 50% 效能上的提升:

Today, we’re announcing the general availability of new, general purpose Amazon EC2 M7a instances, powered by the 4th Gen AMD EPYC (Genoa) processors with a maximum frequency of 3.7 GHz, which offer up to 50 percent higher performance compared to M6a instances.

不過查了一下價錢,us-east-1m6a.large 是 $0.0864/hr,m7a.large 則是 $0.11592/hr (都是 2 vCPU + 8GB RAM),漲了 34% 左右,如果計算 price performance 的話大約是 10%~15%?的確是不高所以不提 price performance,不過這次 m7a 提供了更小台的 m7a.medium (1 vCPU + 4GB RAM) 來補這塊 (m6a 最小的是 m6a.large),$0.05796/hr。

這樣看起來新的機種對於需要單核效能的應用應該會不錯?

再來是可以租到的區域,目前看起來只有歐美的傳統大區有,亞洲區還要再等等:

Amazon EC2 M7a instances are now available today in AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), and EU (Ireland).

Mountpoint for Amazon S3 正式推出 (GA) 了...

三月的時候提到 AWS 搞出了自己的 Amazon S3FUSE 實作:「AWS 官方推出了自己的 Amazon S3 FUSE 套件」,現在 GA 了:「Mountpoint for Amazon S3 – Generally Available and Ready for Production Workloads」。

看起來 s3fs-fuse 還是一直有在更新,然後翻了翻好像沒看到兩者的比較... (可能是之前 Mountpoint for Amazon S3 在 alpha 版的關係?)

記得這個功能拿來塞些東西還蠻好用的...