AWS 東京區的機器有 μs 等級的 Amazon Time Sync Service 可以用

AWS 宣佈 μs 等級的 Amazon Time Sync Service,也就是比 ms 再多三個零的等級:「Amazon Time Sync Service now supports microsecond-accurate time」。

一般的 Time Sync Service 服務大多是 ms 等級,也就是 10-3 秒這個等級,μs 則是到了 10-6...

但這其實有點詭異,目前只有 AWS 東京區有這個服務,而且限制在 r7g 的機器上才能用:

Amazon Time Sync with microsecond-accurate time is available starting today in the Asia Pacific (Tokyo) Region on all R7g instances, and we will be expanding support to additional AWS Regions and EC2 Instance types.

也就是說 us-east-1us-west-2 這些第一線熱門的區域沒有,另外只有特別的 ARM 機種,而且是 memory-intensive 的 r7g 才能用?

先放著看看 XDDD

Amazon EBS 在 Compliance mode 下的 Snapshot Lock

Jeff Barr 寫了「New – Amazon EBS Snapshot Lock」這篇,介紹 Amazon EBS 的新功能 Snapshot Lock。

從名字就知道是鎖住 snapshot 不讓人刪除,比較特別的是有兩個模式,第一個是 Governance,這個模式下就只是防止誤刪除的情況:

This mode protects snapshots from deletions by all users. However, with the proper IAM permissions, the lock duration can be extended or shortened, the lock can be deleted, and the mode can be changed from Governance mode to Compliance mode.

比較重要的是第二個模式 Compliance,在超過猶豫期 (cooling-off period) 後就不能動了,就算你有最大的權限 (我猜是連 root account 也不能動),唯一能操作的只有延長 lock 時間:

This mode protects snapshots from actions by the root user and all IAM users. After a cooling-off period of up to 72 hours, neither the snapshot nor the lock can be deleted until the lock duration expires, and the mode cannot be changed. With the proper IAM permissions the lock duration can be extended, but it cannot be shortened.

的確是遵循法規用的功能...

AWS 宣布 EKS 支援期從 14 個月變成 26 個月

有訂 RSS feed 但應該是漏看了,後來在 X (Twitter) 上看到 qrtt1 轉貼 ingramchen 的貼文才注意到的,AWS 宣布 EKS 支援期從本來的 14 個月 (跟 k8s 官方相同) 變成 26 個月 (多了一年):

公告在「Amazon EKS Extended support for Kubernetes Versions now available in preview」以及「Amazon EKS extended support for Kubernetes versions available in preview」這邊。

這邊查了資料發現有點複雜,KEP-2572: Defining the Kubernetes Release Cadence 這邊有正式的說明,但「Kubernetes | endoflife.date」這邊比較清楚 14 個月是怎麼來的:

Kubernetes follows an N-2 support policy (meaning that the 3 most recent minor versions receive security and bug fixes) along with a 15-week release cycle. This results in a release being supported for 14 months (12 months of support and 2 months of upgrade period).

雖然是付費功能,但目前這個功能是 preview,代表不受到 SLA 之類的保障,應該是之後成熟了再看情況變成 GA:

Today, we’re announcing the preview of Amazon Elastic Kubernetes Service (EKS) extended support for Kubernetes versions. You can now run Amazon EKS clusters on a Kubernetes version for up to 26 months from the time the version is generally available on Amazon EKS. Extended Support is available as a free preview for all Amazon EKS customers, starting today with Kubernetes version 1.23.

算是還不錯的消息?就目前看到的經驗,每次的升級都會爛掉不少東西,所以沒用到新功能卻被迫要更新的次數可以降低一些,總是好事...

Google SRE 團隊整理出過去二十年的十一條心得

Google 的 SRE 團隊整理出過去二十年的心得,當看故事的心態在看的:「Lessons Learned from Twenty Years of Site Reliability Engineering」,在 Hacker News 上也有討論:「Lessons Learned from Twenty Years of Site Reliability Engineering (sre.google)」。

裡面的項目大多都會在公司成長時不斷的導入,都是夠大就會遇到的。

比較有趣的是第六條,這是唯一一條全部都用大寫字母列出來的:

COMMUNICATION CHANNELS! AND BACKUP CHANNELS!! AND BACKUPS FOR THOSE BACKUP CHANNELS!!!

到 Google 這個規模的架構,這邊就會規劃找完全獨立於 Google 架構的方案來用;我猜應該是傳統的 colocation 機房 (像是 AT&T 之類的),上面跑 IRC server 之類的?

在 Hacker News 上面也有其他人提到 Netflix 也有類似的規劃,需要有一個備援的管道是完全獨立於 AWS 的;另外同一則 comment 裡也有提到 Reddit 的作法是在辦公室裡面放 IRC server 備援:

Yes! At Netflix, when we picked vendors for systems that we used during an outage, we always had to make sure they were not on AWS. At reddit we had a server in the office with a backup IRC server in case the main one we used was unavailable.

IRC 還是很好用的 XD

AWS 要在歐洲建立一個完全獨立的 Cloud 系統

CNBC 上看到的新聞,AWS 打算在歐洲建立一個完全獨立的 Cloud 系統:「Amazon launches European ‘sovereign’ cloud as EU data debate rages」。

AWS European Sovereign Cloud 會是完全獨立的 cloud:

Amazon on Wednesday said it will launch an independent cloud for Europe aimed at companies in highly-regulated industries and the public sector.

這邊講的「完全獨立」,除了東西都放在歐洲以外,連員工都是歐盟員工:

Customers of the new system will be able to keep certain data in the European Union and only EU-resident AWS employees who are located in the 27-nation bloc will have control of the operations and support for the sovereign cloud.

這個作法倒是頗特別的,看起來是想要試著說服歐盟這樣是 OK 的?推出的時候可以看看還有什麼特別的東西?

Amazon RDS 的 TLS 連線所使用的 CA 要更新了

Amazon RDSTLS (SSL) 連線所使用的 CA 要更新了:「Rotate Your SSL/TLS Certificates Now – Amazon RDS and Amazon Aurora Expire in 2024」。

如果沒有開 TLS 連線的話是不受影響 (像是內網裸奔),但如果有在用 TLS 的話就要注意一下了,看起來得手動更新處理。

比較特別的是新的 CA 簽的超長:

Most SSL/TLS certificates (rds-ca-2019) for your DB instances will expire in 2024 after the certificate update in 2020. In December 2022, we released new CA certificates that are valid for 40 years (rds-ca-rsa2048-g1) and 100 years (rds-ca-rsa4096-g1 and rds-ca-ecc384-g1). So, if you rotate your CA certificates, you don’t need to do It again for a long time.

現有的 rds-ca-2019 可以在 https://s3.amazonaws.com/rds-downloads/rds-ca-2019-root.pem 這邊取得,用 openssl x509 -in rds-ca-2019-root.pem -text 可以看到資料。

crt.sh 上翻過一些字串,沒看到被簽的記錄,所以看起來無法透過一般 trusted store 裡面的 Root CA 一路信任下來。

新的 key 應該也是 Private Root CA,從名字看起來應該是對應的 key algorithm。其中 RSA 2048 的簽了 40 年,而 RSA 4096 與 ECC 384 的簽了 100 年,雖然說是自家弄的 CA,但目前的 compliance 沒有要求 key rotation 嗎...

Anyway,常用的區域基本上都是 August 22, 2024 這個日期,大約還有九個多月的時間更新,依照 AWS 的慣例,後面應該還會提醒幾次:

話說 2020 年的時候也有更新,當時是 Jeff Barr 出來說明的:「Urgent & Important – Rotate Your Amazon RDS, Aurora, and Amazon DocumentDB (with MongoDB compatibility) Certificates」,現在看起來一些常態性的說明都陸續交棒給 Channy Yun 了...

不過這次這樣搞 40 年 & 100 年,後續要更新應該都是演算法的推進了,比較不會是要到期...

Cavium (被 Marvell 併購) 在 Snowden leak 中被列為 SIGINT "enabled" vendor

標題可能會有點難懂,比較簡單的意思就是在 Snowden 當年 (2013) 洩漏的資料裡面發現了不太妙的東西,發現 Cavium (現在的 Marvell) 的 CPU 有可能被埋入後門,而他們家的產品被一堆廠商提供的「資安產品」使用。

出自 X (Twitter) 上面提到的:

這段出可以從 2022 年的「Communication in a world of pervasive surveillance」這份文件裡面找到,就在他寫的 page 71 (PDF 的 page 90) 的 note 21:

While working on documents in the Snowden archive the thesis author learned that an American fabless semiconductor CPU vendor named Cavium is listed as a successful SIGINT "enabled" CPU vendor. By chance this was the same CPU present in the thesis author’s Internet router (UniFi USG3). The entire Snowden archive should be open for academic researchers to better understand more of the history of such behavior.

Ubiquiti 直接中槍...

而另一方面,在 Hacker News 上的討論「Snowden leak: Cavium networking hardware may contain NSA backdoor (twitter.com/matthew_d_green)」就讓人頭更痛了,像是當初 Cavium 就有發過新聞稿提到他們是 AWS CloudHSM 的供應商:「Cavium's LiquidSecurity® HSM Enables Hybrid Cloud Users to Synchronize Keys Between AWS CloudHSM and Private Clouds」。

而使用者也確認有從 log 裡面看到看到 Cavium 的記錄:

Ayup. We use AWS CloudHSM to hold our private signing keys for deploying field upgrades to our hardware. And when we break the CI scripts I see Cavium in the AWS logs.

Now I gotta take this to our security team and figure out what to do.

居然是 CloudHSM 這種在架構上幾乎是放在 root of trust 上的東西...

IPv6 Excuse Bingo (IPv6 理由伯賓果?)

在「AWS IPv4 Estate Now Worth $4.5B (toonk.io)」這邊的討論意外的看到「IPv6 Excuse Bingo」這個網站...

這個 bingo 是動態的,每次 reload 都會有不同的版本出來,理由超多...

不過實際用 IPv6 network 後會發現各種鳥問題真的多,之前 (到現在) 最經典的就是 HECogent 的 IPv6 network 因為錢的問題談不攏而不通的問題,在維基百科上面就有提到從 2009 年開始就不通了:

There is a long-running dispute between the provider Cogent Communications and Hurricane Electric. Cogent has been refusing to peer settlement-free with Hurricane Electric since 2009.

所以夠大的服務如果要弄 IPv6 都得注意到這點,像是 CDN 或是 GeoIP-based load balancer 就不能把 Cogent 的用戶導到 HE 的位置上面,反之亦然。

不過話說 AWS 手上有 128M 個 IPv4 address,整個 IPv4 address 的空間也才 4.29B,也就是說光 AWS 手上就有大約 3% 的 IPv4 address 空間,如果扣掉不可用的區段的話就更高了...

EC2-Classic 完全退役

Amazon 家的老大 (CTO & VP) Werner Vogels 貼了關於 EC2-Classic 完全退役的文章:「Farewell EC2-Classic, it’s been swell」。

2021 年的時候 AWS 的 Jeff Barr 宣布了 EC2-Classic 的退役計畫:「EC2-Classic Networking is Retiring – Here’s How to Prepare」,我當時也整理了「AWS 宣佈 EC2-Classic 退役的計畫」這篇。

當時的時間表是期望在 2022/08/15 全部退役:

On August 15, 2022 we expect all migrations to be complete, with no remaining EC2-Classic resources present in any AWS account.

但後來還是晚了整整一年,到 2023/08/15 (剛好晚了一年) 才全部退役:

On August 15, 2023, we shut down the last instance of Classic.

而公告上面的更新則是在 2023/08/23 更新:

Update (August 23, 2023) – The retirement announced in this blog post is now complete. There are no more EC2 instances running with EC2-Classic networking.

因為真的太久沒用了,看了 Werner Vogels 的描述才能回想起來當時的架構,似乎是有一大鍋這件事情,靠 security group 拆開大家:

When we launched EC2 in 2006, it was one giant network of 10.2.0.0/8. All instances ran on a single, flat network shared with other customers. It exposed a handful of features, like security groups and Public IP addresses that were assigned when an instance was spun up.

順便提一下 Werner Vogels 文章的開頭提到了 AWS 很少將服務退役,即使是 2007 年推出的 Amazon SimpleDB 也還是繼續在跑,即使現在主推的是 DynamoDB

Retiring services isn’t something we do at AWS. It’s quite rare. Companies rely on our offerings – their businesses literally live on these services – and it’s something that we take seriously. For example SimpleDB is still around, even though DynamoDB is the “NoSQL” DB of choice for our customers.

用「List of AWS Services Available by Region」這頁查了一下,SimpleDB 的區域意外的還不少,在 us-east-1us-west-1us-west-2ap-southeast-1ap-southeast-2ap-northeast-1eu-west-1 以及 sa-east-1

不過話說這開頭是不是在偷臭隔壁棚 XDDD