Amazon RDS for PostgreSQL Can Now Have 155 Read Replicas

Saw a new "feature" from AWS that lets Amazon RDS for PostgreSQL scale out to 155 read replicas: "Amazon RDS for PostgreSQL supports cascaded read replicas for up to 30X more read capacity".

The way it works is a three-level architecture, where each instance can have five replicas hanging off it:

Amazon Relational Database Service (Amazon RDS) for PostgreSQL announces support for PostgreSQL 14 with three levels of cascaded read replicas, 5 replicas per instance, supporting a maximum of up to 155 read replicas per source instance.
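Running those limits through quickly: a fan-out of five over three levels is where the 155 comes from, and dividing by the old flat limit of five replicas is where the "up to 30X" figure comes from.

```python
# Sanity check of the quoted limits: three cascade levels, five replicas
# per instance, versus a single flat level of five before.
per_instance = 5
levels = 3

total = sum(per_instance ** level for level in range(1, levels + 1))
print(total)                 # 5 + 25 + 125 = 155
print(total / per_instance)  # 31.0 -> the "up to 30X more read capacity" claim
```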

It requires PostgreSQL 14.1 or later:

Starting with Amazon RDS for PostgreSQL 14.1 and higher, read intensive workloads such as data analytics can now benefit from up to 155 cascaded read replicas that offer up to 30 times higher read capacity versus previous versions of PostgreSQL, thereby reducing the load on source instance.

I had it in my head that Amazon RDS replication was EBS block-level replication, but that's the Multi-AZ standby; read replicas go through PostgreSQL's native streaming replication, which is what makes cascading them like this possible. Still a pretty fun approach XDDD

Brendan Gregg Joins Intel

I mentioned Brendan Gregg leaving Netflix earlier ("Brendan Gregg 離開 Netflix"), and just now saw him announce that he is joining Intel: "Brendan@Intel.com".

It looks like he gets to touch anything performance-related, but the main focus will be on the cloud-related product lines:

I'm thrilled to be joining Intel to work on the performance of everything, apps to metal, with a focus on cloud computing.

No idea what he'll start with; maybe a few months getting up to speed first?

Amazon SESv2's Deliverability Dashboard

It was actually "Amazon SES V2 now supports email size of up to 40MB for inbound and outbound emails by default" that made me notice the email-sending service Amazon SES now has SESv2; the announcement itself is mainly about relaxing the message size limit:

With this launch, the default message size limit in Amazon SES V2 increases from 10MB for email sending and 30MB for email receiving, to 40MB for both sending and receiving.

But when I went over to "Amazon SES pricing" I stumbled onto this rather pricey thing:

The Deliverability Dashboard (via the SES API V2) is available for a fixed price of USD $1,250 per month. This charge includes reputation monitoring for up to five domains and 25 predictive email placement tests.

I then tried to find out what the Deliverability Dashboard actually is, but there doesn't seem to be a dedicated write-up? (Or maybe I'm searching with the wrong keywords...)

There is, however, a 2018 Amazon Pinpoint announcement that mentions a Deliverability Dashboard, also at US$1,250/mo: "Amazon Pinpoint Announces a New Email Deliverability Dashboard to Help Customers Reach their Users' Inboxes".

I assumed the Pinpoint offering had been moved under SESv2, but looking at "Amazon Pinpoint Pricing" it still seems to be there...

Not that I'd really use it, but I'm still thoroughly confused XDDD

Slack's Writeup of Its 2022/02/22 Downtime

Slack has published an explanation of the blow-up from earlier this year: "Slack’s Incident on 2-22-22", but the real highlights are in the Hacker News thread: "Slack’s Incident on 2-22-22 (slack.engineering)".

There are roughly three things worth talking about: first, the cause of the outage; second, when the post first went up a lot of people were annoyed by the "2-22-22" in the title; and third, just now (an hour ago) Cal Henderson (Slack's CTO) showed up in the Hacker News thread to respond...

The downtime itself

This downtime mainly involved Group Direct Messages (GDM):

A significant element of the datastore load appeared to be from a query that listed Group Direct Message (GDM) conversations by user. This operation is fronted by our cache tier, so the high query load seemed to indicate something was wrong with our caches.

The GDM listing query itself isn't very efficient and is normally held up by the cache layer; on February 22 they happened to be upgrading the Consul agents, which drove the cache hit rate down, and that coincided with a larger-than-usual traffic peak, which then crushed the database.

Oh, and Vitess also got dragged into the fight along the way; the original post explains it more clearly, but it takes some time to read.
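To make the failure mode concrete, here is a minimal cache-aside sketch (the dict standing in for Slack's cache tier and the function names are my own placeholders, not their code): as long as the hit rate is high the expensive listing query rarely reaches the database, but every miss falls straight through, so a hit-rate drop during a traffic peak turns directly into database load.

```python
# Minimal cache-aside sketch; a dict stands in for the cache tier and the
# names are made up for illustration only.
cache: dict[str, list[str]] = {}

def list_gdm_conversations_from_db(user_id: str) -> list[str]:
    # Stand-in for the expensive "list GDM conversations by user" query.
    return [f"gdm-{user_id}-{i}" for i in range(3)]

def get_gdm_conversations(user_id: str) -> list[str]:
    key = f"gdm:{user_id}"
    if key in cache:                                # healthy case: served from cache
        return cache[key]
    rows = list_gdm_conversations_from_db(user_id)  # miss: the load lands on the database
    cache[key] = rows
    return rows
```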

2-22-22

When the post first went out, most of the discussion was actually about the "2-22-22" date format, which really is a terrible way to write a date, especially in an official announcement, but date formats have always been flame-war material anyway...

The response from Slack's CTO (Cal Henderson)

While refreshing the page I noticed replies from the account iamcal, and Cal Henderson's (Slack CTO) personal site is www.iamcal.com; I can't be sure it's really him, but the account looks like it was registered back in 2011 and then left untouched...

The account posted two replies. One points out that failures on AWS are actually quite common, so the resilience has to come from your own architecture:

Our underlying hardware (AWS) is nothing like this reliable. We see regular (several times a year) failure of racks of machines or whole DCs.

Across the whole fleet (all services), we lose 1-10 servers per day as a baseline. Major events are then on top of that and can impact thousand of hosts at once.

The other pushes back on someone's armchair estimate of the scale involved:

> Even the largest Slack instance probably has under 100,000 users and less than 1000 peak messages per second.

This is not true, by an order of magnitude.

Seems worth keeping an eye on the thread a while longer to see whether there are more replies...

Zero-Downtime Migration Across Clouds

Saw the thread "Ask HN: Have you ever switched cloud?" about migrating between clouds; vidarh's answer in particular is worth a read...

First, the reasons he gives: it basically all came down to money, moving from one cloud to another and then on to dedicated hosting:

Yes. I once did zero downtime migration first from AWS to Google, then from Google to Hetzner for a client. Mostly for cost reasons: they had a lot of free credits, and moved to Hetzner when they ran out.

Their savings from using the credits were at least 20x what the migrations cost.

He also dumped his whole write-up. First, set up a load-balancer-style service on both ends:

* Set up haproxy, nginx or similar as reverse proxy and carefully decide if you can handle retries on failed queries. If you want true zero-downtime migration there's a challenge here in making sure you have a setup that lets you add and remove backends transparently. There are many ways of doing this of various complexity. I've tended to favour using dynamic dns updates for this; in this specific instance we used Hashicorp's Consul to keep dns updated w/services. I've also used ngx_mruby for instances where I needed more complex backend selection (allows writing Ruby code to execute within nginx)
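As a rough illustration of the "retries on failed queries" decision (this is not his setup; the hostnames and the static backend list are placeholders for whatever Consul-managed DNS would return), the proxy-side logic boils down to something like:

```python
import random

import requests

def current_backends() -> list[str]:
    # In the setup described above this list would come from Consul-updated
    # DNS; a static list stands in for that lookup here.
    return ["https://app.old-cloud.example.com", "https://app.new-cloud.example.com"]

def fetch(path: str, attempts: int = 3) -> requests.Response:
    """Idempotent GET with retry against whichever backends are currently
    registered, so backends can be added and removed transparently."""
    last_error: Exception | None = None
    for _ in range(attempts):
        backend = random.choice(current_backends())
        try:
            resp = requests.get(backend + path, timeout=2)
            if resp.status_code < 500:
                return resp
        except requests.RequestException as exc:
            last_error = exc
    raise RuntimeError(f"all backends failed: {last_error}")
```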

Next, connect the internal networks, which is really just a site-to-site VPN:

* Set up a VPN (or more depending on your networking setup) between the locations so that the reverse proxy can reach backends in both/all locations, and so that the backends can reach databases both places.

Then set up database replication and the machinery around it:

* Replicate the database to the new location.

* Ensure your app has a mechanism for determining which database to use as the master. Just as for the reverse proxy we used Consul to select. All backends would switch on promoting a replica to master.

* Ensure you have a fast method to promote a database replica to a master. You don't want to be in a situation of having to fiddle with this. We had fully automated scripts to do the failover.

Then make sure the application side can switch over cleanly:

* Ensure your app gracefully handles database failure of whatever it thinks the current master is. This is the trickiest bit in some cases, as you either need to make sure updates are idempotent, or you need to make sure updates during the switchover either reliably fail or reliably succeed. In the case I mentioned we were able to safely retry requests, but in many cases it'll be safer to just punt on true zero downtime migration assuming your setup can handle promotion of the new master fast enough (in our case the promotion of the new Postgres master took literally a couple of seconds, during which any failing updates would just translate to some page loads being slow as they retried, but if we hadn't been able to retry it'd have meant a few seconds downtime).
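Sketching the database side of that in Python (my own reading of his description, not his code; the Consul key name and connection details are made up), the application re-resolves the master on every failure and retries idempotent writes, so a promotion that takes a couple of seconds only shows up as a few slow requests:

```python
import time

import psycopg2  # assuming a PostgreSQL app, matching the Postgres promotion he describes
import requests

def current_master() -> str:
    # The answer uses Consul to publish which node is master; this reads a
    # hypothetical key from the local Consul agent's HTTP API.
    resp = requests.get("http://127.0.0.1:8500/v1/kv/db/master?raw=true", timeout=1)
    return resp.text.strip()

def run_idempotent_write(sql: str, params: tuple, retries: int = 5) -> None:
    """Retry an idempotent write across a failover: on connection errors,
    re-resolve the master and try again after a short backoff."""
    for attempt in range(retries):
        try:
            conn = psycopg2.connect(host=current_master(), dbname="app")
            with conn, conn.cursor() as cur:   # commits on success
                cur.execute(sql, params)
            conn.close()
            return
        except psycopg2.OperationalError:
            time.sleep(0.5 * (attempt + 1))    # promotion only takes a few seconds
    raise RuntimeError("master unavailable after retries")
```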

Once you've confirmed the new cloud has enough capacity to take the traffic, it's time to migrate, starting with lowering the DNS TTL:

Once you have the new environment running and capable of handling requests (but using the database in the old environment):

* Reduce DNS record TTL.

Then point the old load balancer at the new backends; if anything goes wrong at this stage you can roll back quickly:

* Ensure the new backends are added to the reverse proxy. You should start seeing requests flow through the new backends and can verify error rates aren't increasing. This should be quick to undo if you see errors.

Next, point DNS at the new load balancer; in theory this shouldn't cause much trouble:

* Update DNS to add the new environment reverse proxy. You should start seeing requests hit the new reverse proxy, and some of it should flow through the new backends. Wait to see if any issues.

Then cut the database over to the new site; if something breaks you can quickly switch back and figure out where the problem is:

* Promote the replica in the new location to master and verify everything still works. Ensure whatever replication you need from the new master works. You should now see all database requests hitting the new master.

The final stage is tearing down the old setup:

* Drain connections from the old backends (remove them from the pool, but leave them running until they're not handling any requests). You should now have all traffic past the reverse proxy going via the new environment.

* Update DNS to remove the old environment reverse proxy. Wait for all traffic to stop hitting the old reverse proxy.

* When you're confident everything is fine, you can disable the old environment and bring DNS TTL back up.

There's really nothing cloud-specific about this approach; anyone who has done a datacenter migration has probably planned something similar, with the same broad strokes (analyzing stateful and stateless services separately). The difference is that without the cloud's elastic rental model you have to provision a lot more hardware up front...

I remember Instagram had a similar plan back when it moved into Facebook's datacenters, which I wrote about before: "Instagram 從 AWS 搬到 Facebook 機房".

In Taiwan the recent example seems to be PChome 24h moving its infrastructure onto GCP? We'll see whether they end up presenting their migration story at a GCP event later on...

The Current Meltdown of Jira's Cloud Services (Still Ongoing...)

Atlassian seems to have blown up its Jira cloud services recently. I was originally going to wait until things settled down before digging into what happened, but then I saw a tweet estimating another two weeks, so I'd better write it down now before I forget...:

"Multiple sites showing down/under maintenance" shows the outage starting around the Qingming holiday in early April, and yesterday's update says they have only restored about 35% of the impacted users:

A small number of Atlassian customers continue to experience service outages and are unable to access their sites. Our global engineering teams are working 24/7 to make progress on this incident. At this time, we have rebuilt functionality for over 35% of the users who are impacted by the service outage, with no reported data loss. The rebuild stage is particularly complex due to several steps that are required to validate sites and verify data. These steps require extra time, but are critical to ensuring the integrity of rebuilt sites. We apologize for the length and severity of this incident and have taken steps to avoid a recurrence in the future.

Posted 19 hours ago. Apr 11, 2022 - 08:27 UTC

So roughly a third restored after a week of being down, which means the official estimate of another two weeks sounds about right? There's also a heated Hacker News thread: "Atlassian products have been down for 4 days (atlassian.com)".

The Register also has a series of articles that reveal more than the official updates do: "Atlassian Jira, Confluence outage persists two days on", "Atlassian outage lingers, sparking data loss fears", "Day 7 of the great Atlassian outage: IT giant still struggling to restore access", "At last, Atlassian sees an end to its outage ... in two weeks".

The first article's subheading hints at the cause:

'Routine maintenance script' blamed for derailed service for unlucky customers

The second mentions that roughly 400 customers are affected:

We were also told that the incident affects a relatively small number of Atlassian customers: about 400. That's only 0.18 per cent of the company's 226,000 customers, which isn't much consolation to the several hundred who still can't access their data.

I'll come back later to see what this so-called routine maintenance script actually was...

AWS Lambda Can Now Have Its Own HTTPS Endpoint

AWS has announced that a Lambda function can now get an HTTPS endpoint directly: "Announcing AWS Lambda Function URLs: Built-in HTTPS Endpoints for Single-Function Microservices".

As the post notes, you previously had to go through API Gateway or an ALB to put Lambda behind an HTTP endpoint:

Each function is mapped to API endpoints, methods, and resources using services such as Amazon API Gateway and Application Load Balancer.

Now you get a hostname like verylongid.lambda-url.us-east-1.on.aws to use, and from the description it looks like this is included in the regular Lambda pricing, so there's no need to set up API Gateway or an ALB separately:

Function URLs are included in Lambda’s request and duration pricing. For example, let’s imagine that you deploy a single Lambda function with 128 MB of memory and an average invocation time of 50 ms. The function receives five million requests every month, so the cost will be $1.00 for the requests, and $0.53 for the duration. The grand total is $1.53 per month, in the US East (N. Virginia) Region.
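Redoing that arithmetic with the published us-east-1 rates ($0.20 per million requests and about $0.0000166667 per GB-second; treat the exact numbers as my assumption) lands in the same place:

```python
requests_per_month = 5_000_000
memory_gb = 128 / 1024   # 128 MB
avg_duration_s = 0.05    # 50 ms

request_cost = requests_per_month / 1_000_000 * 0.20
duration_cost = requests_per_month * avg_duration_s * memory_gb * 0.0000166667

print(round(request_cost, 2))   # 1.0
print(round(duration_cost, 2))  # 0.52, in line with the ~$0.53 in the quote
```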

This reminds me of projects that use Lambda as a purpose-built HTTP proxy; seems like something I could wire into feedgen?
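If you want to poke at it, turning on a function URL is a single API call; here is a boto3 sketch (the function name is a placeholder):

```python
import boto3

lam = boto3.client("lambda")

resp = lam.create_function_url_config(
    FunctionName="my-function",  # placeholder name
    AuthType="AWS_IAM",          # or "NONE" for a publicly reachable endpoint
)
print(resp["FunctionUrl"])       # e.g. https://<id>.lambda-url.us-east-1.on.aws/
```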

An ORM for DynamoDB in PHP (and Laravel)

Saw the article "Laravel DynamoDB Eloquent Models and Query Builder" on Twitter, which covers the "Laravel DynamoDB" package for accessing DynamoDB from PHP (and Laravel).

Although the package name mentions Laravel, the documentation says it also works in non-Laravel PHP environments, so using it standalone is fine; the more important thing is really understanding DynamoDB's various key concepts.
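Since the key model is the part worth internalizing, here is a boto3 sketch (not the PHP package above; the table and attribute names are made up) of the partition key / sort key split and how queries are expressed against them:

```python
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="orders",  # made-up table for illustration
    AttributeDefinitions=[
        {"AttributeName": "user_id", "AttributeType": "S"},
        {"AttributeName": "created_at", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "user_id", "KeyType": "HASH"},      # partition key
        {"AttributeName": "created_at", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)

# Reads are keyed lookups rather than arbitrary SQL:
dynamodb.query(
    TableName="orders",
    KeyConditionExpression="user_id = :u AND created_at >= :t",
    ExpressionAttributeValues={":u": {"S": "u123"}, ":t": {"S": "2022-01-01"}},
)
```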

If I were designing from scratch and didn't want to manage my own database, I'd still prefer starting with RDS, whether the MySQL or PostgreSQL flavor; an RDBMS simply lets you do more and is friendlier to developers, unless you expect day-one traffic so large that even a db.m5.24xlarge couldn't hold it...

AWS Will Automatically Mark EC2 AMIs Older Than Two Years as Deprecated

Per AWS's announcement, EC2 AMIs (Amazon Machine Images) older than two years will be marked as deprecated: "Amazon EC2 now reduces visibility of public Amazon Machine Images (AMIs) older than two years".

Once an AMI is marked deprecated, the main difference shows up in the DescribeImages API: it no longer appears for anyone other than the image's owner:

Once an AMI is deprecated, it will no longer appear in DescribeImages API calls for users that aren’t the owner of the AMI.

But if you know the AMI's ID you can still launch it directly:

Users of a deprecated AMI can continue to launch instances and describe the deprecated AMI using its ID.
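A boto3 sketch of what that looks like in practice (the AMI ID is a placeholder; as far as I can tell DescribeImages grew an IncludeDeprecated flag for opting back in):

```python
import boto3

ec2 = boto3.client("ec2")

# By default, deprecated AMIs owned by someone else no longer show up here.
visible = ec2.describe_images(Owners=["amazon"])

# They can still be listed explicitly on request.
everything = ec2.describe_images(Owners=["amazon"], IncludeDeprecated=True)

# And a known AMI ID can still be described and launched directly.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)
```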

They don't spell out the reason, but it's presumably a security consideration; these days a system that hasn't been updated in two years probably has a pile of holes? Though OpenBSD immediately comes to mind as a possible exception...

Amazon RDS Free Tier Now Includes db.t3.micro and db.t4g.micro

AWS has added both db.t3.micro and db.t4g.micro to the free tier: "Amazon RDS Free Tier now includes db.t3.micro, AWS Graviton2-based db.t4g.micro instances in all commercial regions".

Customers new to AWS in the past 12 months and who were in regions where db.t2.micro was not available can now create free tier db.t3.micro or db.t4g.micro instances for the remainder of their first 12 months.

From the description, the offer is available during your first 12 months after signing up, handy for running some small stuff...
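If you want to spin one up to play with, it's the usual create call; a minimal boto3 sketch (identifier, credentials and storage size are placeholders, and whether it's actually free depends on the account's free tier eligibility):

```python
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="free-tier-test",  # placeholder
    DBInstanceClass="db.t4g.micro",         # the Graviton2-based free tier class
    Engine="postgres",
    AllocatedStorage=20,                    # GB; placeholder
    MasterUsername="postgres",
    MasterUserPassword="change-me-please",  # placeholder
)
```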