Cloud – Page 6 – Gea-Suan Lin's BLOG

Mountpoint for Amazon S3 is now generally available (GA)...

Back in March I mentioned that AWS had built its own FUSE implementation for Amazon S3, in "AWS officially releases its own Amazon S3 FUSE package". Now it has gone GA: "Mountpoint for Amazon S3 – Generally Available and Ready for Production Workloads".

s3fs-fuse still seems to be actively updated, and after poking around I couldn't find a comparison of the two... (perhaps because Mountpoint for Amazon S3 was still in alpha until now?)

I remember this kind of feature being quite handy for stashing things...

Amazon EC2 launches new-generation Intel CPU instances, plus the odd Flex product

Jeff Barr announced a new Intel CPU product line for Amazon EC2: "New Seventh-Generation General Purpose Amazon EC2 Instances (M7i-Flex and M7i)".

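Starting with m7i, which is the easier one to understand: it runs Intel's new CPU, and the only claim made is the rather oblique "15% faster than Intel CPUs at other cloud providers":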

Today we are launching Amazon Elastic Compute Cloud (Amazon EC2) M7i-Flex and M7i instances powered by custom 4th generation Intel Xeon Scalable processors available only on AWS, that offer the best performance among comparable Intel processors in the cloud – up to 15% faster than Intel processors utilized by other cloud providers.

Compared with the previous generation of Intel instances (presumably m6i?), price/performance is 15% better:

The M7i instances are available in nine sizes (with two size of bare metal instances in the works), and offer 15% better price/performance than the previous generation of Intel-powered instances.

The most interesting (though not necessarily useful) part this time is m7i-flex, which claims up to 19% better price/performance than m6i "for many workloads":

M7i-Flex instances are available in the five most common sizes, and are designed to give you up to 19% better price/performance than M6i instances for many workloads.

The post mentions that m7i-flex is m7i with 5% taken off the price:

The M7i-Flex instances are a lower-cost variant of the M7i instances, with 5% better price/performance and 5% lower prices. They are great for applications that don’t fully utilize all compute resources. The M7i-Flex instances deliver a baseline of 40% CPU performance, and can scale up to full CPU performance 95% of the time.

The "General purpose instances" page spells it out more clearly: the time you can run above 40% CPU is limited to 95% of a 24-hour window:

For times when workloads need more performance, M7i-flex instances provide the ability to exceed baseline CPU and deliver up to 100 percent CPU for 95 percent of the time over a 24-hour window.

This design is rather subtle. I'll leave the ARM-based t4g aside for now, since it is at least not a drop-in replacement...

Take m7i.large (2 vCPU + 8GB RAM): in us-east-1 it costs $0.1008/hr, and you can use 100% of its CPU without any restriction.

m7i-flex.large is $0.09576/hr, exactly 5% cheaper; the price you pay is that for 5% of the time you have to stay at or below 40% CPU.

Compared with the t3 family, t3.large also has 2 vCPU + 8GB RAM, but its built-in baseline CPU is 30%, and it costs $0.0832/hr.

From this you can see that most small applications will still end up on t3, or even t3a.

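Only applications whose 24-hour average load is above 40%, but which are not above 40% around the clock (in other words, ones that are probably running on m6a or m6i today) would even consider m7i-flex?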
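To make the trade-off concrete, here is a rough sketch of the arithmetic using only the us-east-1 prices quoted above. It assumes the limit is read as "100% CPU for 95% of a 24-hour window, baseline 40% for the remaining 5%"; the function names are made up for illustration.

```python
HOURS_PER_DAY = 24

# On-demand us-east-1 prices quoted in this post (USD per hour).
prices = {
    "m7i.large": 0.1008,
    "m7i-flex.large": 0.09576,
    "t3.large": 0.0832,
}


def discount_vs(base: str, other: str) -> float:
    """Fractional saving of `other` relative to `base`."""
    return 1 - prices[other] / prices[base]


def flex_cpu_ceiling(baseline: float = 0.40, burst_share: float = 0.95) -> float:
    """Highest possible 24-hour average CPU, assuming 100% CPU for
    `burst_share` of the window and `baseline` for the rest."""
    return burst_share * 1.0 + (1 - burst_share) * baseline


def throttled_minutes(burst_share: float = 0.95) -> float:
    """Minutes per day that must be spent at or below the baseline."""
    return (1 - burst_share) * HOURS_PER_DAY * 60


print(f"m7i-flex vs m7i saving: {discount_vs('m7i.large', 'm7i-flex.large'):.1%}")
print(f"t3 vs m7i saving:       {discount_vs('m7i.large', 't3.large'):.1%}")
print(f"max 24h average CPU:    {flex_cpu_ceiling():.0%}")
print(f"minutes at baseline:    {throttled_minutes():.0f}")
```

Under that reading, the window where m7i-flex makes sense is a 24-hour average somewhere between the 40% baseline and roughly 97%, with about 72 minutes a day spent at or below the baseline.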

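Not to mention that on AWS, m6a is priced 10% below Intel's m6i. Once you do the math, the positioning of this product is subtle: there should be places where it fits, but however you count, there aren't many... It feels a bit like a product AWS made so it has something to show Intel?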

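But following the logic, this is really a billing-based scheme that has little to do with the technology, so if it can be done for Intel, AMD and ARM versions should show up sooner or later?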

It looks like an extension of the t family, but it seems worth waiting a bit to see whether something similar appears on the AMD and ARM product lines?

Amazon SQS raises the FIFO throughput limit

In "Amazon SQS announces increased throughput quota for FIFO High Throughput mode" I saw that AWS raised the FIFO throughput limit in Amazon SQS. This is normally a routine kind of announcement, but what surprised me is that the new limit differs by region:

Amazon Simple Queue Service (SQS) announces an increased quota for a high throughput mode for FIFO queues, allowing you to process up to 9,000 transactions per second, per API action in US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Ireland), Europe (Frankfurt) regions. For Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo) regions, the throughput quota has been increased to 4,500 transactions per second, per API action. For all other regions where SQS is generally available today, the quota for high throughput mode quota has been increased to 2,400 transactions per second.

The first tier (e.g. us-east-1, us-west-2 and eu-west-1) gets 9000 tps, the second tier gets 4500 tps, and regions not listed above get 2400 tps.
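As a quick reference, the three tiers can be written down as a lookup table. This is only a sketch: the announcement lists region names, and the region codes below are my own mapping of those names; `fifo_quota` is a hypothetical helper, not an AWS API.

```python
# High-throughput FIFO quota (transactions per second, per API action),
# as listed in the announcement. Region codes are mapped by hand from the
# region names in the quote.
TIER_1 = {"us-east-2", "us-east-1", "us-west-2", "eu-west-1", "eu-central-1"}
TIER_2 = {"ap-south-1", "ap-southeast-1", "ap-southeast-2", "ap-northeast-1"}


def fifo_quota(region: str) -> int:
    """Return the announced FIFO high-throughput quota for a region code."""
    if region in TIER_1:
        return 9000
    if region in TIER_2:
        return 4500
    # "All other regions where SQS is generally available today"
    return 2400


for region in ("us-east-1", "eu-central-1", "ap-northeast-1", "eu-west-2"):
    print(region, fifo_quota(region))
```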

Another notable thing is that the Frankfurt region is actually in the first tier...

AWS will start charging for public IPv4 addresses

A pretty big change: AWS announced that all public IPv4 addresses will be charged for starting in February 2024: "New – AWS Public IPv4 Address Charge + Public IP Insights".

This includes IPv4 addresses that are attached to EC2:

We are introducing a new charge for public IPv4 addresses. Effective February 1, 2024 there will be a charge of $0.005 per IP per hour for all public IPv4 addresses, whether attached to a service or not (there is already a charge for public IPv4 addresses you allocate in your account but don’t attach to an EC2 instance).

The charge is quite steep: $0.005/hr is already more than the $0.0042/hr of a t4g.nano in us-east-1.
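For a sense of scale, here is the same comparison per month and per year. The two hourly rates come from the text above; the 730-hour month (8760 / 12) is my own assumption for averaging.

```python
HOURS_PER_YEAR = 24 * 365
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # assumption: average month, 730 hours

IPV4_PER_HOUR = 0.005        # new public IPv4 charge (USD)
T4G_NANO_PER_HOUR = 0.0042   # t4g.nano on-demand in us-east-1 (USD)


def monthly(rate_per_hour: float) -> float:
    """Cost of running at `rate_per_hour` for an average month."""
    return rate_per_hour * HOURS_PER_MONTH


ipv4_month = monthly(IPV4_PER_HOUR)
nano_month = monthly(T4G_NANO_PER_HOUR)
ipv4_year = IPV4_PER_HOUR * HOURS_PER_YEAR

print(f"public IPv4: ${ipv4_month:.2f}/mo, ${ipv4_year:.2f}/yr")
print(f"t4g.nano:    ${nano_month:.2f}/mo")
```

So a single address works out to about $3.65 a month, a little more than the instance it might be attached to.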

Another interesting point: Jeff Barr posts to Hacker News himself? You can see it at "AWS Public IPv4 Address Charge and Public IP Insights (amazon.com)".

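Back to the topic: switching to IPv6-only leaves traffic in two directions to deal with. One is connections going out from the machine, the other is connections coming in from outside.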

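For larger services, the outbound side can be handled with something like NAT64 (though that is also expensive if you use the AWS service, see AWS's "DNS64 and NAT64"), or through a socks5 proxy and an http proxy.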

For smaller single machines (such as an EC2 instance used as a VPS), there doesn't seem to be a great solution.

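As for inbound connections, if it is only HTTP(S) you can more or less solve it through a CDN (such as Cloudflare or AWS's own CloudFront), but services like SSH are a bit more troublesome. Getting a cheap static IPv6 address in Taiwan is a bit of a hassle: HiNet's business fixed-IP service does have a matching plan, "HiNet fixed-IP IPv6 service description", but even the lowest 16M/3M tier costs NT$1292/mo (about US$41/mo), so maybe look for a VPS that provides a static IPv6 address?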

It's an annoying change, but it does add a bit more push toward IPv6 in cloud architectures...

New numbers for Amazon S3

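Werner Vogels's blog published a retrospective on Amazon S3 (a guest post by Andy Warfield): "Building and operating a pretty big storage system called S3". It includes some data that someone at that level can more easily get clearance to publish: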

It is labeled "S3 by the numbers (as of publishing this post).", so these are the numbers as of July 2023.

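It still obviously avoids talking about total size, but it does give the current number of S3 objects as 280 trillion, and the request rate as 100 million per second.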

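Combined with previously published numbers (from the Wikipedia "Amazon S3" article), the last announcement was in March 2021, when it passed 100 trillion, so in a bit over two years it has reached 280 trillion: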

Amazon Web Services introduced Amazon S3 in 2006. Amazon reported it stored more than 100 trillion objects as of March 2021, up from 10 billion objects in October 2007, 14 billion objects in January 2008, 29 billion objects in October 2008, 52 billion objects in March 2009, 64 billion objects in August 2009, 102 billion objects in March 2010, and 2 trillion objects in April 2013.
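A back-of-the-envelope annualized growth rate from those two data points, as a rough sketch. It treats "more than 100 trillion" as exactly 100 trillion and pins both figures to the first of the month, so the result is only an estimate; `annual_growth` is a hypothetical helper.

```python
from datetime import date


def annual_growth(start_count: float, end_count: float,
                  start: date, end: date) -> float:
    """Compound annual growth factor between two (date, count) points."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return (end_count / start_count) ** (12 / months)


# 100 trillion objects (March 2021) -> 280 trillion objects (July 2023)
rate = annual_growth(100e12, 280e12, date(2021, 3, 1), date(2023, 7, 1))
print(f"about {rate - 1:.0%} more objects per year")
```

That is a 2.8x increase over 28 months, which works out to roughly 55% growth per year.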

CloudFront supports 3072-bit RSA certificates

Saw the news that CloudFront now supports 3072-bit RSA certificates: "Amazon CloudFront announces support for 3072-bit RSA certificates".

2048 bits is enough in ordinary cases; after all, the current factoring record only reaches 829 bits (see "RSA Factoring Challenge"):

1024-bit RSA keys are equivalent in strength to 80-bit symmetric keys, 2048-bit RSA keys to 112-bit symmetric keys, 3072-bit RSA keys to 128-bit symmetric keys, and 15360-bit RSA keys to 256-bit symmetric keys. In 2003, RSA Security claimed that 1024-bit keys were likely to become crackable some time between 2006 and 2010, while 2048-bit keys are sufficient until 2030. As of 2020 the largest RSA key publicly known to be cracked is RSA-250 with 829 bits.

But if a new algorithm suddenly shows up one day and threatens 2048 bits, this gives a bit more buffer?

Cloud Backed SQLite, built by the SQLite project itself

SQLite has built its own technology for using cloud storage as the storage layer: "Cloud Backed SQLite". The corresponding Hacker News discussion is at "Cloud Backed SQLite (sqlite.org)".

It says it currently supports Azure Blob Storage and Google Cloud Storage. Interestingly, Amazon S3 is not mentioned:

The system currently supports Azure Blob Storage and Google Cloud Storage. It also features an API that may be used to implement support to other cloud storage systems.

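This is different from the earlier sql.js project: the sql.js approach accesses an existing SQLite database file via HTTP range requests, whereas this project changes the underlying architecture to fit the characteristics of cloud environments.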

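Because every access to cloud storage carries high latency (compared with local storage), you want to avoid too much random access and favor sequential access as much as possible, much like the techniques once used for traditional spinning hard disks.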
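A toy cost model shows why this matters. It is only a sketch: the 50 ms round trip, the 100 MB/s bandwidth and the 4 KiB page size are illustrative assumptions, not figures from the Cloud Backed SQLite documentation, and `fetch_seconds` is a hypothetical helper.

```python
def fetch_seconds(requests: int, total_bytes: int,
                  latency_s: float = 0.05,
                  bandwidth_bytes_per_s: float = 100e6) -> float:
    """Rough model: every request pays one round trip, plus transfer time."""
    return requests * latency_s + total_bytes / bandwidth_bytes_per_s


PAGE_SIZE = 4096   # assumed page size in bytes
PAGES = 1000       # number of pages the query needs

# Random access: one request per page.
random_access = fetch_seconds(requests=PAGES, total_bytes=PAGES * PAGE_SIZE)

# Sequential access: the same bytes fetched as one contiguous block.
sequential = fetch_seconds(requests=1, total_bytes=PAGES * PAGE_SIZE)

print(f"random:     {random_access:.2f} s")
print(f"sequential: {sequential:.2f} s")
```

With these made-up numbers the per-request latency dominates completely, which is the same reason seek time used to dominate on spinning disks.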

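Another characteristic is that cloud storage has the notion of multiple files, so the data structures can be designed around that as well.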

A pretty interesting project, and an official one at that...

Rocky Linux proposes two ways to obtain the RHEL source code

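In "AlmaLinux and Rocky Linux both seem to have no solution for now" I noted that there was no good publicly known way to reliably obtain the source code. Since then, Rocky Linux has proposed two ways to obtain the RHEL source code without agreeing to RHEL's terms: "Keeping Open Source Open".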

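There is a side story worth mentioning: after quite a lot of protest from the community, a VP at IBM & Red Hat came out and said directly that they see no value in RHEL rebuilds, and that they are deliberately making RHEL rebuilds harder for rebuilders: "Red Hat’s commitment to open source: A response to the git.centos.org changes".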

Ultimately, we do not find value in a RHEL rebuild and we are not under any obligation to make things easier for rebuilders; this is our call to make.

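Back to the Rocky Linux post: both of the methods they propose rest on an important property of the GPL, namely that if you can legally obtain the binary, the distributor is obligated to provide the source code.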

The first method is through the container images that RHEL currently provides publicly:

One option is through the usage of UBI container images which are based on RHEL and available from multiple online sources (including Docker Hub). Using the UBI image, it is easily possible to obtain Red Hat sources reliably and unencumbered. We have validated this through OCI (Open Container Initiative) containers and it works exactly as expected.

The other is to run RHEL on a cloud instance from a cloud provider:

Another method that we will leverage is pay-per-use public cloud instances. With this, anyone can spin up RHEL images in the cloud and thus obtain the source code for all packages and errata. This is the easiest for us to scale as we can do all of this through CI pipelines, spinning up cloud images to obtain the sources via DNF, and post to our Git repositories automatically.

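Neither method requires agreeing to the TOS and EULA currently on the RHEL website, and both should be hard to block in the short term. Shutting down the former would likely draw immediate complaints from the many existing RHEL customers who use it, and forcing it through anyway would mean giving those customers time to migrate from the public repository to an authenticated one. As for blocking the latter, that would only work if IBM & Red Hat decided to drop the cloud business altogether?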

It looks like Rocky Linux and AlmaLinux can hold out for a while with these methods, until IBM & Red Hat come up with something new?

John Carmack's discussion of why neural networks did not take off in the 1990s...

Saw "Neural networks in the 1990s (twitter.com/id_aa_carmack)" on Hacker News. The original tweet:

It is interesting how many old papers used neural networks with only a dozen or so units. Computers weren’t THAT slow in the 90s — BLAS (basic linear algebra subprograms) was already a thing that vendors hyper-optimized for. Not much overlap between HPC and NN people?…

— John Carmack (@ID_AA_Carmack) June 18, 2023

rm999 on Hacker News described what results looked like back then, which helps explain why neural networks did not take off in the 1990s:

A lot of the problems that did benefit from neural networks in the 90s/early 2000s just needed a non-linear model, but did not need huge neural networks to do well. You can very roughly consider the first layer of a 2-layer neural network to be a series of classifiers, each tackling a different aspect of the problem (e.g. the first neuron of a spam model may activate if you have never received an email from the sender, the second if the sender is tagged as spam a lot, etc). These kinds of problems didn't need deep, large networks, and 10-50 neuron 2-layer networks were often more than enough to fully capture the complexity of the problem. Nowadays many practitioners would throw a GBM at problems like that and can get away with O(100) shallow trees, which isn't very different from what the small neural networks were doing back then.

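In the 1990s the problems being tackled were still relatively simple ones, such as classification (a common application being spam filters), and on those problems the difference between traditional methods and neural networks was small.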

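It was only later, once GPU computing had matured and the cloud concept had taken hold from around 2010, that ordinary organizations no longer had to spend big money building an entire supercomputer of their own. With some OPEX they could conjure up a small supercomputer (for a short time). This gave many organizations enough compute to work with large models (large relative to earlier sizes), and only then did the power of large models on more complex problems become visible.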

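AlphaGo in 2016 was then a success story of neural networks making an impact on the general public (i.e. breaking out of the field), which also made investors more willing to put money into AI.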

AWS's newly announced m7a claims 50% more performance than m6a?

In AWS's "Introducing Amazon EC2 M7a instances (Preview)" I saw the claim that m7a will be 50% faster than m6a:

These instances deliver up to 50% greater performance on average compared to M6a instances.

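It is still in preview and you have to apply for a chance to use it, so its real-world performance is still unknown? I also haven't found the price yet... But if the price doesn't go up too much, a quick calculation suggests it might catch up with the ARM-based m7g?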

This also makes it worth watching for a possible t4a?