SendGrid 意外的被幹翻...

看到 Hacker News 上的「Ask HN: Great tools for solo SaaS founders?」這則,在討論有哪些服務好用的,有人提到了 SendGrid 做為 email 發送服務,結果沒想到下面一堆人幹翻 XDDD

蠻多人推薦 Postmark 的,另外有人提到 SparkPost

另外可以看一下「Hacker News Tools of the Trade」,之前要找工具都會往這邊翻翻...

SQLite 3.37.0 以及 STRICT table 的設計

Hacker News 首頁上看到「SQLite Release 3.37.0 (sqlite.org)」,原文在「SQLite Release 3.37.0 On 2021-11-27」這邊。

這個版本引入了 STRICT Tables,先前在「SQLite 目前在規劃的 Strict Table,以及我從來不知道原來可以這樣惡搞...」這邊有提過。

官方給出來的範例是這樣,如果沒有要求 STRICT 的話,可以看到各種變化:

CREATE TABLE t1(a ANY);
INSERT INTO t1 VALUES('000123');
SELECT typeof(a), quote(a) FROM t1;
-- result: integer 123

加上 STRICT 後就會與「預期」的結果比較接近:

CREATE TABLE t1(a ANY) STRICT;
INSERT INTO t1 VALUES('000123');
SELECT typeof(a), quote(a) FROM t1;
-- result: text '000123'

對於希望 database 在處理資料嚴謹一點的人來說,應該是個不錯的新功能,但畢竟不是預設值,對於剛跨進來用的人應該還是有中獎機會 XD

把 YouTube 的 Dislike 數字弄回來

最近 YouTube 也在搞事,把 Dislike 的數字拔掉了,後來在 Greasy Fork 上面找了一下,看到有兩套方法可以把數字補回來。

第一套是「Return YouTube Dislike」這個方法,從程式碼裡面可以看到是透過 API 拉出來的:

function setState() {
  cLog('Fetching votes...');
 
  doXHR({
    method: "GET",
    responseType: "json",
    url:
      "https://return-youtube-dislike-api.azurewebsites.net/votes?videoId=" +
      getVideoId(),
    onload: function (xhr) {
      if (xhr != undefined) {
        const { dislikes, likes } = xhr.response;
        cLog(`Received count: ${dislikes}`);
        setDislikes(numberFormat(dislikes));
        createRateBar(likes, dislikes);
      }
    },
  });
}

這個 API 後面應該是接 Videos: getRating 拉資料出來,但畢竟不是直接打 YouTube API (比較麻煩,需要每個使用者自己申請 API token),這樣就有隱私的疑慮了...

另外一套是「Show Youtube Dislike Count」,看了裡面程式碼發現他是用 averageRating 反推回來:

if (likeCount >= 0) {
    const r = data.playerResponse.videoDetails.averageRating;
    const dislikeCount = Math.round(likeCount * (5 - r) / (r - 1));

    ShowDislikes(likeCount, dislikeCount);
}

不過作者有點偷懶,這邊在等待頁面生成單純用 100ms 等頁面出現,有時候還是會有 race condition (就是後面還是讀不到 XDDD),如果懶的大修的話可以改成 1000ms 混過去,降低一些機率:

while (!isLoaded) {
    await Sleep(100);
}

另外數字很大的時候會稍微不準,但也算夠用了,先暫時用這套來頂著了...

Amazon RDS 支援 readonly instance 當作 Multi AZ 的機器了

從來沒在用 RDS 的 Multi AZ,所以根本沒注意到居然沒這個功能:「New Multi-AZ deployment option for Amazon RDS for PostgreSQL and for MySQL; increased read capacity, lower and more consistent write transaction latency, and shorter failover time (Preview)」。

看起來 (加上印象中) 之前的 Multi AZ 是另外一台機器先開著但不能用:

In the case of an infrastructure failure, Amazon RDS performs an automatic failover to the standby, so that database operations resume as soon as the failover is complete.

現在則是開著的機器可以跑 readonly 模式:

The standby DB instances act as automatic failover targets and can also serve read traffic to increase throughput without needing to attach additional read replica DB instances.

這樣做除了省成本外,另外因為這些 instance 平常就有 query 的量,當真的遇到 failover 切換時,warmup 的時間也會短很多 (尤其是服務夠大的時候)。

不過有些限制,首先看起來只支援 Graviton2 (ARM-based) 的機種?

The readable standby option for Amazon RDS Multi-AZ deployments works with AWS Graviton2 R6gd and M6gd DB instances (with NVMe-based SSD instance storage) and Provisioned IOPS Database Storage.

然後是支援的區域:

The Preview is available in the US East (N. Virginia), US West (Oregon), and Europe (Ireland) regions.

以及夠新的版本,MySQL 8 與 PostgreSQL 13.4 才有提供:

Amazon RDS for MySQL supports the Multi-AZ readable standby option for MySQL version 8.0.26. Amazon RDS for PostgreSQL supports the Multi-AZ readable standby option for PostgreSQL version 13.4.

但看起來還不錯,畢竟這比較接近以前在地端機房時的作法...

來看 Intel + Varnish 的單機 500Gbps 的 PR 新聞稿

在「Varnish Software Achieves 500Gbps Throughput Per Server for UHD Video Content」這邊看到 PR 稿,由 IntelVarnish 合作,宣稱達到單機 500Gbps 的 throughput 了:

According to Varnish Software, the following were the outcomes of the test:

  • 509.7 Gbps live-linear throughput, using a dual-processor configuration
  • 487.2 Gbps video-on-demand throughput, using a dual-processor configuration

白皮書在「Delivering up to 500 Gbps Throughput for Next-Gen CDNs」這頁可以用個資交換下載,不過用搜尋引擎找一下可以發現 Intel 那邊有放出 PDF (但不確定兩邊給的是不是同一份):「Delivering up to 500 Gbps Throughput for Next-Gen CDNs」。

單 CPU 的伺服器是四個 100Gbps 界面接出來,雙 CPU 的伺服器是八個 (這邊 SUT 是 system under test 的縮寫):

These client systems were connected to the CDN servers using 100 GbE links through a switch; 4x100 GbE connections for the single-processor SUT, and 8x100 GbE for the dualprocessor SUT. Testing was done using Wrk, a widely recognized open-source HTTP(S) benchmarking tool.

不過如果實際看圖會發現伺服器是兩個 100Gbps (單 CPU) 與四個 100Gbps (雙 CPU),然後 wrk 也吃了兩個或是四個 100Gbps:

在白皮書最後面也有提到測試的配置,都是在 Ubuntu 20.04 上面跑,單 CPU 用的是兩張 Intel 的 100Gbps 網卡,雙 CPU 的用的是四張 Mellanox 的 100Gbps 網卡:

3rd generation Intel Xeon Scalable testing done by Intel in September 2021. Single processor SUT configuration was based on the Supermicro SMC 110P-WTR-TNR single socket server based on Intel® Xeon® Platinum 8380 processor (microcode: 0xd000280) with 40 cores operating at 2.3 GHz. The server featured 256 GB of RAM. Intel® Hyper-Threading Technology was enabled, as was Intel® Turbo Boost Technology 2.0. Platform controller hub was the Intel C620. NUMA balancing was enabled. BIOS version was 1.1. Network connectivity was provided by two 100 GbE Intel® Ethernet Network Adapters E810. 1.2 TB of boot storage was available via an Intel SSD. Application storage totaled 3.84TB per drive and was provided by 8 Intel P5510 SSDs. The operating system was Ubuntu Linux release 20.04 LTS with kernel 5.4.0-80 generic. Compiler GCC was version 9.3.0. The workload was wrk/master (April 17, 2019), and the version of Varnish was varnishplus-6.0.8r3. Openssl v1.1.1h was also used. All traffic from clients to SUT was encrypted via TLS.

3rd generation Intel Xeon Scalable testing done by Intel in September 2021. Dual processor SUT configuration was based on the Supermicro SMC 22OU-TNR dual socket server based on Intel® Xeon® Platinum 8380 processor (microcode: 0xd000280) with 40 cores operating at 2.3 GHz. The server featured 256 GB of RAM. Intel® Hyper-Threading Technology was enabled, as was Intel® Turbo Boost Technology 2.0. Platform controller hub was the Intel C620. NUMA balancing was enabled. BIOS version was 1.1. Network connectivity was provided by four 100 GbE Mellanox MCX516A-CDAT adapters. 1.2 TB of boot storage was available via an Intel SSD. Application storage totaled 3.84TB per drive and was provided by 12 Intel P5510 SSDs. The operating system was Ubuntu Linux release 20.04 LTS with kernel 5.4.0-80- generic. Compiler GCC was version 9.3.0. The workload was wrk/master (April 17, 2019), and the version of Varnish was varnish-plus6.0.8r3. Openssl v1.1.1h was also used. All traffic from clients to SUT was encrypted via TLS.

不過馬上就會滿頭問號,四張 100Gbps 是怎麼跑到 500Gbps 的頻寬...

這份 PR 馬上就讓人想到 Netflix 先前放出來的投影片 (先前有在「Netflix 在單機服務 400Gbps 的影音流量」這篇提到),在 Netflix 的投影片裡面有提到他們在 Intel 平台上面受限於記憶體的頻寬,整台機器只能跑到 230Gbps。

另外一種猜測是,如果 Intel 與 Varnish 宣稱的 500Gbps 是算 switch 上的總流量 (有這樣算的嗎,你是 Juniper 嗎...),那這邊的 500Gbps 換算回去差不多就是減半 (還很客氣的沒把 cache 沒中需要去 origin server 拉資料的流量扣掉),跟 Netflix 在 FreeBSD 上跑出來的結果差不多啊...

坐等反駁 XDDD

Amazon VPC 支援純 IPv6 的網段了

Amazon VPC 支援純 IPv6 的網段了:「Amazon Virtual Private Cloud (VPC) customers can now create IPv6-only subnets and EC2 instances」。

先前機器都還是要設一個 IPv4 位置,所以網段都必須有 IPv4 network space,這次推出使得機器可以跑在 IPv6-only network 上了,不過 Linux 裡面應該還是會有個 lo127.0.0.1...

短時間應該用不到,不過可以先玩看看感覺一下...

QOI 圖片無損壓縮演算法

Hacker News Daily 上看到「Lossless Image Compression in O(n) Time」這篇,作者丟出了一個圖片的無損壓縮演算法,壓縮與解壓縮的速度超快,但壓縮率又不輸 PNG 太多,在 Hacker News 上的討論也可以看一下:「QOI: Lossless Image Compression in O(n) Time (phoboslab.org)」。

裡面有提到在遊戲產業常用到的 stb_image.h

Yes, stb_image saved us all from the pains of dealing with libpng and is therefore used in countless games and apps. A while ago I aimed to do the same for video with pl_mpeg, with some success.

作者的簡介也可以看到他的主業也在遊戲這塊:

My name is Dominic Szablewski. I build games, experiment with JavaScript and occasionally tinker with low-level C.

圖片的無損壓縮與解壓縮算是遊戲創作者蠻常用到的功能,所以他想要看看這塊有沒有機會有更好的工具,於是他就用了四個很簡單的演算法幹完了 QOI (然後發現效果很讚):

  • A run of the previous pixel
  • An index into a previously seen pixel
  • The difference to the previous pixel
  • Full rgba values

其實從 Hacker News 的討論也可以看到這組演算法也常被拿出來在現代的壓縮演算法使用,所以雖然作者自稱不是 compression guy,但他用的演算法其實蠻專業的...

然後挑 single thread 主要是可以避免 threading 的複雜度以及 overhead,在「QOI Benchmark Results」這頁可以看到,無論是什麼類型的檔案,壓縮與解壓縮的速度都相當漂亮,而且壓縮率又沒有差 libpng 太多。

而且作者自己有提到,還沒用到 SIMD 指令集加速,這樣猜測應該還有不少空間...

AWS 流量相關的 Free Tier 增加不少...

Jeff Barr 出來公告增加 AWS 流量相關的 free tier:「AWS Free Tier Data Transfer Expansion – 100 GB From Regions and 1 TB From Amazon CloudFront Per Month」。

一般性的 data transfer 從 1GB/month/region 變成 100GB/mo,現在是 21 regions 所以不會有反例,另外大多數的人或是團隊也就固定用一兩個 region,這個 free tier 大概可以省個 $10 到 $20 左右?

Data Transfer from AWS Regions to the Internet is now free for up to 100 GB of data per month (up from 1 GB per region). This includes Amazon EC2, Amazon S3, Elastic Load Balancing, and so forth. The expansion does not apply to the AWS GovCloud or AWS China Regions.

另外是 CloudFront 的部份,本來只有前 12 個月有 free tier,現在是開放到所有帳號都有,另外從 50GB/month 升到 1TB/month,這個部份的 free tier 就不少了,大概是 $100 到 $200?

Data Transfer from Amazon CloudFront is now free for up to 1 TB of data per month (up from 50 GB), and is no longer limited to the first 12 months after signup. We are also raising the number of free HTTP and HTTPS requests from 2,000,000 to 10,000,000, and removing the 12 month limit on the 2,000,000 free CloudFront Function invocations per month. The expansion does not apply to data transfer from CloudFront PoPs in China.

今年十二月才生效,要注意一下不要現在就用爽爽:

This change is effective December 1, 2021 and takes effect with no effort on your part.

這樣好像可以考慮把 blog 與 wiki 都放上去玩玩看,目前這兩個服務都是用 Cloudflare 的 free tier,HiNet 用戶基本上都是連去 Cloudflare 的美西 PoP,偶而離峰時間會用亞洲的點,但都不會是台灣的 PoP...

不過記得之前 WordPress + CloudFront 有些狀況,再研究看看要怎麼弄好了...

使用 heredoc 語法的 Dockerfile

Simon Willison 這邊看到的,在 Dockerfile 裡面使用 heredoc 語法編 Docker image:「Introduction to heredocs in Dockerfiles」,引用的文章是「Introduction to heredocs in Dockerfiles」與「Introduction to heredocs in Dockerfiles」,七月的事情了。

heredoc 指的是可以讓開發者很方便使用多行結構,在 Dockerfile 這邊常見到的 pattern:

RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y ...

但這樣會產生出很多層 image,所以先前的 best practice 是:

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y ...

而 heredoc 的導入簡化了不少事情,這應該有機會成為新的 best practice:

RUN <<EOF
apt-get update
apt-get upgrade -y
apt-get install -y ...
EOF

要注意的是,開頭要記得加上 #syntax 的宣告,用到 docker/dockerfile:1.3-labs 這組才能使用 heredoc:

# syntax=docker/dockerfile:1.3-labs

然後用 buildkit 去編,用新版的 Docker 已經包 buildkit v0.9.0 進去了:

DOCKER_BUILDKIT=1 docker build .

LLVM 的更換授權進展

Hacker News Daily 上看到「LLVM relicensing update & call for help」這篇,在講 LLVM 計畫從 UIUC licenseMIT license 授權轉成 Apache License 2.0 的進展,在 Hacker News 上的討論「LLVM relicensing update and call for help (llvm.org)」也可以翻一下。

目前的規劃是這樣:

文章開頭還是先花了一些篇幅解釋,這個計畫主要是要處理專利的問題,原先的 developer policy 對於專利的句子太粗糙,會授權過多的權力給 LLVM。這對於一般個人可能影響不大,但對於手上有一卡車專利的公司來說就不太願意了。

另外一個問題是 LLVM 遇到的問題,因為 runtime library 的部份是用 UIUC license + MIT license 授權,但主體是用 UIUC license 授權,這使得主體的程式碼不能隨意搬到 runtime library 裡面:

The run time libraries were dual licensed under the UIUC and MIT license; the rest of the code only under the UIUC license. Therefore, we could not easily move code to run time libraries from other parts. The reason run time libraries were dual licensed was to enable linking to run time library binaries without requiring attribution to LLVM.

因為這些目標,所以新的授權會是 Apache License 2.0 為主,裡面有設計還算合理的專利授權條件,另外大家也算熟悉,再來是針對 object code 以及 GPLv2 設計了例外條款:

As an exception, if, as a result of your compiling your source code, portions of this Software are embedded into an Object form of such source code, you may redistribute such embedded portions in such Object form without complying with the conditions of Sections 4(a), 4(b) and 4(d) of the License.

In addition, if you combine or link compiled forms of this Software with software that is licensed under the GPLv2 ("Combined Software") and if a court of competent jurisdiction determines that the patent provision (Section 3), the indemnity provision (Section 9) or other Section of the License conflicts with the conditions of the GPLv2, you may retroactively and prospectively choose to deem waived or otherwise exclude such Section(s) of the License, but only in their entirety and only with respect to the Combined Software.

在「Long tail of individuals and corporations without a relicensing agreement yet」這邊有目前還沒有同意重新授權的人以及團隊的資料,看起來不會是每個人都願意重新授權,到時候可能還得再挑出來重寫,但有些可以獨立出來的可能可以維持,畢竟 UIUC licesne 與 MIT license 都是 permissive license,只要放到另外一個目錄下,大家知道不是 Apache License 2.0 就還好...