InterNetNews (INN) 在測試 Git

在 mailing list 上看到「Test Git repository for INN」這個消息,看起來 InterNetNews 也在嘗試換到 Git 上了,目前是選擇用 GitHub,在「InterNetNews/inn」這邊。

整個從 CVS 換到 Subversion 再到現在開始進入 Git 的環境了。

沒有全部自己搞的理由也有提到,主要就是免費與方便,然後社群也已經知道 GitHub 是什麼了:

GitHub was chosen because it's (a) free (as in price), (b) widely understood and used already, and (c) easy for me to set up. I know some folks have reservations about GitHub because it's not free software, and I understand, but I don't think we're committing heavily to their platform (everyone who has a clone of the Git repository has all the important data and can move it elsewhere), and other things like GitLab are only open core anyway. Hosting everything on 100% free software (depending on the choice of free software) loses us some useful features (I plan to set up GitHub Actions for CI) and requires more resources that I don't really have to spend on it. That said, everyone is certainly welcome to mirror the Git repository elsewhere if they want.

之後會跑 CI,算是這些年軟體工程必備的工具了...

GitHub Copilot 產生出來程式的安全性問題

看到「Encoding data for POST requests」這篇大家才回頭注意到 GitHub Copilot 首頁的範例本身就有安全漏洞:

async function isPositive(text: string): Promise<boolean> {
  const response = await fetch(``, {
    method: "POST",
    body: `text=${text}`,
    headers: {
      "Content-Type": "application/x-www-form-urlencoded",
  const json = await response.json();
  return json.label === "pos";

其中 text=${text} 是一個 injection 類的漏洞,首頁的範例應該是被挑過的,但仍然出現了這個嚴重的問題,從這邊可以看出 GitHubOpenAI 在這條線上的問題...

GitHub 與 OpenAI 合作推出的 GitHub Copilot

Hacker News 首頁上的第一名看到 GitHubOpenAI 合作推出了 GitHub Copilot,對應的討論可以在「GitHub Copilot: your AI pair programmer (」這邊看到。

GitHub Copilot 會猜測你接下來會想要寫的「完整片段」,像是這樣:

不過 Hacker News 上面的討論有參與 alpha 測試的人的評價,大概 1/10 機率會猜對,即使如此,他還是給了很多有用的資訊 (像是函式與變數的名稱):


I've been using the alpha for the past 2 weeks, and I'm blown away. Copilot guesses the exact code I want to write about one in ten times, and the rest of the time it suggests something rather good, or completely off. But when it guesses right, it feels like it's reading my mind.

It's really like pair programming, even though I'm coding alone. I have a better understanding of my own code, and I tend to give better names and descriptions to my methods. I write better code, documentation, and tests.

Copilot has made me a better programmer. No kidding. This is a huge achievement. Kudos to the GitHub Copilot team!

然後也有人笑稱總算找到理由寫 comment 了:


They finally did it. They finally found a way to make me write comments

反過來的另外一個大問題就是 copyright,這點在目前的問答集沒看到... 在 Hacker News 裡面的討論有提到這點,但目前沒有完整的定論。

目前只支援 VSCode,以後也許會有機會透過 LSP 支援其他的編輯器?

另外我想到 Kite 這個 machine learning 的 auto complete 工具,沒有那麼強大但也還不錯?

GitHub 對 Issues 增加了一些新功能

GitHub 推出了 Issues 的 beta program:「GitHub Issues · Project planning for developers」。

目前列出來的新功能裡,Board 與 Table 呈現方式 (Bored of boards? Switch to tables.),以及 Subticket 的功能 (Break issues into actionable tasks),這兩個算是在 project management 裡面很重要的功能,不過整體還是很陽春,只能說補上了一些重要的元素...

另外這次的 beta program 算是宣示 GitHub 有投入資源在改善 project management 這部份的功能,也許之後也許會有其他新功能繼續推出?

GitHub 支援 SSH 使用 Security Key 了

GitHub 宣佈支援使用 security key 的 SSH key 操作了:「Security keys are now supported for SSH Git operations」。

也就是需要 SSH key + security key 才有辦法認證,只有拿到 SSH key 或是 security key 都是沒有辦法認證過。

目前官方支援 ecdsa-sked25519-sk

Now you can use two additional key types: ecdsa-sk and ed25519-sk, where the “sk” suffix is short for “security key.”

不過在 Ubuntu 20.04 下用預設的系統只能支援 ecdsa-sk,因為 ed25519-sk 會遇到類似「ed25519 problem with libressl」這邊的問題,就算你用的是 OpenSSL

然後生完 key 後在 ~/.ssh/config 裡面指定對 使用這把 key:

    IdentityFile ~/.ssh/id_ecdsa_sk

接下來操作的時候就會需要碰一下 security key 了。

GitHub 宣佈在 上抵制 FLoC

GitHub 的公告簡單明瞭,也不用你操作,直接在 上抵制 FLoC:「GitHub Pages: Permissions-Policy: interest-cohort=() Header added to all pages sites」,在「[Feature request] Set HTTP header to opt out of FLoC in GitHub Pages」這邊有些討論,另外在 Hacker News 上的討論也可以看一下:「GitHub blocks FLoC across all of GitHub Pages (」。

不過不確定為什麼 custom domain 的就不加上去,可能微軟內部的法務團隊討論出來的結果?

All GitHub Pages sites served from the domain will now have a Permissions-Policy: interest-cohort=() header set.

Pages sites using a custom domain will not be impacted.

GitHub 的 API Token 換格式

GitHub 前幾天宣佈更換 API token 的格式:「Authentication token format updates are generally available」,在今年三月初的時候有先公告要換:「Authentication token format updates」。

另外昨天也解釋了換成這樣的優點:「Behind GitHub’s new authentication token formats」。

首先是 token 的字元集合變大了:

The character set changed from [a-f0-9] to [A-Za-z0-9_]

另外是增加了 prefix 直接指出是什麼種類的 token:

The format now includes a prefix for each token type:

  • ghp_ for Personal Access Tokens
  • gho_ for OAuth Access tokens
  • ghu_ for GitHub App user-to-server tokens
  • ghs_ for GitHub App server-to-server tokens
  • ghr_ for GitHub App refresh tokens

另外官方目前先不會改變 token 長度 (透過字元變多增加 entropy),但未來有打算要增加:

The length of our tokens is remaining the same for now. However, GitHub tokens will likely increase in length in future updates, so integrators should plan to support tokens up to 255 characters after June 1, 2021.

看起來當初當作 hex string 而轉成 binary 會有問題,不過就算這樣做應該也是轉的回來的。

回到好處的部份,這個作法跟 SlackStripe 類似,讓開發者或是管理者更容易辨識 token 的類型:

As we see across the industry from companies like Slack and Stripe, token prefixes are a clear way to make tokens identifiable. We are including specific 3 letter prefixes to represent each token, starting with a company signifier, gh, and the first letter of the token type.

另外這也讓 secret scanning 的準確度更高,本來是 40 bytes 的 hex string,有機會撞到程式碼內的 SHA-1 string:

Many of our old authentication token formats are hex-encoded 40 character strings that are indistinguishable from other encoded data like SHA hashes. These have several limitations, such as inefficient or even inaccurate detection of compromised tokens for our secret scanning feature.

另外官方也建議現有的 token 換成新的格式,這樣如果真的發生洩漏,可以透過 secret scanning 偵測並通知:

We strongly encourage you to reset any personal access tokens and OAuth tokens you have. These improvements help secret scanning detection and will help you mitigate any risk to compromised tokens.

PHP 的 Git Server 被打穿,決定把整個 Git 系統搬到 GitHub 上

就如同標題說的:「Changes to Git commit workflow」,Hacker News 上的討論也可以看一下:「PHP's Git server compromised, moving to GitHub (」。

Yesterday (2021-03-28) two malicious commits were pushed to the php-src repo [1] from the names of Rasmus Lerdorf and myself. We don't yet know how exactly this happened, but everything points towards a compromise of the server (rather than a compromise of an individual git account).

While investigation is still underway, we have decided that maintaining our
own git infrastructure is an unnecessary security risk, and that we will
discontinue the server. Instead, the repositories on GitHub,
which were previously only mirrors, will become canonical. This means that
changes should be pushed directly to GitHub rather than to


Eventbrite 的 MySQL 升級計畫

在 2021 年看到 EventbiteMySQL 升級計畫:「MySQL High Availability at Eventbrite」。

看起來是 2019 年年初的時候 MySQL 5.1 出問題,後續決定安排升級,在 2019 年年中把系統升級到 MySQL 5.7 (Percona Server 版本):

Our first major hurdle was to get current with our version of MySQL. In July, 2019 we completed the MySQL 5.1 to MySQL 5.7 (v5.7.19-17-log Percona Server to be precise) upgrade across all MySQL instances.

然後看起來是直接在 EC2 上跑,不過這邊提到的空間問題就不太確定了,是真的把 EBS 的空間上限用完嗎?比較常使用的 gp2gp3 上限都是 16TB,不確定是不是真的用到接近爆掉了:

Not only was support for MySQL 5.1 at End-of-Life (more than 5 years ago) but our MySQL 5.1 instances on EC2/AWS had limited storage and we were scheduled to run out of space at the end of July. Our backs were up against the wall and we had to deliver!

另外在升級到 5.7 的時候,順便把本來是 INT 的 primary key 都換成 BIGINT

As part of the cut-over to MySQL 5.7, we also took the opportunity to bake in a number of improvements. We converted all primary key columns from INT to BIGINT to prevent hitting MAX value.

然後系統因為舊版的 Django 沒辦法配合 MySQL 5.7,得升級到 Django 1.6 (要注意 Django 1 系列的最新版是 1.11,看起來光是升級到 1.6 勉強會動就升不上去了?):

In parallel with the MySQL 5.7 upgrade we also Upgraded Django to 1.6 due a behavioral change in MySQL 5.7 related to how transactions/commits were handled for SELECT statements. This behavior change was resulting in errors with older version of Python/Django running on MySQL 5.7

然後採用了 GitHub 家研發的 gh-ost 當作改變 schema 的工具:

In December 2019, the Eventbrite DBRE successfully implemented a table ALTER via gh-ost on one of our larger MySQL tables.

看起來主要的原因是有遇到 pt-online-schema-change 的限制 (在「GitHub 發展出來的 ALTER TABLE 方式」這邊有提到):

Eventbrite had traditionally used pt-online-schema-change (pt-osc) to ALTER MySQL tables in production. pt-osc uses MySQL triggers to move data from the original to the “duplicate” table which is a very expensive operation and can cause replication lag. Matter of fact, it had directly resulted in several outages in H1 of 2019 due to replication lag or breakage.

另外一個引入的技術是 Orchestrator,看起來是先跟 HAProxy 搭配,不過他們打算要再換到 ProxySQL

Next on the list was implementing improvements to MySQL high availability and automatic failover using Orchestrator. In February of 2020 we implemented a new HAProxy layer in front of all DB clusters and we released Orchestrator to production!

Orchestrator can successfully detect the primary failure and promote a new primary. The goal was to implement Orchestrator with HAProxy first and then eventually move to Orchestrator with ProxySQL.

然後最後題到了 Square 研發的 Shift,把 gh-ost 包裝起來變成有個 web UI 可以操作:

2021 還可以看到這類文章還蠻有趣的...