Git 2.37.0 對巨大 Monorepo 的加速功能 FSMonitor

這邊用 GitHub 寫的說明好了:「Improve Git monorepo performance with a file system monitor」。

從 2.37.0 開始,Windows 與 Mac 版的使用者可以透過 FSMonitor 的功能記錄檔案系統的變化,大幅減少需要 scan 整個 repository 的時間,可以看到啟用後對於像是 chromium 這種大型專案的 status 時間就大幅下降了:

不過 Linux 還沒支援,目前我的環境都是 Linux,就沒辦法用了...

Git 的「災難處理」

但印象中之前看過 (在 Internet Archive 上可以看到 4 Sep 2017 的版本),但搜尋 Hacker News 後發現沒有提過... 這幾天紅起來的「Dangit, Git!?!」,也有簡體中文版可以看。

裡面其實提到了很多要怎麼處理不小心塞錯資料進 Git 的情況,不過好像還是有些東西沒涵蓋到,像是遇到不小心塞到 credentials 進去後需要清除掉的 git rebase -i HASH,接著一連串的手動修 conflict 與 git rebase --continue,最後再接上 git push --force 這種禁招...

另外推一下「為你自己學 Git」這本書,裡面其實也有提到類似的情境:

第7章:修改歷史紀錄
7.1 狀況題 修改歷史訊息
7.2 狀況題 把多個 Commit 合併成一個Commit
7.3 狀況題 把一個 Commit 拆解成多個Commit
7.4 狀況題 想要在某些 Commit 之間再加新的Commit
7.5 狀況題 想要刪除某幾個 Commit 或是調整Commit 的順序
7.6 Reset、Revert 跟 Rebase 指令有什麼差別?

這本書也有網頁版,在 gitbook.tw 這邊。

在 .gitignore 裡面忽略掉 .gitignore...

Hacker News 上看到「Git ignores .gitignore with .gitignore in .gitignore」這個搞事的功能,可以在 .gitignore 內把 .gitignore 忽略掉 XDDD

這真虧作者想的到這樣的玩法 XDDD

在 Hacker News 上也有看到一些有趣的東西,像是 globally ignore list 之類的:「Git ignores .gitignore with .gitignore in .gitignore (rubenerd.com)」。

用 GitHub Actions 做的監控服務 Upptime

是在 Twitter 上看到這個:

然後翻到 Upptime 這個 open source monitoring 工具,直接是用 GitHub Actions 提供的 schedule (cron job) 每五分鐘跑一次。這邊要注意的是,如果是 public repository 的話不受限制,如果是 private repository 的話會有機會把 quota 吃完:

Billing note: Upptime uses thousands of build minutes every month (approximately 3,000 minutes in the default setting). If you use a public repository, GitHub offers unlimited free build minutes, but if you use a private repository, you'll have to pay for this time.

依照說明是用 GitHub Actions、GitHub IssuesGitHub Pages 三個功能在運作:

  • GitHub Actions is used as an uptime monitor
  • GitHub Issues are used for incident reports
  • GitHub Pages are used for the status website

除了用這三個功能外,另外還是會每天塞一些資料回 git history 裡面:

We also record the response time once per day and commit it to git history. This way, we can graph long-term trends in your websites' response times by going through git commit history. We generate these graphs once every day, also using schedulers.

好像可以玩看看...

Python 的 Black

Hacker News 上看到 Black 這個幫你處理 Python 程式碼的工具:「Black, the uncompromising Python code formatter, is stable (pypi.org)」。

Black is the uncompromising Python code formatter. By using it, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.

然後從 Hacker News 上討論的情況看起來大家都覺得很不錯?好像可以看看能不能拿來用...

另外一個在討論的時候看到學到的東西,是 git blame --ignore-revs-file 這個功能,可以在 git blame 時濾掉某些 commit,剛好拿來過濾 reformatting commit:

Ignore revisions listed in file, which must be in the same format as an fsck.skipList. This option may be repeated, and these files will be processed after any files specified with the blame.ignoreRevsFile config option. An empty file name, "", will clear the list of revs from previously processed files.

Git 的 Diff 演算法

Lobsters Daily 上看到「When to Use Each of the Git Diff Algorithms」這篇 2020 的文章 (Lobsters 這邊常冒出一些舊文章),講 Git 的 diff 演算法。

文章裡面題到了四個演算法,看了一下 git-diff 的 manpage 也還是這四個:

--diff-algorithm={patience|minimal|histogram|myers}

另外看 manpage 時還有看到 --anchored 的用法,看起來是用來提示 Git 產生比較好懂的 diff 結果:

--anchored=<text>

Generate a diff using the "anchored diff" algorithm.

This option may be specified more than once.

If a line exists in both the source and destination, exists only once, and starts with this text, this algorithm attempts to prevent it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally.

回頭翻了一下自己的 config repository,看起來 2017 年年底的時候我就加了「Use histogram diff algorithm.」,然後過幾天換成「Use patience algorithm.」,就一直用到現在...

這次改用 minimal 跑一陣子看看好了...

用 AWK 寫的 Git

前幾天看到的東西,用 AWK 寫的 Git:「Aho: A Git implementation in AWK.」,Hacker News 上的討論「A Git Implementation in Awk (github.com/djanderson)」也可以翻一下。

Aho 這個名字取自 AWK 的作者之一 Alfred Aho (AWK 中的 A),然後查資料才發現他剛拿到去年 2020 的 Turing Award...

作者提到了為什麼會用 AWK 寫 Git,看起來就是個爽字 XDDD:

I've had the irrational desire to write something substantial in AWK for a while. Figured I might as well learn some Git internals while I scratch this itch.

然後他有提到他沒打算把網路相關的功能實做進去:

I don't plan to add network functionality to this (even though you totally can), so no clone or push.

是個有趣的專案,寫爽的 XD

InterNetNews (INN) 在測試 Git

在 mailing list 上看到「Test Git repository for INN」這個消息,看起來 InterNetNews 也在嘗試換到 Git 上了,目前是選擇用 GitHub,在「InterNetNews/inn」這邊。

整個從 CVS 換到 Subversion 再到現在開始進入 Git 的環境了。

沒有全部自己搞的理由也有提到,主要就是免費與方便,然後社群也已經知道 GitHub 是什麼了:

GitHub was chosen because it's (a) free (as in price), (b) widely understood and used already, and (c) easy for me to set up. I know some folks have reservations about GitHub because it's not free software, and I understand, but I don't think we're committing heavily to their platform (everyone who has a clone of the Git repository has all the important data and can move it elsewhere), and other things like GitLab are only open core anyway. Hosting everything on 100% free software (depending on the choice of free software) loses us some useful features (I plan to set up GitHub Actions for CI) and requires more resources that I don't really have to spend on it. That said, everyone is certainly welcome to mirror the Git repository elsewhere if they want.

之後會跑 CI,算是這些年軟體工程必備的工具了...

PHP 的 Git Server 被打穿,決定把整個 Git 系統搬到 GitHub 上

就如同標題說的:「Changes to Git commit workflow」,Hacker News 上的討論也可以看一下:「PHP's Git server compromised, moving to GitHub (php.net)」。

Yesterday (2021-03-28) two malicious commits were pushed to the php-src repo [1] from the names of Rasmus Lerdorf and myself. We don't yet know how exactly this happened, but everything points towards a compromise of the git.php.net server (rather than a compromise of an individual git account).

While investigation is still underway, we have decided that maintaining our
own git infrastructure is an unnecessary security risk, and that we will
discontinue the git.php.net server. Instead, the repositories on GitHub,
which were previously only mirrors, will become canonical. This means that
changes should be pushed directly to GitHub rather than to git.php.net.

不知道發生什麼事情,要等事後的報告出來...

Dolt,本機開發測試用的 MySQL server

看到「Dolt is Git for Data!」這個專案,是個在本機上跑的 MySQL server,另外可以在上面的資料進行版本控制,看起來很適合本機開發測試。

首先抓下來可以看到沒幾個檔案 (這是 linux-amd64 版),也可以看到跟 Git 的關係:

$ tree
.
├── bin
│   ├── dolt
│   ├── git-dolt
│   └── git-dolt-smudge
└── LICENSES

然後用 bin/dolt sql-server -P 3307 -u root -p passw0rd 跑就可以把一個相容於 MySQL 的伺服器跑在 port 3307,然後用 mysql -h 127.0.0.1 --port 3307 -u root -p 就可以輸入密碼 passw0rd 登入進去:

$ mysql -h 127.0.0.1 --port 3307 -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.7.9-Vitess

可以從 Server version 看到專案是用了 Vitess 實做的 MySQL 界面。

另外測了一下,透過連線所做的變更 (像是 CREATE DATABASECREATE TABLE,以及 CRUD 中的 CUD) 是不會寫回磁碟裡的,嘗試了不同的設定,不管改什麼都是這樣,應該是故意設計成這樣。

在本機跑 test case 測試應該還不錯,會比 SQLite:memory: 更接近 MySQL 一些,不過在 CI 裡的話應該是可以直接把 MySQL 跑起來...