Slashdot 名字的由來

在 20 年紀念文 (懷舊文) 中 Rob Malda 提到了 Slashdot 名字的由來:「A Pre-History of Slashdot」。

I originally used the name ‘slashdot’ on my desktop a year earlier when I got my first static IP in the Voorhees Hall dorm room I shared with Dave. Back in 1996, our floor was the first in all of Hope College to be granted 24/7 high speed internet access.

宿舍機器的名字 XDDD

Million Dollar Homepage 上的網站存活率

作者在分析 2005 年炒熱話題的 Million Dollar Homepage,上面所列的網站的存活比率:「A Million Squandered: The “Million Dollar Homepage” as a Decaying Digital Artifact」。

這是作者在 2017 年抓的截圖:

而這是分析圖:

作者跑程式分析,其中大約一半的 pixel 已經失效:

The 547 unreachable links are attached to graphical elements that collectively take up 342,000 pixels (face value: $342,000). Redirects account for a further 145,000 pixels (face value: $145,000).

不過如果以網站數量來看,則大約還有 63% 活著:

Of the 2,816 links that embedded on the page (accounting for a total of 999,400 pixels), 547 are entirely unreachable at this time. A further 489 redirect to a different domain or to a domain resale portal, leaving 1,780 reachable links.

突然有種「銀河的歷史又翻過了一頁」的感覺 XDDD

Reddit 的 Deploy 機制 (的歷史)

Reddit 主要是用 Python 寫的,這邊介紹了他們歷年來的 Code Deploy 系統:「The Evolution of Code Deploys at Reddit」。

最早期的時候 (2007 到 2010) 是用 rsync 更新程式碼,然後跑個迴圈用 ssh 連進去重跑:

# build the static files and put them on the static server
`make -C /home/reddit/reddit static`
`rsync /home/reddit/reddit/static public:/var/www/`

# iterate through the app servers and update their copy
# of the code, restarting once done.
foreach $h (@hostlist) {
    `git push $h:/home/reddit/reddit master`
    `ssh $h make -C /home/reddit/reddit`
    `ssh $h /bin/restart-reddit.sh`
}

2011 時因為人變多了,用 IRC 把過程丟出來 (okay,我知道你想問的問題... Slack 是 2013 年推出的):

The process for actually doing the deploy looked the same, but now the system did the work for you and told everyone what you were doing.

另外值得一提的是,因為他們不是自己架 IRC server 而是用外面第三方的伺服器,所以他們決定 IRC 只有單向告知的功能:

There was a lot of talk of systems that managed deploys from chat around this time, but since we used third party IRC servers we weren’t able to fully trust the chat room with production control and so it remained a one-way flow of information.

2012 時則是把機器列表放到 DNS 上,某種 service discovery 系統:

First, it fetched its list of hosts from DNS rather than keeping it hard-coded. This allowed us to update the list of hosts without having to remember to update the deploy tool as well — a rudimentary service discovery system.

另外是固定的版本,而非拉 master 下來,這樣可以避免 race condition 的不一致性 (推到一半有人把 code 塞進 master):

Another small but important change was to always deploy a fixed version of the code. The previous version of the tool would update master on a given host, but what if master changed mid-deploy because someone accidentally pushed up code? By deploying a specific git revision instead of branch name, we ensured that the deploy got the same version everywhere in production.

2013 往雲上搬,於是遇到像是「開新機器時剛好在 deploy 會拉到舊的 code」這種 edge case。

What happens if a server is launched while a deploy is ongoing? We had to make sure each newly launched server checked in to get new code if present. What about servers going away mid-deploy? The tool had to be made smarter to detect when the server was gone legitimately rather than there being an issue with the deploy process itself that should be noisily alerted on.

2014 遇到機器數量太多,推一輪要一個小時而被迫要平行化處理:

Over time, the number of servers needed to serve peak traffic grew. This meant that deploys took longer and longer. At its worst, a normal deploy took close to an hour. This was not good.

2015 則是加上 deploy lock,避免同時間有兩個人在 deploy:

Engineers would ask for the deploy lock and either get it or get put in the queue. This helped keep order in deploys and let people relax a bit while waiting for the lock.

2017 的部份則是提到了伺服器的數量:

This new mechanism allows us to deploy to a lot more machines concurrently, and deploy timings are down to 7 minutes for around 800 servers despite the extra waiting for safety.

看起來到現在還是維持手動 deploy,而不是自動化... 這塊還蠻有趣的 :o

SSL/TLS 以及 PKI 的歷史 (加上各種風風雨雨)

Twitter 上看到 Let's Encrypt 轉了這則講 SSL/TLS 與 PKI 的時間線:「SSL/TLS and PKI History」。

這幾年的資料比較完整,看著這些時間線剛好可以拿來複習一下。

而 2013 年 Snowden 的事情也被放進去了,這使得這三年各種 SSL/TLS 化的進展急劇加速 (包括各種 HTTPS 的進展,甚至是郵件的 STARTTLS 加密等等),也因此推動了像是 Let's Encrypt 這樣更方便提供 SSL/TLS certificate 的組織成立。

麻州立法禁止詢問前一份工作的薪資

雖然利用談判技巧是可以避開 (在你有本錢談判的情況下),麻州直接立法禁止了,這對於求職者來說相當重要:「Illegal in Massachusetts: Asking Your Salary in a Job Interview」。

The new law will require hiring managers to state a compensation figure upfront — based on what an applicant’s worth is to the company, rather than on what he or she made in a previous position.

法案是「Bill S.2119」,可以看到「An Act to establish pay equity」的說明,應該是指目標之類的。

裡面的幾個重點,首先是生效日期:

SECTION 7. This act shall take effect on January 1, 2018.

然後是求職期間的禁止行為:

(3) seek the salary history of any prospective employee from any current or former employer; provided, however, that a prospective employee may provide written authorization to a prospective employer to confirm prior wages, including benefits or other compensation or salary history only after any offer of employment with compensation has been made to the prospective employee;

接下來應該會有更多州制定類似的條款...

利用 HSTS 資訊得知網站紀錄的 sniffly

看到「sniffly」這個工具,可以利用 HSTS 資訊檢測逛過哪些網站,程式碼在「diracdeltas/sniffly」這邊可以找到:

Sniffly is an attack that abuses HTTP Strict Transport Security and Content Security Policy to allow arbitrary websites to sniff a user's browsing history. It has been tested in Firefox and Chrome.

測試網站則可以在這邊看到,作者拿 Alexa 上的資料網站來掃,所以熱門網站應該都會被放進去...

主要是利用 HSTS + CSP policy 的 timing attack (有逛過網站而瀏覽器裡有 HSTS 時的 redirect 會比較快,沒有逛過的時候會因為有網路連線而比較慢):

Sniffly sets a CSP policy that restricts images to HTTP, so image sources are blocked before they are redirected to HTTPS. This is crucial! If the browser completes a request to the HTTPS site, then it will receive the HSTS pin, and the attack will no longer work when the user visits Sniffly.

When an image gets blocked by CSP, its onerror handler is called. In this case, the onerror handler does some fancy tricks to time how long it took for the image to be redirected from HTTP to HTTPS. If this time is on the order of a millisecond, it was an HSTS redirect (no network request was made), which means the user has visited the image's domain before. If it's on the order of 100 milliseconds, then a network request probably occurred, meaning that the user hasn't visited the image's domain.

由於這個技巧,HTTPS Everywhere 必須關閉才會比較準確。

Facebook 更新 iOS 應用程式,修正吃電問題

在「在 iOS 上不使用 Facebook App 時要完全砍掉 process」這邊提到了 Facebook 在 iOS 版的應用程式會在背景播放無聲音樂,導致吃電特別兇的問題,Facebook 的 Ari Grant 出來澄清是 bug 造成的,而非故意行為。

修正了兩個 bug,第一個是 network code 的部分:

The first issue we found was a “CPU spin” in our network code. A CPU spin is like a child in a car asking, “Are we there yet? Are we there yet? Are we there yet?”with the question not resulting in any progress to reaching the destination. This repeated processing causes our app to use more battery than intended. The version released today has some improvements that should start making this better.

第二個則是之前提到無聲 audio 的問題:

The second issue is with how we manage audio sessions. If you leave the Facebook app after watching a video, the audio session sometimes stays open as if the app was playing audio silently. This is similar to when you close a music app and want to keep listening to the music while you do other things, except in this case it was unintentional and nothing kept playing. The app isn't actually doing anything while awake in the background, but it does use more battery simply by being awake. Our fixes will solve this audio issue and remove background audio completely.

同時澄清並沒有要在背景更新取得地理位置資訊:

The issues we have found are not caused by the optional Location History feature in the Facebook app or anything related to location. If you haven't opted into this feature by setting Location Access to Always and enabling Location History inside the app, then we aren't accessing your device's location in the background. The issues described above don't change this at all.

理論上新版應該會省一點電了?

EC2 Instance 的發展紀錄

Jeff Barr 被問到有沒有 Amazon EC2 每種 instance 的發表時間,結果連他手上都沒有完整的資料... 這是他目前整理出來的資訊:「EC2 Instance History」。

Update:畢竟還是 AWS 的高層,應該是從內部抓出資料後再去找什麼時候公佈的,目前的表格補齊了。

其中 m1.small 在的 2006 年八月是歷史性的一刻,Amazon EC2 的發表,雲端服務的里程碑:「Amazon EC2 Beta」。