On the Hacker News discussion of "what if CPU speeds were capped at 1/20 of today's"

This was one of those casual discussion threads on Hacker News, about what software development would look like if CPUs ran at only 1/20 of today's speed: "How might software development have unfolded if CPU speeds were 20x slower?".

It's actually not that hard to imagine. If you take 1/20 of current CPU clock speeds, the ceiling would be roughly 250~300MHz? That's about the Pentium II era, CPUs from 1997, when the mainstream operating system was Windows 95...

There's a lot of discussion in the thread, but this one at id=39977430 stood out:

No electron apps.

Fuck.

Google Groups disconnects from the Usenet system

As mentioned earlier in "Google Groups will disconnect from Usenet on 2024/02/22", Google Groups was scheduled to drop its NNTP peering with Usenet on 2024/02/22, and the day has come...

Google has updated the official banner message on Google Groups:

On my own news server, the difference is also visible in the various incoming-feed metrics in innreport:

Worth watching the volume for a few more months: will the people who were using Google Groups come over to Usenet, or will Usenet just keep shrinking...?

A site for watching how stories move on Hacker News: Quality News

A site I saw in "Collecting links that 'disappeared' from Hacker News under unreasonable conditions": it periodically fetches both the new links and the front-page links on Hacker News and aggregates them: "Quality News".
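
Quality News doesn't document its implementation here, but the basic shape of the idea, periodically snapshotting the front-page ranking, is easy to sketch against the official HN Firebase API. This is just my own minimal take, not their code; the persistence step is left as a comment:

import time
import requests

HN_API = "https://hacker-news.firebaseio.com/v0"

def snapshot():
    """Fetch the current front-page ranking as {story_id: rank}, 1-based."""
    ids = requests.get(f"{HN_API}/topstories.json", timeout=10).json()
    return {story_id: rank for rank, story_id in enumerate(ids, start=1)}

while True:
    ranks = snapshot()
    # Persist (story_id, rank, timestamp) here to build per-story rank history.
    print(f"tracking {len(ranks)} stories")
    time.sleep(60)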

For example, the current #1 story, "John Walker, founder of Autodesk, has died (fourmilab.ch)", has its stats page at "Hacker News Story Stats: John Walker, founder of Autodesk, has died". Here's a screenshot:

So when a story looks fishy, you can cross-check the data here, and you start to get a "feel" for the kind of "manipulation" that happens on Hacker News...

Collecting links that "disappeared" from Hacker News under unreasonable conditions

A project I saw via "Stories removed from the Hacker News Front Page, updated in real time (github.com/vitoplantamura)"; the project itself is on GitHub: "vitoplantamura/HackerNewsRemovals".

The collection criterion assumes that a story that was on the front page (top 30) shouldn't drop out of the top 90 one minute later:

The assumption is that a Story cannot go from the top 30 to a position higher than 90 in a single minute, without having been explicitly removed.
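
That rule is simple enough to sketch. Assuming you already have two one-minute-apart ranking snapshots as {story_id: rank} dicts, something like this would do (the function and parameter names are mine, not the project's):

def detect_removals(prev_ranks, curr_ranks, front_page=30, cutoff=90):
    """Flag stories that were in the top `front_page` a minute ago but are
    now ranked past `cutoff` (or gone entirely); per the project's
    assumption, that shouldn't happen organically within one minute."""
    return [
        story_id
        for story_id, rank in prev_ranks.items()
        if rank <= front_page and curr_ranks.get(story_id, cutoff + 1) > cutoff
    ]

# Story 123 was #5 and is now gone from the top 90, so it gets flagged:
print(detect_removals({123: 5, 456: 29}, {456: 31}))  # [123]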

dang explained in id=39231821 that some of these cases are simply downranked out of the front page:

The first two you listed were downranked by the flamewar detector. The last one was downranked by users. Admins didn't touch any of them.

Honestly, anyone who has hung around Hacker News long enough has a sense of this: these two very buggy mechanisms have to be that buggy to be gameable. You can see it in how certain topics (climate change related ones, for example) suddenly disappear and then get blamed on these two systems...

So Hacker News being called a black box is nothing new; Lobsters next door at least has a Moderation Log you can inspect...

With this, you can now dig up other links worth reading. It should also be possible to turn it into something like Hacker News Daily: compile the removals into a daily blog post people can subscribe to. Maybe I should pitch that to the author?

Hacker News is currently on Cloudflare

Hacker News apparently got attacked a while back and moved onto Cloudflare; a DNS lookup (dig output below) shows:

;; ANSWER SECTION:
news.ycombinator.com.   1       IN      CNAME   news.ycombinator.com.cdn.cloudflare.net.
news.ycombinator.com.cdn.cloudflare.net. 300 IN A 172.67.5.232
news.ycombinator.com.cdn.cloudflare.net. 300 IN A 104.22.6.236
news.ycombinator.com.cdn.cloudflare.net. 300 IN A 104.22.7.236

You can also check the history in "Site report for http://news.ycombinator.com"; I made a backup copy on archive.today: "Site report for http://news.ycombinator.com".

You can see it previously pointed at 209.216.230.240, and then at the start of this year (2024), apparently because of a DDoS, the address got blackholed on the backbone, after which the site moved to Cloudflare.

One somewhat unexpected detail: the report says the server runs FreeBSD. Something to look into once they switch back?

Cloudflare's WAF is prone to false positives on tech sites. "Ask HN: Does Cloudflare block HN comments if you have code blocks in a reply?" ran into exactly this, and in the comments you can watch people working out how to get around the WAF XDDD

It should recover once they switch back, but while it's still on Cloudflare, expect the complaints to keep coming...

A Usenet revival?

Saw the discussion "Usenet, the OG social network, rises again like a text-only phoenix (theregister.com)"; the original article, "USENET, the OG social network, rises again like a text-only phoenix", talks about a Usenet revival?

I think a real Usenet revival would be a tall order... but certain crowds moving there wouldn't surprise me at all.

The main takeaway is the newsgroups mentioned at the end of the article, which look worth checking out:

As a big science fiction reader, this vulture enjoys dipping into rec.arts.sf.written and rec.arts.sf.fandom. The computer history group alt.folklore.computers is still pretty busy. There is life in several retrocomputing channels, and we've been enjoying talking about Acorn RISC OS and Fortran among other things.

Out of personal interest I run my own news server (at newsfeed.hasname.com), hooked it up to a few peers, and set up a BBS site that pulls in some groups, like the classic comp.lang.c... but that's really just me playing around.
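
If you want to poke at a group over NNTP yourself, here's a minimal sketch using Python's nntplib (in the standard library up to 3.12; it was removed in 3.13). I'm pointing it at my own server above just as an example; any NNTP server that lets you read works:

import nntplib  # stdlib up to Python 3.12; removed in 3.13

# Connect and select a group; swap in any server you can reach.
with nntplib.NNTP("newsfeed.hasname.com") as nntp:
    resp, count, first, last, name = nntp.group("comp.lang.c")
    print(f"{name}: {count} articles ({first}..{last})")
    # Pull overview data for the last ten articles and print the subjects.
    resp, overviews = nntp.over((last - 9, last))
    for art_num, over in overviews:
        print(art_num, nntplib.decode_header(over["subject"]))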

Anyway, maybe I'll go check out the groups mentioned above later?

News Minimalist, built by having ChatGPT score news articles

Saw "News Minimalist – Only significant news (newsminimalist.com)": it throws every article from a set of news sites into ChatGPT for scoring, then filters for the highest-scoring ones to produce a newsletter: "News Minimalist".

It uses AI (ChatGPT-4) to read the top 1000 news every day and rank them by significance on a scale from 0 to 10 based on event magnitude, scale, potential, and source credibility.
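
The actual prompt and pipeline aren't public, so purely as a sketch of the idea, scoring a headline 0-10 through the OpenAI chat API could look something like this (the prompt wording and model choice are my own guesses, not the service's):

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def significance(headline: str) -> float:
    """Ask the model for a 0-10 significance score; the prompt is a guess
    based on the criteria quoted above, not News Minimalist's actual one."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Rate the significance of this news from 0 to 10, "
                       "based on event magnitude, scale, potential, and "
                       "source credibility. Reply with just the number.\n\n"
                       + headline,
        }],
    )
    return float(resp.choices[0].message.content.strip())

print(significance("Major earthquake strikes off the coast of Japan"))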

There's also a feed, so besides reading it in an RSS reader, you can hook it into Slack with /feed add https://rss.beehiiv.com/feeds/4aF2pGVAEN.xml.
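
Or, if you'd rather consume it programmatically, something like feedparser handles it fine:

import feedparser  # pip install feedparser

feed = feedparser.parse("https://rss.beehiiv.com/feeds/4aF2pGVAEN.xml")
for entry in feed.entries[:5]:
    print(entry.title, entry.link)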

The whole service looks like it's stitched together from Vercel + beehiiv.

Some background on Hacker News being slow a while back

Saw the discussion at Ask HN: Is Hacker News slow for anyone else?, where dang (the Hacker News moderator) came out to explain in 35157344:

All: our poor server is smoking today* so I've had to reduce the page size of comments. There are 1500+ comments in this thread but if you want to read more than a few dozen you'll need to page through them by clicking the More link at the bottom. I apologize!

Also, if you're cool with read-only access, just log out (edit: or use an incognito tab) and all will be fast again.

* yes, HN still runs on one core, at least the part that serves logged-in requests, and yes this will all get better someday...it kills me that this isn't done yet but one day you will all see

One notable detail: Hacker News is written in Arc (a Lisp dialect), apparently without much attention to optimization, and with Reddit also down that day, activity on Hacker News was certainly running higher than usual...

Using GPT-3 to generate summaries of popular Hacker News stories

Saw the tool "Autosummarized HN", which is a nice combination play:

All summaries have been generated automatically by GPT-3. No responsibility is claimed for their contents nor its accuracy.

It has GPT-3 read the stories and produce the summaries. The page currently has no RSS feed, but you can extract one directly with tools like PolitePol, and then hook it into Slack or an RSS reader...

The root cause of Hacker News being down for so long the other day

Hacker News was down for quite a while the other day. Going by the updates on Twitter, it started with this post at 2022/07/08 14:08 UTC:

A restore attempt failed along the way (2022/07/08 17:35 UTC):

And it finally came back (2022/07/08 20:48 UTC):

From the Twitter timeline it looks like a bit over six hours, which is indeed long for a site where presumably only the database needed restoring. The follow-up discussion of the cause happened in "HN is up again", where dang, the head of HN, also mentioned the downtime was over seven hours:

8 hours of downtime, but not data loss, since there was no data to lose during the downtime.

Last post before we went down (2022-07-08 12:46:04 UTC): https://news.ycombinator.com/item?id=32026565

First post once we were back up (2022-07-08 20:30:55 UTC): https://news.ycombinator.com/item?id=32026571 (hey, that's this thread! how'd you do that, tpmx?)

So, 7h 45m of downtime. What we don't know is how many posts (or votes, etc.) happened after our last backup, and were therefore lost. The latest vote we have was at 2022-07-08 12:46:05 UTC, which is about the same as the last post.

There can't be many lost posts or votes, though, because I checked HN Search (https://hn.algolia.com/) just before we brought HN back up, and their most recent comment and story were behind ours. That means our last backup on the ill-fated server was taken after the last API update (HN Search relies on our API), and the API gets updated every 30 seconds.

I'm not saying that's a rock-solid argument, but it suggests that 30 seconds is an upper bound on how much data we lost.
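
A quick sanity check on the two timestamps he quotes confirms the 7h 45m figure:

from datetime import datetime

down = datetime.fromisoformat("2022-07-08 12:46:04")
up = datetime.fromisoformat("2022-07-08 20:30:55")
print(up - down)  # 7:44:51, i.e. roughly 7h 45m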

Meanwhile, people went hunting for dang's replies (first-hand information, after all). A quick Ctrl-F turns up an interesting guess: starting from node 32028511 you can follow a fun subthread. First, mikeiem:

You are never going to guess how long the HN SSDs were in the servers... never ever... OK... I'll tell you: 4.5years. I am not even kidding.

Then kabdib's reply:

Let me narrow my guess: They hit 4 years, 206 days and 16 hours . . . or 40,000 hours.

And that they were sold by HP or Dell, and manufactured by SanDisk.

Do I win a prize?

(None of us win prizes on this one).
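
The arithmetic behind that oddly specific guess does check out: 4 years, 206 days and 16 hours (ignoring leap days) comes to exactly 40,000 hours:

hours = 4 * 365 * 24 + 206 * 24 + 16  # 35040 + 4944 + 16
print(hours)  # 40000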

And then dang said he thinks this guess may well be right:

Wow. It's possible that you have nailed this.

Edit: here's why I like this theory. I don't believe that the two disks had similar levels of wear, because the primary server would get more writes than the standby, and we switched between the two so rarely. The idea that they would have failed within hours of each other because of wear doesn't seem plausible.

But the two servers were set up at the same time, and it's possible that the two SSDs had been manufactured around the same time (same make and model). The idea that they hit the 40,000 hour mark within a few hours of each other seems entirely plausible.

Mike of M5 (mikiem in this thread) told us today that it "smelled like a timing issue" to him, and that is squarely in this territory.

He later also dug the related stories out of their own /newest; judging by the keywords he searched for, it looks like they were using HPE SSDs:

It's also an example of the dharma of /newest – the rising and falling away of stories that get no attention:

HPE releases urgent fix to stop enterprise SSDs conking out at 40K hours - https://news.ycombinator.com/item?id=22706968 - March 2020 (0 comments)

HPE SSD flaw will brick hardware after 40k hours - https://news.ycombinator.com/item?id=22697758 - March 2020 (0 comments)

Some HP Enterprise SSD will brick after 40000 hours without update - https://news.ycombinator.com/item?id=22697001 - March 2020 (1 comment)

HPE Warns of New Firmware Flaw That Bricks SSDs After 40k Hours of Use - https://news.ycombinator.com/item?id=22692611 - March 2020 (0 comments)

HPE Warns of New Bug That Kills SSD Drives After 40k Hours - https://news.ycombinator.com/item?id=22680420 - March 2020 (0 comments)

(there's also https://news.ycombinator.com/item?id=32035934, but that was submitted today)

This downtime looks very much like they were hit by that SSD firmware bug. For now, the site appears to have moved onto EC2:

$ host news.ycombinator.com
news.ycombinator.com has address 50.112.136.166
$ host 50.112.136.166      
166.136.112.50.in-addr.arpa domain name pointer ec2-50-112-136-166.us-west-2.compute.amazonaws.com.

From the thread, this sounds like a temporary arrangement?