用 GPT-3 產生 Hacker News 上熱門文章的摘要

看到「Autosummarized HN」這個工具,算是一個組合技的應用:

All summaries have been generated automatically by GPT-3. No responsibility is claimed for their contents nor its accuracy.

透過 GPT-3 解讀並產生出摘要,目前頁面上是沒有 RSS feed,但可以透過一些工具直接拉出來 (像是 PolitePol),然後就可以掛到 Slack 或是 RSS reader 裡面...

Hacker News 前幾天炸很久的 root cause

前幾天 Hacker News 炸了很久,如果是從 Twitter 上的資料來看,是從 2022/07/08 14:08 UTC 這篇:

中間還原失敗 (2022/07/08 17:35 UTC):

到最後恢復 (2022/07/08 20:48 UTC):

Twitter 這邊的資料看起來差不多是六個小時多,以一個應該是只有 database 需要還原的站台來說的確是蠻久的,所以後續在「HN is up again」這邊就有在討論原因,裡面 HN 的老大 dang 也有提到 downtime 是七個小時多:

8 hours of downtime, but not data loss, since there was no data to lose during the downtime.

Last post before we went down (2022-07-08 12:46:04 UTC): https://news.ycombinator.com/item?id=32026565

First post once we were back up (2022-07-08 20:30:55 UTC): https://news.ycombinator.com/item?id=32026571 (hey, that's this thread! how'd you do that, tpmx?)

So, 7h 45m of downtime. What we don't know is how many posts (or votes, etc.) happened after our last backup, and were therefore lost. The latest vote we have was at 2022-07-08 12:46:05 UTC, which is about the same as the last post.

There can't be many lost posts or votes, though, because I checked HN Search (https://hn.algolia.com/) just before we brought HN back up, and their most recent comment and story were behind ours. That means our last backup on the ill-fated server was taken after the last API update (HN Search relies on our API), and the API gets updated every 30 seconds.

I'm not saying that's a rock-solid argument, but it suggests that 30 seconds is an upper bound on how much data we lost.

另外大家就在找 dang 的回應是什麼 (畢竟是第一手資料),用 Ctrl-F 找一下就看到有趣的猜測,從 32028511 這個節點可以看到這串有趣的討論,首先是 mikeiem

You are never going to guess how long the HN SSDs were in the servers... never ever... OK... I'll tell you: 4.5years. I am not even kidding.

然後是 kabdib 的回應:

Let me narrow my guess: They hit 4 years, 206 days and 16 hours . . . or 40,000 hours.

And that they were sold by HP or Dell, and manufactured by SanDisk.

Do I win a prize?

(None of us win prizes on this one).

接著就是 dang 說他覺得這個猜測很有可能:

Wow. It's possible that you have nailed this.

Edit: here's why I like this theory. I don't believe that the two disks had similar levels of wear, because the primary server would get more writes than the standby, and we switched between the two so rarely. The idea that they would have failed within hours of each other because of wear doesn't seem plausible.

But the two servers were set up at the same time, and it's possible that the two SSDs had been manufactured around the same time (same make and model). The idea that they hit the 40,000 hour mark within a few hours of each other seems entirely plausible.

Mike of M5 (mikiem in this thread) told us today that it "smelled like a timing issue" to him, and that is squarely in this territory.

後續他也從自家的 /newest 裡面撈了相關的資料出來,依照他撈出來的關鍵字,看起來是用 HPE 出的 SSD:

It's also an example of the dharma of /newest – the rising and falling away of stories that get no attention:

HPE releases urgent fix to stop enterprise SSDs conking out at 40K hours - https://news.ycombinator.com/item?id=22706968 - March 2020 (0 comments)

HPE SSD flaw will brick hardware after 40k hours - https://news.ycombinator.com/item?id=22697758 - March 2020 (0 comments)

Some HP Enterprise SSD will brick after 40000 hours without update - https://news.ycombinator.com/item?id=22697001 - March 2020 (1 comment)

HPE Warns of New Firmware Flaw That Bricks SSDs After 40k Hours of Use - https://news.ycombinator.com/item?id=22692611 - March 2020 (0 comments)

HPE Warns of New Bug That Kills SSD Drives After 40k Hours - https://news.ycombinator.com/item?id=22680420 - March 2020 (0 comments)

(there's also https://news.ycombinator.com/item?id=32035934, but that was submitted today)

這次 downtime 看起來很像是中了 SSD firmware bug,目前看起來先搬到 EC2 上面了:

$ host news.ycombinator.com
news.ycombinator.com has address 50.112.136.166
$ host 50.112.136.166      
166.136.112.50.in-addr.arpa domain name pointer ec2-50-112-136-166.us-west-2.compute.amazonaws.com.

看討論串應該是暫時性的?

在 Hacker News 上看到幾個 Key-Value Store 軟體

Hacker News 上看到「Redis vs. KeyDB vs. Dragonfly vs. Skytable」這篇,裡面介紹了四套 key-value store 軟體:

  • Redis:這個應該不太需要介紹...
  • KeyDBSnapchat 搞出來的 Redis clone,主要的賣點是 multi-threading。
  • Dragonfly:宣稱地球上最快,但作者跑不出來,下面的討論有人提到 Dragonfly 在更多的 CPU 資源效能就會更好。
  • Skytable:作者測出來最快的。

裡面看起來都蠻有趣的,可以追起來看看發展的情況,但如果真的要的用的話,應該還是先以 Redis 為主,穩定度以及功能還是重點...

BBC 這次拿出短波廣播...

Hacker News Daily 上看到的,BBC 這次戰爭拿出短波廣播發送訊號,讓烏克蘭地區的人,以及一部分俄羅斯的人可以收到 BBC 的新聞:「BBC resurrects WWII-era shortwave broadcasts as Russia blocks news of Ukraine invasion」。

The BBC says its shortwave broadcasts will be available on frequencies of 15735 kHz from 4PM to 6PM and 5875 kHz from 10PM to midnight, Ukraine time. News will be read in English, which the BBC says will be available in Kyiv as well as “parts of Russia.”

主要還是用到短波廣播可以傳很遠,以及難以封鎖的特性,相較於 internet 容易被牆掉所以被拿來用...

另外 BBC 也提供了 Onion 的版本,讓俄羅斯的人可以翻出來看 BBC 的新聞:

The BBC’s current onion domain is: https://www.bbcnewsd73hkzno2ini43t4gblxvycyac5aw4gnv7t2rccijh7745uqd.onion.

不過有 Tor 的話基本上可以直接從 exit node 看,好像沒有必要特別用 Onion 位置...

新世紀福音戰士的字型

沒想到在 Hacker News 首頁上看到第一名居然是這個連結:

2019 年的文章:「Neon Genesis Evangelion」,找資料的時候發現有簡體中文版的翻譯:「末世感叩击:《新世纪福音战士》的文字世界」。

這些字型是由日本的 Fontworks 所開發出來的 Matisse EB,在片尾的 credit 也可以看到「株式会社フォントワークスジャパン」:

主要是沒想到會在 Hacker News 首頁上的第一名看到這個...

SendGrid 意外的被幹翻...

看到 Hacker News 上的「Ask HN: Great tools for solo SaaS founders?」這則,在討論有哪些服務好用的,有人提到了 SendGrid 做為 email 發送服務,結果沒想到下面一堆人幹翻 XDDD

蠻多人推薦 Postmark 的,另外有人提到 SparkPost

另外可以看一下「Hacker News Tools of the Trade」,之前要找工具都會往這邊翻翻...

Hacker News 拿到 hackernews.com 了

Hacker News 上看到「Hackernews.com (hackernews.com)」這則消息,有人注意到 hackernews.com 被指到 news.ycombinator.com...

一開始有人猜測只是第三方指過來:

LeoPanthera 1 day ago

With a different registrar to ycombinator.com, this is likely not owned by Y Combinator, and therefore difficult to trust that it won't start being malicious in the future.

不過後來 dang (Hacker News 的管理員) 有出來證實這個網域名稱目前是在他們旗下了:

dang 1 day ago

It's owned by YC now. We got it earlier this year. That's why it redirects to HN!

以 Hacker News 的性質來說不是太重要,算是有機會拿掉就順便拿下來...

分析 Hacker News 上的討論所給出的書單

看到「HackerNews Readings」這個站,上面說他分析了 Hacker News 上的討論,然後給出書單:

40,000 HackerNews book recommendations identified using NLP and deep learning

點進去目前的預設 category (All Categories) 第一名是 Thinking, Fast and Slow (快思慢想),左邊有拉出 Amazon.com 上的評分,右邊可以看到對這本書的評論。

想要拉一些書來看可以從這邊翻翻看...

多看兩個來源的整理:Ask Hacker News Weekly 與 Lobsters Daily

之前有訂起來的是「Hacker News Daily」,每天會整理 Hacker News 上的熱門資料。後來發現 Colin Percival 有針對 Ask 的部份另外整理出來,一週一次:「Ask Hacker News Weekly」。

另外是 Lobsters 這個站也還是有一些 active user,裡面的東西偏技術,然後也有人用程式整理出「Lobsters Daily」可以訂。

算是多一些東西可以看...

把中央社的 App 裝起來... (來推廣一下好了)

看到之前同事在 Twitter 上提到中央社的標題算是比較正常的,不是 clickbait (誘餌式標題) 走向,就裝起來收通知看一下國內的新聞:

到今天算是用了三四天,目前看到的標題都還不錯,從標題大概就可以抓到重點,不過好像沒看到付費的項目可以贊助 (也許可以提供 app 內去廣告?),寫一篇文章來推廣好了...

另外一個有付費的是 WSJ,USD$5/m,主要是桌機可以直接看內容,不過手機上也是讓 app 推播標題...