Internet Archive Adds Early Macintosh Software

The Internet Archive now hosts early Macintosh software: "Early Macintosh Emulation Comes to the Archive", covering releases from 1984 to 1989:

The first set of emulated Macintosh software is located in this collection. This is a curated presentation of applications, games, and operating systems from 1984-1989.

By today's standards this is tiny XDDD

While it is a (warning) 40 megabyte download, this compilation of System 7.0.1 includes a large variety of software programs and a rather rich recreation of the MacOS experience of 1991.

UC Berkeley Course Videos to Be Removed from YouTube

I only learned about the takedown plan from "Ask HN: Which Berkeley Courses Should I Archive?", but some people are already working hard to scrape the videos: "UC Berkeley Course Captures".

The official announcement went out at the beginning of the month: "Campus message on Course Capture video, podcast changes", and the reasons it gives still seem odd...

A bit of a shame... :o

Twitter's Enterprise Plan for Historical Data

Twitter announced that every public tweet is now searchable: "Instant and complete access to every historical public Tweet".

This new product builds off of our existing 30-Day search solution and extends the available window of instant and complete Twitter access to a span of more than nine years… and counting.

Search is offered to Gnip customers (Twitter bought the company last year; see "Twitter buys social data provider Gnip, stock soars"):

The Full-Archive Search API will now allow Gnip customers to immediately search for any historical public Tweet — ever.

It looks like a semi-exclusive business:

For more technical information about the Full Archive Search API, you can read our support documentation, and contact the Twitter Data Sales team at data-sales@twitter.com to learn how your business can start using this new historical API today.

Historical Data from Online Black Markets

The post "Black-market archives" offers a very valuable dataset, drawn from Dark Net Markets (DNMs) running as Tor hidden services.

The data covers all kinds of records from 2013 to 2015:

From 2013-2015, I scraped/mirrored on a weekly or daily basis all existing English-language DNMs as part of my research into their usage, lifetimes/characteristics, & legal riskiness; these scrapes covered vendor pages, feedback, images, etc.

It comes to roughly 50GB compressed:

This uniquely comprehensive collection is now publicly released as a 50GB (~1.6TB) collection covering 89 DNMs & 37+ related forums, representing <4,438 mirrors, and is available for any research.

Tor hidden services will probably only become more popular, and this early data will give later researchers plenty of material to analyze...

Fire at the Internet Archive Building...

I saw in "Internet Archive's San Francisco Home Badly Damaged By Fire" that the Internet Archive had a fire, and an official announcement is out as well: "Fire Update: Lost Many Cameras, 20 Boxes, and No People".

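No one was hurt, and all services are running normally. The fire was not at the office but at the scanning center which, judging from the description, is where analog material (paper documents, photo albums, and the like) is scanned into digital form.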

HTTP Archive: A Historical Record of Website Speed

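Steve Souders has launched a new site, HTTP Archive. Much as the Internet Archive records what a website looked like at each point in time, HTTP Archive records a website's HTTP performance at each point in time: "Announcing the HTTP Archive".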

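It currently tracks about 17,000 websites, each analyzed roughly every two weeks, with the sites chosen from Alexa Topsites, Fortune 500 (2010), and Quantcast. I looked up my own sites, and it seems the runs started back in November 2010...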

Because it records so many sites, some interesting numbers are available under "Interesting stats" (with a date selector).