Wikipedia's Vital articles

I saw this on Hacker News Daily: the English Wikipedia maintains a set of lists of its "important" articles, "Wikipedia:Vital articles".

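The list currently has five levels, Level 1 through Level 5, and each level includes the articles of the levels before it:

  • Level 1 has only 10 articles.
  • Level 2 has 100 articles (including the 10 from Level 1, and so on for the later levels).
  • Level 3 has 1000 articles.
  • Level 4 has 10000 articles.
  • Level 5 has 50000 articles.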
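
Since the levels are cumulative, the number of articles each level adds on its own is just the difference between adjacent totals. Here is a quick sketch in Python, assuming the round target figures listed above (the actual lists may not sit exactly at these numbers). The names `LEVEL_TOTALS` and `new_articles_per_level` are made up for illustration:

```python
# Cumulative sizes as listed above; each level includes all earlier levels.
# These are the round target figures, assumed here for illustration.
LEVEL_TOTALS = {1: 10, 2: 100, 3: 1000, 4: 10000, 5: 50000}


def new_articles_per_level(totals):
    """Return how many articles each level adds beyond the previous level."""
    added = {}
    previous = 0
    for level in sorted(totals):
        added[level] = totals[level] - previous
        previous = totals[level]
    return added


added = new_articles_per_level(LEVEL_TOTALS)
for level, count in added.items():
    print(f"Level {level}: {count} new articles")
```

So Level 5 alone contributes 40000 of the 50000 articles, which is most of the list.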

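The first question that came to mind was how these lists are produced. The history is covered in Wikipedia talk:Vital articles/Frequently Asked Questions: David Gerard started the list in 2004, and it later grew into a community effort split into several levels. This also shows that the lists are curated by hand rather than recommended by an algorithm: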

The English Wikipedia Vital Articles list was originally created in August 2004 by David Gerard as an adaptation of the metawiki List of articles every Wikipedia should have. Since then, the Vital Articles list has undergone numerous revisions by multiple editors, and has expanded to include 5 different levels of vitalness.

The selection criterion is "articles that are indispensable for understanding the subject":

A vital article is one considered essential to the subjects listed. For example, it would be difficult to discuss Science without the scientific method, History without World War II, Language without Grammar, Earth science without Geology, or Civics without Democracy. Individuals within the People section represent the pinnacles of their field, such as Albert Einstein in "Inventors and scientists" or William Shakespeare in "Authors". In sections such as those pertaining to People, History or Geography, weight is given to some articles to produce a more diverse, global list.

One use for these lists is getting to know a field. After skimming Level 1 and Level 2, though, they seem too sparse; Level 3 looks like a reasonably good starting point...

GitHub now lets you add tags to repositories

The feature is called topics.

GitHub uses machine learning to suggest topics for public repositories:

Additionally, GitHub uses machine learning to analyze public repository content and generate suggested topics that repository admins can accept or reject. Private repository content is not analyzed and does not receive topic suggestions.