看 Hacker News 上面文章變化的網站:Quality News

在『蒐集 Hacker News 上被不合理條件「消失」的連結』這篇看到的網站,定時去抓 Hacker News 上新的連結與首頁上的連結後整理出來的網站:「Quality News」。

像是現在在 top 1 的「John Walker, founder of Autodesk, has died (fourmilab.ch)」這篇,對應的資訊頁就在「Hacker News Story Stats: John Walker, founder of Autodesk, has died」這邊,抓個截圖:

這樣有些文章覺得奇怪時可以交叉看一下資料,就會對 Hacker News 上面的「操作」有些「感覺」了...

搜尋引擎的替代方案清單

看到「A look at search engines with their own indexes」這篇在介紹各個搜尋引擎,作者設計了一套方法測試,另外在文章裡面也給了很多主觀的意見,算是很有參考價值的,可以試看看裡面提出來的建議。

另外在 Hacker News 上也有討論可以參考:「A look at search engines with their own indexes (2021) (seirdy.one)」。

在文章開頭的「General indexing search-engines」這個章節,先列出三大搜尋引擎 GBY (GoogleBingYandex),以及使用這三家當作後端資料庫的搜尋引擎,可以看到到處都是 Bing 的影子。

接著作者推薦 Mojeek 這個作為 GBY 的替代方案:

Mojeek: Seems privacy-oriented with a large index containing billions of pages. Quality isn’t at GBY’s level, but it’s not bad either. If I had to use Mojeek as my default general search engine, I’d live. Partially powers eTools.ch. At this moment, I think that Mojeek is the best alternative to GBY for general search.

在「Smaller indexes or less relevant results」這邊也有一些方案,像是這個章節第一個提到的 Right Dao,作者就給他了不錯的評價:

Right Dao: very fast, good results. Passes the tests fairly well. It plans on including query-based ads if/when its user base grows.

接下來的「Smaller indexes, hit-and-miss」與「Unusable engines, irrelevant results」也可以翻一下,看看作者怎麼批評 XD

然後是後面的「Semi-independent indexes」就出現了最近幾個比較有名的,像是 Brave Search 與目前我在用的 Kagi

整理的相當不錯...

看起來這個月 HiNet 連外大概會不怎麼順...

Twitter 上看到這則障礙資料:

APG (Asia Pacific Gateway) 在 2016 年啟用,還算是新的海纜,看起來會有不少頻寬受到影響... 這點在 HiNet 上用 SmokePing 監控對 dynamodb.ap-southeast-1.amazonaws.com 的 packet loss 就很明顯的可以看出來了:

連過去封包掉的亂七八糟的,然後公司做東南亞生意,操作起來苦哈哈... 不過其他 ISP 看起來還行,應該有機會先繞過去。

獨立遊戲創作者推出 Linux 版的好處

標題不知道怎麼下,大概就是這樣...

Hacker News 首頁上翻到的,以這個 upvote 數量來看,應該會收到今天的 Hacker News Daily 上:「Despite having just 5.8% sales, over 38% of bug reports come from Linux (reddit.com)」。

作者是一位獨立遊戲開發者,在兩年前推出了「ΔV: Rings of Saturn」這款遊戲,並且也發佈了 Linux 版。

作者先就數字來比較,他賣出了 12000 套,其中 700 套是 Linux 玩家;另外他收到了 1040 個 bug report,其中大約 400 個是從 Linux 玩家回報的:

As of today, I sold a little over 12,000 units of ΔV in total. 700 of these units were bought by Linux players. That’s 5.8%. I got 1040 bug reports in total, out of which roughly 400 are made by Linux players.

That’s one report per 11.5 users on average, and one report per 1.75 Linux players. That’s right, an average Linux player will get you 650% more bug reports.

看文章時可能會覺得「Linux 玩家真難伺候」,但實際上作者講到,這 400 個 bug 裡面只有 3 個 bug 是平台相關的 (只會發生在 Linux 上),其他的 bug 其實是所有平台都會發生:

A lot of extra work for just 5.8% of extra units, right?

Wrong. Bugs exist whenever you know about them, or not.

Do you know how many of these 400 bug reports were actually platform-specific? 3. Literally only 3 things were problems that came out just on Linux. The rest of them were affecting everyone[.]

原因是 Linux 社群在參與各種 open source project 養成的習慣,會盡可能把各種資訊講清楚,並且找出可以重製問題的方式:

The thing is, the Linux community is exceptionally well trained in reporting bugs. That is just the open-source way. This 5.8% of players found 38% of all the bugs that affected everyone. Just like having your own 700-person strong QA team. That was not 38% extra work for me, that was just free QA!

But that’s not all. The report quality is stellar.

與一般玩家的回報方式完全不同,Linux 玩家很習慣就會附上基本的環境資訊,系統紀錄,甚至有時候還會包括 core dump 與 reproducible steps:

I mean we have all seen bug reports like: “it crashes for me after a few hours”. Do you know what a developer can do with such a report? Feel sorry at best. You can’t really fix any bug unless you can replicate it, see it with your own eyes, peek inside and finally see that it’s fixed.

And with bug reports from Linux players is just something else. You get all the software/os versions, all the logs, you get core dumps and you get replication steps. Sometimes I got with the player over discord and we quickly iterated a few versions with progressive fixes to isolate the problem. You just don’t get that kind of engagement from anyone else.

不知道有沒有遇到回報 GDB 資訊的...

很特別的分享 XDDD

Google 新推出的 Lyra audio codec

Hacker News Daily 上看到「Lyra audio codec enables high-quality voice calls at 3 kbps bitrate」,講 Google 新推出的 Lyra audio codec:「Lyra: A New Very Low-Bitrate Codec for Speech Compression」,論文在「Generative Speech Coding with Predictive Variance Regularization」這邊可以抓到。

目前 Google 提出來的想法是想辦法在 56kbps 的頻寬下實現還堪用的視訊通話:

Pairing Lyra with new video compression technologies, like AV1, will allow video chats to take place, even for users connecting to the internet via a 56kbps dial-in modem.

這次的突破在於可以使用 3kbps 的頻寬傳輸,但清晰度比 Opus 的 6kbps 效果還好不少。

Google 在文章裡面給了兩個 sample,一個是乾淨背景音,另外一個是吵雜的背景音,跟 Opus 與 Speex 比起來都好很多。

論文是說不需要太高的運算力,但沒翻到 GitHub 之類的 source code,先當作參考:

We provide extensive subjective performance evaluations that show that our system based on generative modeling provides state-of-the-art coding performance at 3 kb/s for real-world speech signals at reasonable computational complexity.

AVIF 與 WebP 的懶人包設定?

看到「AVIF and WebP encoding quality settings」這包,看起來是 AVIFWebP 的懶人包設定。

一分鐘版的懶人包設定是基於一般 JPEG 的 quality 設定為 60 時的畫質,與 AVIF 的 50,WebP 的 65 差不多:

If you usually encode JPEGs with quality setting 60, then encode AVIF with quality setting 50 and WebP with quality setting 65. You should expect your AVIF files to be on average 36% smaller and your WebP images 15% smaller than the equivalent JPEG image.

後面給的複雜一點,包括了 JPEG quality 在 50/60/70/80 的情況。

作者用的是 DSSim 判斷圖片壓縮後的品質,看了維基百科裡面的說明,讓我想到 2016 年時 Netflix 公開的 VMAF,針對影片的品質分析:「Netflix 評估影片品質的方法」。

不過沒碰太多這塊的東西,不確定 DDSim 目前是否有被認可... 留下來當作參考。

用 Machine Learning 改善 Streaming 品質的服務與論文

Hacker News 上看到「Puffer」這個服務,裡面利用了 machine learning algorithm 動態調整 bitrate,以提昇傳輸品質。

測試得到的數據後來被整理起來一起放進論文:「Continual learning improves Internet video streaming」。

在開頭介紹了 Fugu 這個演算法:

We describe Fugu, a continual learning algorithm for bitrate selection in streaming video.

而 Puffer 就是實驗網站:

We evaluate Fugu with Puffer, a public website we built that streams live TV using Fugu and existing algorithms. Over a nine-day period in January 2019, Puffer streamed 8,131 hours of video to 3,719 unique users.

這個站台提供了許多真實的頻道進行測試:

Stream live TV in your browser. There's no charge. You can watch U.S. TV stations affiliated with the NBC, CBS, ABC, PBS, FOX, and Univision networks.

可以看到 Fugu 的結果很好,比起其他提出的方案是全面性的改善:

這邊是用 WebSocket 測試,並且配合了不同的 TCP congestion algorithm,沒有太考慮額外的計算成本...

OS X 接藍芽耳機要注意的地方...

Twitter 上看到 OS X 接藍芽耳機時的音質問題:

看了一些討論,看起來除了蘋果自己的耳機外,其他家的藍芽耳機不一定會開 AAC 或是 aptX。雖然現在沒有其他家的藍芽設備,但以後如果買了要注意一下...

也是拿來掃 PHP 程式碼的 PHPStan...

PHPStan 也是 PHP 的靜態分析工具,官方給的 slogan 是「PHP Static Analysis Tool - discover bugs in your code without running it!」。然後官方給了一個 GIF,直接看就大概知道在幹什麼了:

Phan 類似,也是要 PHP 7+ 才能跑,不過實際測試發現不像 Phan 需要 php-ast

PHPStan requires PHP ^gt;= 7.0. You have to run it in environment with PHP 7.x but the actual code does not have to use PHP 7.x features. (Code written for PHP 5.6 and earlier can run on 7.x mostly unmodified.)

PHPStan works best with modern object-oriented code. The more strongly-typed your code is, the more information you give PHPStan to work with.

Properly annotated and typehinted code (class properties, function and method arguments, return types) helps not only static analysis tools but also other people that work with the code to understand it.

拿上一篇「用 Phan 檢查 PHP 程式的正確性」的例子測試,也可以抓到類似的問題:

vendor/bin/phpstan analyse -l 7 src/
 1/1 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%

 ------ --------------------------------------------------------------
  Line   src/Foo.php
 ------ --------------------------------------------------------------
  13     Method Gslin\Foo::g() should return string but returns null.
 ------ --------------------------------------------------------------


 [ERROR] Found 1 error

這樣總算把積壓在 tab 上關於 PHP 工具都寫完了,之後要用才有地方可以翻... XD

用 Phan 檢查 PHP 程式的正確性

Phan 這套也是拿來檢查 PHP 程式用的,也是儘量避免丟出 false alarm。不過 Phan 只能用在 PHP 7+ 環境,原因是使用 php-ast,另外有一些額外建議要裝的套件:

This version (branch) of Phan depends on PHP 7.1.x with the php-ast extension (0.1.5 or newer, uses AST version 50) and supports PHP version 7.1+ syntax. Installation instructions for php-ast can be found here. For PHP 7.0.x use the 0.8 branch. Having PHP's pcntl extension installed is strongly recommended (not available on Windows), in order to support using parallel processes for analysis (or to support daemon mode).

最新版還只能跑在 PHP 7.2 上面,用的時候要注意一下 XD (我在測試時,require-dev 指定 0.11.0,結果被說只有 PHP 7.1 不符合 dependency,後來放 * 讓他去抓適合的版本)

像是這樣的程式碼:

class Foo
{
    /**
     * @param string $p
     * @return string
     */
    function g($p) {
        if (!$p) {
            return null;
        }
        return $p;
    }
}

就會產生出對應的警告訊息:

src/Foo.php:13 PhanTypeMismatchReturn Returning type null but g() is declared to return string

也是掛進 CI 裡面的好東西...