## Golomb coding

On Hacker News I saw Golomb coding mentioned in id=40008133 and found it interesting, so I spent some time working out the algorithm and the scenarios it is used in...

Golomb coding is used when the input is a number (or fixed-size data treated as a number) and most of the values tend to be small, as explained in "I don't get Golomb / Rice coding: It does make more bits of the input, or does it?":

they reduce the average length per encoded value compared to fixed-width encoding, if the encoded values are from a large range, but the most common values are generally small (and hence are using only a small fraction of that range most of the time).

The Golomb coding algorithm builds on two special representations: Unary coding and Truncated binary encoding.

Unary coding is very simple: n is represented as n `1` bits plus a trailing `0`, so 3 becomes `1110` and 7 becomes `11111110`.

The main reason unary coding is used here is that it represents the n = 0 case with just a single bit, `0`.
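A minimal sketch of unary coding (illustrative Python, not from the original article):

```python
def unary_encode(n: int) -> str:
    """Represent n as n one-bits followed by a terminating zero."""
    return "1" * n + "0"

print(unary_encode(0))  # -> "0"
print(unary_encode(3))  # -> "1110"
print(unary_encode(7))  # -> "11111110"
```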

Truncated binary encoding (the Chinese Wikipedia happens to have an entry for it:「截斷二進制編碼」) is an interesting encoding scheme. The traditional approach to representing n possible values $(0, 1, ..., n-1)$ needs $\lceil\log_2 {n}\rceil$ bits; for example, $n = 5$ needs 3 bits: `000`, `001`, `010`, `011`, `100`.

Truncated binary encoding instead finds a way to encode them as `00` (for 0), `01` (1), `10` (2), `110` (3), and `111` (4): the values 0 through 2 take only 2 bits each, saving 1 bit.

The point of truncated binary encoding is saving storage space when the number of possibilities is not $2^n$ for some n.
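A sketch of the encoder (illustrative Python; `n` is the number of possible values):

```python
def truncated_binary_encode(x: int, n: int) -> str:
    """Encode x in [0, n) with truncated binary encoding."""
    k = n.bit_length() - 1        # floor(log2(n))
    u = (1 << (k + 1)) - n        # how many values get the short k-bit form
    if x < u:
        return format(x, "b").zfill(k)          # k bits
    return format(x + u, "b").zfill(k + 1)      # k+1 bits

# n = 5: 0 -> "00", 1 -> "01", 2 -> "10", 3 -> "110", 4 -> "111"
print([truncated_binary_encode(x, 5) for x in range(5)])
```

For $n = 5$ this reproduces the table above; when $n$ is a power of two, $u = n$ and every value gets the plain $\log_2 n$-bit form.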

Golomb coding combines the two. Pick a parameter $M$ and split each input $x$ into a quotient $q$ and a remainder $r$:

$x = M * q + r$

where $q$ is stored with unary coding and $r$ with truncated binary encoding over $M$ possible values.
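A minimal sketch of the combined encoder (illustrative Python): the quotient $q$ goes to unary coding and the remainder $r$ to truncated binary encoding; when $M$ is a power of two this reduces to Rice coding.

```python
def golomb_encode(x: int, m: int) -> str:
    """Encode x with parameter m: quotient in unary coding,
    remainder in truncated binary encoding over m values."""
    q, r = divmod(x, m)                 # x = m * q + r
    unary = "1" * q + "0"
    k = m.bit_length() - 1              # floor(log2(m))
    u = (1 << (k + 1)) - m              # count of short k-bit remainder codes
    if r < u:
        remainder = format(r, "b").zfill(k)
    else:
        remainder = format(r + u, "b").zfill(k + 1)
    return unary + remainder

# Small values get short codes (m = 5): 0 -> "000", 9 -> "10111"
print(golomb_encode(0, 5), golomb_encode(9, 5))
```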


## Redis's space-saving HyperLogLog implementation

HyperLogLog (HLL) is a data structure and algorithm that tackles the count-distinct problem statistically: instead of an exact count, it gives an approximation.

The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory; 12k bytes in the worst case, or a lot less if your HyperLogLog (We'll just call them HLL from now) has seen very few elements.

It uses 16384 6-bit registers for a great level of accuracy, using a total of 12k bytes per key.

In the example the sparse representation used just 7 bytes instead of 12k in order to represent the HLL registers. In general for low cardinality there is a big win in terms of space efficiency, traded with CPU time since the sparse representation is slower to access.
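The arithmetic behind the 12k figure: 16384 registers × 6 bits = 98304 bits = 12288 bytes. A toy dense-representation HLL in Python can show the idea (illustrative only; this is not Redis's implementation, the hash choice is my own, and it omits the small-range and bias corrections a real one needs):

```python
import hashlib

def hll_estimate(items, p=14):
    """Toy HyperLogLog: 2**p registers (p=14 gives 16384, as in Redis)."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        # 64-bit hash; the first p bits pick a register
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - p)
        rest = h & ((1 << (64 - p)) - 1)
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)        # bias-correction constant
    return alpha * m * m / sum(2.0 ** -r for r in registers)

# Duplicates don't change the estimate; the answer is approximate,
# with a standard error around 1.04 / sqrt(16384) = 0.81%.
est = hll_estimate(list(range(100000)) * 2)
print(round(est))
```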

## Handling CRDT problems with Automerge

```
(() => {
  const Automerge = require('@automerge/automerge');

  let doc1 = Automerge.init();

  doc1 = Automerge.change(doc1, 'Init', doc => {
    doc.text = new Automerge.Text();
    doc.text.insertAt(0, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '.');
  });

  let doc2 = Automerge.clone(doc1);
  let doc3 = Automerge.clone(doc1);

  doc1 = Automerge.change(doc1, 'Capitalize', doc => {
    doc.text.deleteAt(0);
    doc.text.insertAt(0, 'H');
  });
  doc2 = Automerge.change(doc2, 'world => test', doc => {
    doc.text.deleteAt(7, 5);
    // spread the string so each character becomes its own element
    doc.text.insertAt(7, ...'test');
  });
  doc3 = Automerge.change(doc3, 'world => example', doc => {
    doc.text.deleteAt(7, 5);
    doc.text.insertAt(7, ...'example');
  });

  let finalDoc = Automerge.merge(doc1, doc2);
  finalDoc = Automerge.merge(finalDoc, doc3);
  console.log(finalDoc);
})();
```

```
{
  text: Text {
    elems: [
      'H', 'e', 'l', 'l', 'o',
      ',', ' ', 't', 'e', 's',
      't', 'e', 'x', 'a', 'm',
      'p', 'l', 'e', '.'
    ]
  }
}
```

## Ribbon filter

On the RocksDB blog I saw "Ribbon Filter": starting with 6.15, RocksDB supports Ribbon filters as a replacement for the original Bloom filter mechanism.

The RocksDB wiki describes it as trading CPU usage for memory usage:

A new Bloom filter alternative is available as a drop-in replacement (since version 6.15.0), saving about 30% of Bloom filter space (most importantly, memory) but using about 3-4x as much CPU on filters. Most of the additional CPU time is in the background jobs constructing the filters, and this is usually a good trade because it is common for SST filters to use ~10% of system RAM and well under 1% of CPU.

## Comparing Memcached and Redis

The Memcached part first covers the page/slab/chunk architecture and the performance limits and waste it causes, then the 2020 refactor that improved multi-CPU support (I hadn't followed Memcached news in a while and missed that last year).

Redis instead relies on jemalloc to handle this, plus a background-thread mechanism to reduce fragmentation.

## Using colors to distinguish brackets in code

On Hacker News Daily I saw an article about VSCode making bracket pair colorization more efficient, which reminded me that my Vim doesn't seem to have this feature installed: "Bracket pair colorization 10,000x faster".

VSCode mainly introduced new data structures to cut down the amount of computation; take a look if you're interested:

Efficient bracket pair colorization was a fun challenge. With the new data structures, we can also solve other problems related to bracket pairs more efficiently, such as general bracket matching or showing colored line scopes.
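The VSCode post is about making this incremental with balanced trees; as a point of reference, the naive stack-based pairing that editors traditionally recompute from scratch can be sketched like this (illustrative Python, not VSCode's code):

```python
def bracket_depths(text: str):
    """Pair brackets with a stack; the depth is what would pick the color."""
    closing = {')': '(', ']': '[', '}': '{'}
    stack, result = [], []
    for i, ch in enumerate(text):
        if ch in '([{':
            stack.append(ch)
            result.append((i, len(stack)))       # depth at the open bracket
        elif ch in closing:
            if stack and stack[-1] == closing[ch]:
                result.append((i, len(stack)))   # same depth as its partner
                stack.pop()
            else:
                result.append((i, 0))            # unmatched bracket
    return result

print(bracket_depths("f(a[0], {b})"))
```

This is O(n) per pass but has to rescan the whole document on every edit, which is exactly the cost the new data structures avoid.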

## Akamai also launches EdgeKV, a key-value service

EdgeWorkers lets developers just code — integrating into existing CI/CD workflows and enabling multiple teams to work in parallel using JavaScript. EdgeWorkers eliminates the hassle of managing compute resources and building for scale.

EdgeKV uses what is known in distributed computing as an eventual consistency model to perform writes and updates. This model achieves high availability with low read latency by propagating data writes globally. The period of time it takes the system to distribute data globally is called the “inconsistency window”.

KV achieves this performance by being eventually-consistent. Changes are immediately visible in the edge location at which they're made, but may take up to 60 seconds to propagate to all other edge locations.

## GTA's startup loading performance problem

```
struct {
  uint64_t *hash;
  item_t   *item;
} entry;
```

But before it’s stored? It checks the entire array, one by one, comparing the hash of the item to see if it’s in the list or not. With ~63k entries that’s (n^2+n)/2 = (63000^2+63000)/2 = 1984531500 checks if my math is right. Most of them useless. You have unique hashes why not use a hash map.

- hook strlen
- wait for a long string
- “cache” the start and length of it
- if it’s called again within the string’s range, return cached value

And as for the hash-array problem, it’s more straightforward - just skip the duplicate checks entirely and insert the items directly since we know the values are unique.
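The fix can be sketched in a few lines (illustrative Python, not the game's actual code): replacing the per-insert linear scan with a hash-set membership test turns roughly 2 billion comparisons into ~63k O(1) lookups.

```python
def insert_all_slow(hashes):
    """What the game does: scan the whole array before every insert."""
    entries = []
    for h in hashes:
        if h not in entries:     # O(n) list scan per item -> O(n^2) total
            entries.append(h)
    return entries

def insert_all_fast(hashes):
    """The fix: a hash set makes each duplicate check O(1)."""
    seen, entries = set(), []
    for h in hashes:
        if h not in seen:        # O(1) membership test
            seen.add(h)
            entries.append(h)
    return entries

assert insert_all_slow([1, 2, 2, 3, 1]) == insert_all_fast([1, 2, 2, 3, 1])
```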

JdeBP commented on Hacker News:

I found this while making a collection of what C implementation does what at https://news.ycombinator.com/item?id=26298300.

There are two basic implementation strategies. The BSD (FreeBSD and OpenBSD and more than likely NetBSD too), Microsoft, GNU, and MUSL C libraries use one, and suffer from this; whereas the OpenWatcom, P.J. Plauger, Tru64 Unix, and my standard C libraries use another, and do not.

The 2002 report in the comp.lang.c Usenet newsgroup (listed in that discussion) is the earliest that I've found so far.

## Python 3.7+ guarantees dict ordering

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.
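A quick demonstration (Python 3.7+):

```python
d = {}
for key in ["banana", "apple", "cherry"]:
    d[key] = len(key)

# iteration follows insertion order, not alphabetical or hash order
print(list(d))  # -> ['banana', 'apple', 'cherry']
```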