現在的 vm.swappiness

查了一下現在的 vm.swappiness,發現跟以前又有一些差異。在「Documentation for /proc/sys/vm/」這邊可以看到說明。

很久以前應該是 0100,現在變成 0200 了,其中設定成 100 會在比重公式裡讓 memory 與 swap 的計算上有相同的比重:

This control is used to define the rough relative IO cost of swapping and filesystem paging, as a value between 0 and 200. At 100, the VM assumes equal IO cost and will thus apply memory pressure to the page cache and swap-backed pages equally; lower values signify more expensive swap IO, higher values indicates cheaper.

另外是設定 0 時的方式,在不夠用的時候還是會去用:

At 0, the kernel will not initiate swap until the amount of free and file-backed pages is less than the high watermark in a zone.

目前看起來之前建議設成 1 的方式應該是還 OK...

Swap 對 InnoDB 的影響

Percona 的老大拿 5.7 版做實驗,確認 swap 對 InnoDB 的影響:「The Impact of Swapping on MySQL Performance」。

測試的機器是 32GB RAM,作業系統 (以及 swap) 裝在已經有點年紀的 Intel 520 SSD 上,而 MySQL 則是裝在 Intel 750 NVMe 上。透過對 innodb_buffer_pool 的調整來看情況。

可以看到設為 24GB (記憶體 75% 的量) 時很穩定的在 44K QPS 與 3.5ms (95%):

This gives us about 44K QPS. The 95% query response time (reported by sysbench) is about 3.5ms.

而當設成 32GB 的時候開始可以觀察到 swap i/o,掉到 20K QPS 與 9ms (95%):

We can see that performance stabilizes after a bit at around 20K QPS, with some 380MB/sec disk IO and 125MB/sec swap IO. The 95% query response time has grown to around 9ms.

當拉到 48GB 的時候就更掉更多,6K QPS 與 35ms (95%):

Now we have around 6K QPS. Disk IO has dropped to 250MB/sec, and swap IO is up to 190MB/sec. The 95% query response time is around 35ms.

作者發現掉的比率沒有想像中大:

When I started, I expected severe performance drop even with very minor swapping. I surprised myself by getting swap activity to more than 100MB/sec, with performance “only” halved.

這邊測試用的是 SSD,如果是傳統用磁頭的硬碟,對 random access 應該會很敏感而掉更多:

This assumes your swap space is on an SSD, of course! SSDs handle random IO (which is what paging activity usually is) much better than HDDs.

基本上還是要避免碰到 swap 啦,另外 comment 的地方剛好有提到前陣子在猜測的 best practice,測試時的 vm.swappiness 是設成 1,這應該是作者的 best practice:

Swappiness was set to 1 in this case. I was not expecting this to cause significant impact as swapping is caused by genuine (intended) missconfiguration with more memory required than available.

vm.swappiness 設成 1 或是 0 的差異

在「MongoDB System Tuning Best Practices」這份投影片裡面看到:

To avoid disk-based swap: 1 (not zero!)

以及:

‘0’ can cause unpredicted behaviour

在 kernel 的說明文件是這樣描述,設成 0 時表示只有在避免 oom 時才會 swap:

This control is used to define how aggressive the kernel will swap memory pages. Higher values will increase agressiveness, lower values decrease the amount of swap. A value of 0 instructs the kernel not to initiate swap until the amount of free and file-backed pages is less than the high water mark in a zone.

設成 1 的想法頗有趣的,來看看在 MySQL 上是不是也有同樣的情況要注意...