Home » Posts tagged "query"

MySQL 版本的 Amazon Aurora 會將各種記錄丟到 CloudWatch Logs 了...

剛好今天才被問是不是可以在 Amazon Aurora (MySQL-Compatible Edition) 裡面翻出有哪些 Slow Query,剛好想到這幾天發表了這個功能:「Amazon Aurora Publishes General, Slow Query and Error Logs to Amazon CloudWatch」。

You can now configure the MySQL-compatible edition of Amazon Aurora to publish general logs, slow query logs, and error logs to Amazon CloudWatch Logs. Previously, you could only publish audit logs.

看起來是要另外開 (畢竟 CloudWatch Logs 不是免費的 XD),不過以這類型的 log 產生速度與數量來說應該還行...

Amazon Aurora (MySQL) 提供 Parallel Query 讓人申請使用

AWS 宣佈了 Amazon Aurora (MySQL) 支援 Parallel Query:「Amazon Aurora Parallel Query is Available for Preview」。

這邊提到的 Parallel Query 比較像是 Amazon Athena,直接把單一 Query 打散到多台機器上跑:

Amazon Aurora Parallel Query improves the performance of large analytic queries by pushing processing down to the Aurora storage layer, spreading processing across hundreds of nodes.

也就是說,這算是單一 SQL Query 平行運算的進階版本。

在這之前,AWS 都已經支援單一 Query 在單台機器上利用多 CPU 平行運算。其中 PostgreSQL 是 9.6+ 本身就有支援。Amazon Aurora (MySQL) 則是在 2016 時透過 Parallel Read Ahead 支援某些情境下的的單一 Query 多 CPU 運算了 (發現之前沒寫到...):「Amazon Aurora Update – Parallel Read Ahead, Faster Indexing, NUMA Awareness」。

這個功能目前是 Preview 階段,然後開在這些地區讓大家測試使用:

The preview is available for the MySQL-compatible edition of Amazon Aurora, and is currently available in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland) Regions. Sign up to get access.


Happy Eyeballs (RFC 6555)

在「PChome 24h 連線會慢的原因...」這篇的 comment 有讀者提到了 Happy Eyeballs 應該可以解決這個問題:

除了可以在維基百科上面看到外,比較正式的說明可以參考 RFC 6555:「Happy Eyeballs: Success with Dual-Stack Hosts」,其中在「6. Example Algorithm」就有提到 Google ChromeMozilla Firefox 怎麼實做 Happy Eyeballs:

What follows is the algorithm implemented in Google Chrome and Mozilla Firefox.

  1. Call getaddinfo(), which returns a list of IP addresses sorted by the host's address preference policy.
  2. Initiate a connection attempt with the first address in that list (e.g., IPv6).
  3. If that connection does not complete within a short period of time (Firefox and Chrome use 300 ms), initiate a connection attempt with the first address belonging to the other address family (e.g., IPv4).
  4. The first connection that is established is used. The other connection is discarded.

If an algorithm were to cache connection success/failure, the caching would occur after step 4 determined which connection was successful.

Other example algorithms include [Perreault] and [Andrews].

可以看到 Happy Eyeballs 的演算法是要避免 IPv6 network 不通的情況卡住很慢 (如果在 300ms 內連線沒有建起來,就會儘快往另外一個 address family 嘗試),而不是在 DNS 層避免問題 (也就是 getaddinfo() 觸發的 DNS query)。

這次的情況是 DNS query 很慢,就會導致還是一開始就很慢,Happy Eyeballs 沒辦法解決這個問題。

不過話說回來,我是有印象知道有這個演算法,但不知道有「Happy Eyeballs」這個這麼逗趣的名字... (掩面)

新的 DNS Resolver:

看到新的 DNS Resolver 服務,也拿到了還不錯的 IP address,「New “Quad9” DNS service blocks malicious domains for everyone」,服務網站是「Quad 9 | Internet Security and Privacy in a Few Easy Steps」,主打宣稱過濾已知的危險站台...

由政府單位、IBM 以及 Packet Clearing House 成立的:

The Global Cyber Alliance (GCA)—an organization founded by law enforcement and research organizations to help reduce cyber-crime—has partnered with IBM and Packet Clearing House to launch a free public Domain Name Service system.

也就是說,後面三家都不是專門做網路服務的廠商... 於是就會發現連 Client Subnet in DNS Queries (RFC 7871) 都沒提供,於是查出來的地區都不對,這對使用 DNS resolver 位置分配 CDN 節點的服務很傷啊... (或是其他類似服務)

這是 GooglePublic DNS ( 查出來的:

i.kfs.io.               576     IN      CNAME   kwc.kkcube.com.country.mp.kkcube.com.
kwc.kkcube.com.country.mp.kkcube.com. 21599 IN CNAME TW.kwc.kkcube.com.
TW.kwc.kkcube.com.      188     IN      CNAME   i.kfs.io.cdn.cloudflare.net.
i.kfs.io.cdn.cloudflare.net. 299 IN     A
i.kfs.io.cdn.cloudflare.net. 299 IN     A

;; Query time: 28 msec
;; WHEN: Sat Nov 18 05:30:23 CST 2017
;; MSG SIZE  rcvd: 181

這是 Quad9 ( 查出來的:

i.kfs.io.               1800    IN      CNAME   kwc.kkcube.com.country.mp.kkcube.com.
kwc.kkcube.com.country.mp.kkcube.com. 42702 IN CNAME US.kwc.kkcube.com.
US.kwc.kkcube.com.      300     IN      CNAME   i.kfs.io.cdn.cloudflare.net.
i.kfs.io.cdn.cloudflare.net. 300 IN     A
i.kfs.io.cdn.cloudflare.net. 300 IN     A

;; Query time: 294 msec
;; WHEN: Sat Nov 18 05:30:27 CST 2017
;; MSG SIZE  rcvd: 181

再來一點是,在科技領域相信政府單位通常都是一件錯誤的事情,我 pass... XD

Amazon Aurora (MySQL) 推出的 Asynchronous Key Prefetch

Amazon Aurora (MySQL) 推出新的效能改善,可以改善 JOIN 時的效能:「Amazon Aurora (MySQL) Speeds Join Queries by More than 10x with Asynchronous Key Prefetch」。

看起來像是某個情況的 optimization,將可能的 random access 換成 sequential access 而得到大量的效能:

This feature applies to queries that require use of the Batched Key Access (BKA) join algorithm and Multi-Range Read (MRR) optimization, and improves performance when the underlying data set is not in the main memory buffer pool or query cache.

其實記憶體還是最好用的加速器,能加大硬拼就先硬拼... XD

MySQL 總算要拔掉 mysql_query_cache 了

半官方的 MySQL blog 上宣佈了拔掉 mysql_query_cache 的計畫:「MySQL 8.0: Retiring Support for the Query Cache」。

作者開頭引用了 ProxySQL 的人對 MySQL Query Cache 的說明:

Although MySQL Query Cache was meant to improve performance, it has serious scalability issues and it can easily become a severe bottleneck.

主要問題在於 MySQL Query Cache 在多 CPU 環境下很難 scale,很容易造成一堆 thread 在搶 lock。而且作者也同意 ProxySQL 的說法,將 cache 放到 client 的效能比較好:

We also agree with Rene’s conclusion, that caching provides the greatest benefit when it is moved closer to the client:

可以看到 Query Cache 在複雜的環境下對效能極傷。而之前也提到過類似的事情了:「Percona 對 mysql_query_cache 的測試 (以 Magento 為例)」、「關閉 MySQL 的 Query Cache」。

一般如果要 cache 的話,透過 InnoDB 裡良好的 index 應該還可以撐不少量起來。

Swap 對 InnoDB 的影響

Percona 的老大拿 5.7 版做實驗,確認 swap 對 InnoDB 的影響:「The Impact of Swapping on MySQL Performance」。

測試的機器是 32GB RAM,作業系統 (以及 swap) 裝在已經有點年紀的 Intel 520 SSD 上,而 MySQL 則是裝在 Intel 750 NVMe 上。透過對 innodb_buffer_pool 的調整來看情況。

可以看到設為 24GB (記憶體 75% 的量) 時很穩定的在 44K QPS 與 3.5ms (95%):

This gives us about 44K QPS. The 95% query response time (reported by sysbench) is about 3.5ms.

而當設成 32GB 的時候開始可以觀察到 swap i/o,掉到 20K QPS 與 9ms (95%):

We can see that performance stabilizes after a bit at around 20K QPS, with some 380MB/sec disk IO and 125MB/sec swap IO. The 95% query response time has grown to around 9ms.

當拉到 48GB 的時候就更掉更多,6K QPS 與 35ms (95%):

Now we have around 6K QPS. Disk IO has dropped to 250MB/sec, and swap IO is up to 190MB/sec. The 95% query response time is around 35ms.


When I started, I expected severe performance drop even with very minor swapping. I surprised myself by getting swap activity to more than 100MB/sec, with performance “only” halved.

這邊測試用的是 SSD,如果是傳統用磁頭的硬碟,對 random access 應該會很敏感而掉更多:

This assumes your swap space is on an SSD, of course! SSDs handle random IO (which is what paging activity usually is) much better than HDDs.

基本上還是要避免碰到 swap 啦,另外 comment 的地方剛好有提到前陣子在猜測的 best practice,測試時的 vm.swappiness 是設成 1,這應該是作者的 best practice:

Swappiness was set to 1 in this case. I was not expecting this to cause significant impact as swapping is caused by genuine (intended) missconfiguration with more memory required than available.