PChome 修正了問題,以及 RFC 4074 的說明

早些時候測試發現 PChome 已經修正了之前提到的問題:「PChome 24h 連線會慢的原因...」、「PChome 24h 連線會慢的原因... (續篇)」,這邊除了整理一下以外,也要修正之前文章裡的錯誤。

在 RFC 4074 (Common Misbehavior Against DNS Queries for IPv6 Addresses) 裡面提到了當你只有 IPv4 address 時,DNS server 要怎麼回應的問題。

在「3. Expected Behavior」說明了正確的作法,當只有 A RR 沒有 AAAA RR 的時候,應該要傳回 NOERROR,而 answer section 裡面不要放東西:

Suppose that an authoritative server has an A RR but has no AAAA RR for a host name. Then, the server should return a response to a query for an AAAA RR of the name with the response code (RCODE) being 0 (indicating no error) and with an empty answer section (see Sections 4.3.2 and 6.2.4 of [1]). Such a response indicates that there is at least one RR of a different type than AAAA for the queried name, and the stub resolver can then look for A RRs.

在「4.2. Return "Name Error"」裡提到,如果傳回 NXDOMAIN (3),表示查詢的這個名稱完全沒有 RR,而不僅僅限於 AAAA record,這就是我犯的錯誤 (在前面的文章建議傳回 NXDOMAIN):

This type of server returns a response with RCODE 3 ("Name Error") to a query for an AAAA RR, indicating that it does not have any RRs of any type for the queried name.

With this response, the stub resolver may immediately give up and never fall back. Even if the resolver retries with a query for an A RR, the negative response for the name has been cached in the caching server, and the caching server will simply return the negative response. As a result, the stub resolver considers this to be a fatal error in name resolution.

Several examples of this behavior are known to the authors. As of this writing, all have been fixed.

PChome 這次的修正回應了正確的值 (而不是我提到的 NXDOMAIN):

$ dig shopping.gs1.pchome.com.tw aaaa @ns1.gs1.pchome.com.tw

; <<>> DiG 9.9.5-3ubuntu0.16-Ubuntu <<>> shopping.gs1.pchome.com.tw aaaa @ns1.gs1.pchome.com.tw
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<<<- opcode: QUERY, status: NOERROR, id: 40767
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;shopping.gs1.pchome.com.tw.    IN      AAAA

;; AUTHORITY SECTION:
gs1.pchome.com.tw.      5       IN      SOA     ns1.gs1.pchome.com.tw. root.dns.pchome.com.tw. 20171123 3600 3 3600 5

;; Query time: 16 msec
;; SERVER: 210.242.216.91#53(210.242.216.91)
;; WHEN: Fri Nov 24 01:44:52 CST 2017
;; MSG SIZE  rcvd: 134

另外 RFC 也有一些其他的文件可以參考,像是 RFC 2308 (Negative Caching of DNS Queries (DNS NCACHE))、RFC 4697 (Observed DNS Resolution Misbehavior) 以及 RFC 8020 (NXDOMAIN: There Really Is Nothing Underneath),這些文件描述了蠻多常見的問題以及正確的處理方法,讀完對於現在愈來愈複雜的 DNS 架構有不少幫助。

PChome 24h 連線會慢的原因...

Update:續篇請參考「PChome 24h 連線會慢的原因... (續篇)」。

tl;dr:因為他們的 DNS servers 不會對 IPv6 的 AAAA record 正確的回應 NXDOMAIN,導致 DNS resolver 會不斷嘗試。

好像一行就把原因講完了啊,還是多寫一些細節好了。

起因於我的電腦連 PChome 24h 時常常會卡住,Google Chrome 會寫「Resolving host...」,於是就花了些時間找這個問題。

一開始先用幾個工具測試,發現 host 會卡,但不知道卡什麼:

$ host 24h.pchome.com.tw

tcpdump 出來聽的時候發現 host 會跑 AAAAA 以及 MX 三個種類,而後面兩個都會卡住:

24h.pchome.com.tw is an alias for shopping.gs1.pchome.com.tw.
shopping.gs1.pchome.com.tw has address 210.242.43.53
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached

這樣就有方向了... 我的電腦是 Dual-stack network (同時有 IPv4 address 與 IPv6 address),所以可以預期 Google Chrome 會去查 IPv6 address。而國內很多網站都還沒有把有 IPv6 的情境當標準測試,很容易中獎...

有了方向後,用 dig 測試 IPv6 的 AAAA,發現都是給 SERVFAIL,而且多跑幾次就發現會卡住:

$ dig 24h.pchome.com.tw aaaa @168.95.192.1

然後對 {cheetah,dns,dns2,dns3,wolf}.pchome.com.tw (上層登記的) 與 dns4.pchome.com.tw (實際多的) 測,可以拿到 CNAME record,像是這樣:

$ dig 24h.pchome.com.tw aaaa @dns.pchome.com.tw

; <<>> DiG 9.9.5-3ubuntu0.16-Ubuntu <<>> 24h.pchome.com.tw aaaa @dns.pchome.com.tw
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26037
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 6
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;24h.pchome.com.tw.             IN      AAAA

;; ANSWER SECTION:
24h.pchome.com.tw.      300     IN      CNAME   shopping.gs1.pchome.com.tw.

;; AUTHORITY SECTION:
gs1.pchome.com.tw.      300     IN      NS      ns3.gs1.pchome.com.tw.
gs1.pchome.com.tw.      300     IN      NS      ns1.gs1.pchome.com.tw.
gs1.pchome.com.tw.      300     IN      NS      ns4.gs1.pchome.com.tw.
gs1.pchome.com.tw.      300     IN      NS      ns5.gs1.pchome.com.tw.
gs1.pchome.com.tw.      300     IN      NS      ns2.gs1.pchome.com.tw.

;; ADDITIONAL SECTION:
ns1.gs1.pchome.com.tw.  300     IN      A       210.242.216.91
ns2.gs1.pchome.com.tw.  300     IN      A       210.242.216.92
ns3.gs1.pchome.com.tw.  300     IN      A       210.242.43.93
ns4.gs1.pchome.com.tw.  300     IN      A       203.69.38.91
ns5.gs1.pchome.com.tw.  300     IN      A       210.71.147.91

;; Query time: 12 msec
;; SERVER: 210.59.230.85#53(210.59.230.85)
;; WHEN: Wed Nov 22 11:05:24 CST 2017
;; MSG SIZE  rcvd: 243

但往 ns{1,2,3,4,5}.gs1.pchome.com.tw 問的時候給不出答案,也不給 NXDOMAIN,像是這樣:

$ dig shopping.gs1.pchome.com.tw aaaa @ns1.gs1.pchome.com.tw

; <<>> DiG 9.9.5-3ubuntu0.16-Ubuntu <<>> shopping.gs1.pchome.com.tw aaaa @ns1.gs1.pchome.com.tw
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36249
;; flags: qr rd ad; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 6
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;shopping.gs1.pchome.com.tw.    IN      AAAA

;; AUTHORITY SECTION:
gs1.pchome.com.tw.      3600    IN      NS      ns3.gs1.pchome.com.tw.
gs1.pchome.com.tw.      3600    IN      NS      ns4.gs1.pchome.com.tw.
gs1.pchome.com.tw.      3600    IN      NS      ns5.gs1.pchome.com.tw.
gs1.pchome.com.tw.      3600    IN      NS      ns1.gs1.pchome.com.tw.
gs1.pchome.com.tw.      3600    IN      NS      ns2.gs1.pchome.com.tw.

;; ADDITIONAL SECTION:
ns3.gs1.pchome.com.tw.  3600    IN      A       210.242.43.93
ns4.gs1.pchome.com.tw.  3600    IN      A       203.69.38.91
ns5.gs1.pchome.com.tw.  3600    IN      A       210.71.147.91
ns1.gs1.pchome.com.tw.  3600    IN      A       210.242.216.91
ns2.gs1.pchome.com.tw.  3600    IN      A       210.242.216.92

;; Query time: 11 msec
;; SERVER: 210.242.216.91#53(210.242.216.91)
;; WHEN: Wed Nov 22 11:07:17 CST 2017
;; MSG SIZE  rcvd: 310

於是 DNS resolver 就倒在路邊了...

AWS 提供 Hybrid Cloud 環境下 DNS 管理的說明

不知道為什麼出現在 browser tab 上,不知道是哪邊看到的... AWS 放出了一份文件,在講 hybrid cloud 環境下當你同時有一般 IDC 機房,而且使用內部 domain 在管理時,網路與 AWS 打通後要怎麼解決 DNS resolver 的問題:「Hybrid Cloud DNS Solutions for Amazon VPC」。

有些東西在官方的說明文件內都寫過,但是是 AWS 的特殊設計,這邊就會重複說明 XDDD

像是這份文件裡提到 Amazon DNS Server 一定會在 VPC 的 base 位置加二 (舉例來說,10.0.0.0/16 的 VPC,Amazon DNS Server 會在 10.0.0.2):

Amazon DNS Server
The Amazon DNS Server in a VPC provides full public DNS resolution, with additional resolution for internal records for the VPC and customer-defined Route 53 private DNS records.4 The AmazonProvidedDNS maps to a DNS server running on a reserved IP address at the base of the VPC network range, plus two. For example, the DNS Server on a 10.0.0.0/16 network is located at 10.0.0.2. For VPCs with multiple CIDR blocks, the DNS server IP address is located in the primary CIDR block.

在官方文件裡,則是在「DHCP Options Sets」這邊提到一樣的事情:

When you create a VPC, we automatically create a set of DHCP options and associate them with the VPC. This set includes two options: domain-name-servers=AmazonProvidedDNS, and domain-name=domain-name-for-your-region. AmazonProvidedDNS is an Amazon DNS server, and this option enables DNS for instances that need to communicate over the VPC's Internet gateway. The string AmazonProvidedDNS maps to a DNS server running on a reserved IP address at the base of the VPC IPv4 network range, plus two. For example, the DNS Server on a 10.0.0.0/16 network is located at 10.0.0.2. For VPCs with multiple IPv4 CIDR blocks, the DNS server IP address is located in the primary CIDR block.

另外也還是有些東西在官方的說明文件內沒看過,像是講到 Elastic Network Interface (ENI) 對 Amazon DNS Server 是有封包數量限制的;這點我沒在官方文件上找到,明顯在量太大的時候會中獎,然後開 Support Ticket 才會發現的啊 XDDD:

Each network interface in an Amazon VPC has a hard limit of 1024 packets that it can send to the Amazon Provided DNS server every second.

Anyway... 這份文件裡面提供三種解法:

  • Secondary DNS in a VPC,直接用程式抄一份到 Amazon Route 53 上,這樣 Amazon DNS Server 就可以直接看到了,這也是 AWS 在一般情況下比較推薦的作法。
  • Highly Distributed Forwarders,每台 instance 都跑 Unbound,然後針對不同的 domain 導開,這樣可以有效避開單一 ENI 對 Amazon DNS Server 的封包數量限制,但缺點是這樣的設計通常會需要像是 Puppet 或是 Chef 之類的軟體管理工具才會比較好設定。
  • Zonal Forwarders Using Supersede,就是在上面架設一組 Unbound 伺服器集中管理,透過 DHCP 設定讓 instance 用。但就要注意量不能太大,不然 ENI 對 Amazon DNS Server 的限制可能會爆掉 XD

都可以考慮看看...

新的 DNS Resolver:9.9.9.9

看到新的 DNS Resolver 服務,也拿到了還不錯的 IP address,9.9.9.9:「New “Quad9” DNS service blocks malicious domains for everyone」,服務網站是「Quad 9 | Internet Security and Privacy in a Few Easy Steps」,主打宣稱過濾已知的危險站台...

由政府單位、IBM 以及 Packet Clearing House 成立的:

The Global Cyber Alliance (GCA)—an organization founded by law enforcement and research organizations to help reduce cyber-crime—has partnered with IBM and Packet Clearing House to launch a free public Domain Name Service system.

也就是說,後面三家都不是專門做網路服務的廠商... 於是就會發現連 Client Subnet in DNS Queries (RFC 7871) 都沒提供,於是查出來的地區都不對,這對使用 DNS resolver 位置分配 CDN 節點的服務很傷啊... (或是其他類似服務)

這是 GooglePublic DNS (8.8.8.8) 查出來的:

;; ANSWER SECTION:
i.kfs.io.               576     IN      CNAME   kwc.kkcube.com.country.mp.kkcube.com.
kwc.kkcube.com.country.mp.kkcube.com. 21599 IN CNAME TW.kwc.kkcube.com.
TW.kwc.kkcube.com.      188     IN      CNAME   i.kfs.io.cdn.cloudflare.net.
i.kfs.io.cdn.cloudflare.net. 299 IN     A       104.16.244.238
i.kfs.io.cdn.cloudflare.net. 299 IN     A       104.16.245.238

;; Query time: 28 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Nov 18 05:30:23 CST 2017
;; MSG SIZE  rcvd: 181

這是 Quad9 (9.9.9.9) 查出來的:

;; ANSWER SECTION:
i.kfs.io.               1800    IN      CNAME   kwc.kkcube.com.country.mp.kkcube.com.
kwc.kkcube.com.country.mp.kkcube.com. 42702 IN CNAME US.kwc.kkcube.com.
US.kwc.kkcube.com.      300     IN      CNAME   i.kfs.io.cdn.cloudflare.net.
i.kfs.io.cdn.cloudflare.net. 300 IN     A       104.16.245.238
i.kfs.io.cdn.cloudflare.net. 300 IN     A       104.16.244.238

;; Query time: 294 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Sat Nov 18 05:30:27 CST 2017
;; MSG SIZE  rcvd: 181

再來一點是,在科技領域相信政府單位通常都是一件錯誤的事情,我 pass... XD

Google DNS 在台灣的 Latency 大幅降低

對 8.8.8.8 的 ICMP 數據可以發現今天早上 Google 台灣機房的 DNS Service 做了改變,整個 latency 大幅降低。(因為 mtr 發現後面 traceroute 不到以為爆炸了,跑去看 Smokeping 發現其實是 Google 改善了架構)

中華電信北區的機房:

中華電信南區的機房:

台固機房:(內湖)

遠傳機房:(忘記是內湖的哪個機房了...)

遊戲橘子機房:(台北區)

Amazon CloudFront 支援 EDNS-Client-Subnet

Amazon CloudFront 今天的新聞:「Improved CloudFront Performance with EDNS-Client-Subnet Support」。

目前 CDN 大多都還是靠 GeoDNS 技術達到分流技術,但另外一方面,Google 的 Public DNSOpenDNS 服務不一定在每個 ISP 都有設機房,這使得 CDN 服務無法靠 DNS server 的 IP address 正確導到離使用者 ISP 最近的 CDN (也就是原文指的 sub-optimal performance)。

EDNS-Client-Subnet 是一個 draft,讓 DNS resolver 可以把 client 的網段資訊帶給 CDN 的 DNS server,進而改善定位。

剛剛測試發現 Akamai 也支援了,在「OpenDNS: Why doesn't Akamai support the edns client subnet extension? This would allow OpenDNS and Google DNS to more effectively provide the geo location of the CDN and allow Akamai to send content to clients at higher speed.」這篇也有提到。

以前會鼓勵使用自己家的 DNS resolver 的理由又要消失一個了 :p

Percona XtraDB Cluster 上用 keepalived 實作高可靠度架構

Percona 的人寫了一篇「Using keepalived for HA on top of Percona XtraDB Cluster」,是用 Keepalived 實作高可靠度架構。

目前常見的文章是透過 HAProxy 達成,但 HAProxy 本身就會成為 SPOF,所以一般人的想法就是在 HAProxy 上跑 Keepalived,保持可靠度。但這個問題我之前就抱怨過,既然要跑 Keepalived (我的例子裡是 Heartbeat),為什麼不在 Percona XtraDB Cluster 上跑就好,load balance 的任務則可以透過 DNS round robin 達成:

We would standardly use keepalived to HA HAproxy, and keepalived supports track_scripts that can monitor whatever we want, so why not just monitor PXC directly?

不過我還是偏愛 Heartbeat,因為 Heartbeat 的功能多了一些,不過架構還是很簡單,實際用起來也發現不容易出問題。

另外 Heartbeat 不論是在 Linux 還是 FreeBSD 上都跑得很好,我連內部機器用的 DNS resolver (跑 Unboound) 都上 Heartbeat。

另外也因為 Keepalived 目前不支援 FreeBSD (參考「About "ipvs" and "keepalived"」這篇文章),所以也沒打算花時間測試了... :o