routing – Page 3 – Gea-Suan Lin's BLOG

AWS to open an Osaka region in 2018

It looks like AWS's official announcement is only available in Japanese (although the boss, Werner Vogels, did mention it on Twitter): "【新リージョン】2018年に大阪ローカルリージョンを開設予定".

皆様にうれしい発表があります。AWSは2018年に大阪に新たなリージョンを開設します。 (We have some good news for everyone. AWS will open a new region in Osaka in 2018.)

In theory this is a bit closer to Taiwan, but routing will probably still go in through Tokyo? For now I'll just keep an eye on the news...

AWS's ALB (Application Load Balancer)

A few days ago, in a meeting with people from AWS, I learned about the ALB beta program, and today I saw the public announcement: "New – AWS Application Load Balancer".

The main additions are support for WebSockets and HTTP/2, both of which people have been requesting for a long time:

WebSocket allows you to set up long-standing TCP connections between your client and your server. This is a more efficient alternative to the old-school method which involved HTTP connections that were held open with a “heartbeat” for very long periods of time. WebSocket is great for mobile devices and can be used to deliver stock quotes, sports scores, and other dynamic data while minimizing power consumption. ALB provides native support for WebSocket via the ws:// and wss:// protocols.

HTTP/2 is a significant enhancement of the original HTTP 1.1 protocol. The newer protocol feature supports multiplexed requests across a single connection. This reduces network traffic, as does the binary nature of the protocol.

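Another addition is URL routing, although for now it looks like you can only configure 10 rules. I'd guess you can ask whether the limit can be raised: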

An Application Load Balancer has access to HTTP headers and allows you to route requests to different backend services accordingly. For example, you might want to send requests that include /api in the URL path to one group of servers (we call these target groups) and requests that include /mobile to another. Routing requests in this fashion allows you to build applications that are composed of multiple microservices that can run and be scaled independently.

As you will see in a moment, each Application Load Balancer allows you to define up to 10 URL-based rules to route requests to target groups. Over time, we plan to give you access to other routing methods.
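To make the URL-based routing idea concrete, here is a minimal Python sketch. It is not the ALB API: the `PathRouter` class, the target group names, and the use of simple prefix matching are all assumptions made for illustration. Only the limit of 10 rules comes from the announcement quoted above.

```python
from dataclasses import dataclass

MAX_RULES = 10  # per-load-balancer limit quoted in the announcement


@dataclass(frozen=True)
class Rule:
    path_prefix: str   # e.g. "/api" (prefix matching is an assumption)
    target_group: str  # hypothetical target group name


class PathRouter:
    """Toy model of URL-based rules that map requests to target groups."""

    def __init__(self, default_target):
        self.default_target = default_target
        self.rules = []

    def add_rule(self, path_prefix, target_group):
        # Refuse to go past the documented rule limit.
        if len(self.rules) >= MAX_RULES:
            raise ValueError("rule limit reached")
        self.rules.append(Rule(path_prefix, target_group))

    def route(self, path):
        # First matching rule wins; otherwise fall back to the default.
        for rule in self.rules:
            if path.startswith(rule.path_prefix):
                return rule.target_group
        return self.default_target


router = PathRouter(default_target="web-servers")
router.add_rule("/api", "api-servers")
router.add_rule("/mobile", "mobile-servers")
print(router.route("/api/v1/users"))
print(router.route("/index.html"))
```

With only 10 slots, each rule has to cover a whole service rather than individual endpoints, which is why the limit matters for microservice-style setups.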

Next, the health check, which drew a lot of complaints before, has been improved:

Application Load Balancers can perform and report on health checks on a per-port basis. The health checks can specify a range of acceptable HTTP responses, and are accompanied by detailed error codes.

Quite a lot has been improved...

The large-scale routing hijacking incident a few days ago

A few days ago BGPmon detected a large-scale routing hijacking incident: "Large hijack affects reachability of high traffic destinations". The scope is probably the largest in recent memory:

Our initial investigation shows that the scope of this incident is widespread and affected 576 Autonomous systems and 3431 prefixes. Amongst the networks affected are high traffic prefixes including those of Google, Amazon, Twitter, Apple, Akamai, Time Warner Cable Internet and more.

In "[outages] HTTP access to www.amazon.com is down for us" you can also see that people noticed the problem at the time.

LinkedIn engineers analyze the stability and performance of TCP Anycast

LinkedIn's engineers tested the stability and performance of TCP Anycast: "TCP over IP Anycast - Pipe dream or Reality?".

Because DNS is stateless and an exchange usually fits in a single packet, Anycast has been used for DNS for a long time, and most CDN providers now also use Anycast to speed up their responses.

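TCP, however, is stateful. If routers balance traffic the wrong way, packets from a single connection may be delivered to different nodes, which is a serious problem. That said, nearly all backbone routers have long supported a flow-based load balancing policy: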

Most routers now do a per-flow load balancing, meaning packets on a TCP connection are always sent over the same path, but even a small percentage of routers with per-packet load balancing can cause the website to be unreachable for users behind that router.
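The difference between per-flow and per-packet balancing can be sketched in a few lines of Python. This is an illustration only: the next-hop names are hypothetical, and SHA-256 stands in for whatever hash a real router uses in hardware.

```python
import hashlib
import itertools

# Hypothetical equal-cost next hops, each leading to a different anycast PoP.
NEXT_HOPS = ["pop-a", "pop-b", "pop-c"]


def per_flow_next_hop(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Hash the flow 5-tuple so every packet of one connection takes the same path."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(NEXT_HOPS)
    return NEXT_HOPS[index]


def per_packet_next_hops(packet_count):
    """Round-robin per packet: consecutive packets of one connection hit different PoPs."""
    round_robin = itertools.cycle(NEXT_HOPS)
    return [next(round_robin) for _ in range(packet_count)]


flow = ("198.51.100.7", 40000, "192.0.2.1", 443)
print({per_flow_next_hop(*flow) for _ in range(5)})  # a single PoP
print(per_packet_next_hops(5))                       # several PoPs
```

In the per-packet case, a PoP that receives a mid-stream segment for a connection it never saw has no state for it and answers with a RST, which is the failure mode the LinkedIn test was looking for.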

So the LinkedIn team set out to test the stability of TCP Anycast:

So, to validate the assumption that TCP over anycast in the modern internet is no longer a problem, we ran a few synthetic tests.

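The test setup: configure the web servers to serve a download slowly, announce the corresponding route from several PoPs, and monitor with the Catchpoint service. If Anycast were unstable, connections would be cut off by RSTs: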

We configured our U.S. PoPs to announce an anycast IP address and then configured multiple agents in Catchpoint, a synthetic monitoring service, to download an object from that IP address. Our web servers were configured to deliberately send the response back slowly, taking over a minute for the complete data transfer. If the internet was unstable for TCP over anycast, we would observe continuous or intermittent failures when downloading the object. We would also observe TCP RSTs at the PoPs.

The good news is that the tests turned out to be quite stable:

But even after running these tests for a week, we did not notice any substantial instability problems! This gave us confidence to proceed further.

This is also why we see CacheFly and CloudFlare using TCP Anycast:

[S]ome popular CDNs have also started using anycast for HTTP traffic.

With stability out of the way, the next topic is performance.

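Anycast picks the path based on routing, with the goal of reaching the lowest-latency PoP. In practice, though, routing also takes cost into account, so traffic may end up at a farther PoP. The tests show that Anycast does better than GeoIP in North America but much worse outside it.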

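So LinkedIn decided to use "Regional Anycast": GeoIP first decides which continent to send the user to, and each continent shares one Anycast address. This improved performance considerably. Globally, the sub-optimal assignment rate (i.e. the share of users not assigned to the best PoP) dropped from 31% to 10%.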
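The two-step idea behind Regional Anycast can be sketched as follows. Everything concrete here is an assumption for illustration: the country-to-continent table is a toy stand-in for a GeoIP database, and the addresses come from the IPv4 documentation ranges, not from LinkedIn's network.

```python
# Toy GeoIP result: country code -> continent code (hypothetical sample).
CONTINENT_BY_COUNTRY = {"US": "NA", "CA": "NA", "DE": "EU", "TW": "AS", "JP": "AS"}

# One anycast address per continent (documentation ranges, not real PoPs).
ANYCAST_IP_BY_CONTINENT = {
    "NA": "192.0.2.1",
    "EU": "198.51.100.1",
    "AS": "203.0.113.1",
}

DEFAULT_CONTINENT = "NA"  # assumed fallback when GeoIP has no answer


def regional_anycast_ip(country_code):
    """Step 1: GeoIP picks the continent. Step 2: return that continent's anycast IP.

    Inside the continent, BGP routing decides which PoP actually answers.
    """
    continent = CONTINENT_BY_COUNTRY.get(country_code, DEFAULT_CONTINENT)
    return ANYCAST_IP_BY_CONTINENT[continent]


print(regional_anycast_ip("TW"))
print(regional_anycast_ip("DE"))
```

The DNS layer only makes the coarse decision, so a routing quirk can no longer send a user to a PoP on another continent.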

The above is mostly a summary of the LinkedIn article; what follows are my own thoughts.

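TCP Anycast is actually a rather disadvantageous technique for a CDN: because control over routing is not in your own hands, many important measures are simply unavailable.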

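First, when outbound bandwidth is saturated, you cannot shift traffic to machines in another data center. "Outbound bandwidth" here means more than the CDN's own links: saturation on any link along the path counts. HiNet's path to CloudFlare's Hong Kong data center is an obvious example.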

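Also, under a DDoS attack you cannot redirect traffic, so about the only remedies left are clean-pipe style solutions. Meanwhile other customers get hit as well, because the attack traffic floods into the same data center. The GeoIP approach is far more flexible.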

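Of course, there are still benefits worth listing. Mainly, for applications that need fixed IP addresses (for example, because of firewall configuration), TCP Anycast meets that need.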

All I can say is that different markets are served by different product lines, and different scenarios have different needs...

Hacking Team's BGP Routing Hijack

The Hacking Team affair tells us that as long as something can be done, someone will package it into a Total Solution and sell it.

The leaked data documents a BGP routing hijack that Hacking Team carried out in 2013: "How Hacking Team Helped Italian Special Operations Group with BGP Routing Hijack".

The Wikileaks document described how the Italian ROS reached out to Hacking Team to work together on recovering the VPS server that ran on 46.166.163.175. In ROS terminology, the server was called “Anonymizer”. The emails also revealed that this server relays updates to another back end server called “Collector” from which ROS presumably recovers the targets’ data.

And then:

When we look at historical BGP data we can confirm that AS31034 (Aruba S.p.A) indeed started to announce the prefix 46.166.163.0/24 starting on Friday, 16 Aug at 2013 07:32 UTC. The Wikileaks emails outline how ROS complained to Hacking Team that the IP was reachable only via Fastweb but not yet through Telecom Italia, concluding not all RCS clients were able to connect back to the server immediately, since the prefix was not seen globally. BGP data further confirms this per the visualization below.

The main ISPs are:

AS12874 Fastweb
AS6939 Hurricane Electric, Inc.
AS49605 Reteivo.IT
AS4589 Easynet
AS5396 MC-link Spa

Timeline:

This also shows that "locking by IP" is actually still quite dangerous.
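A small Python sketch shows why an IP allowlist gives no protection here. The prefix, origin AS, and server address are taken from the quoted analysis; the helper names and the allowlist itself are hypothetical.

```python
import ipaddress

# Announcement described in the quoted BGP analysis: (prefix, origin AS).
ANNOUNCEMENTS = [("46.166.163.0/24", "AS31034")]


def origins_covering(ip, announcements):
    """Return every (prefix, origin) announcement whose prefix contains the address."""
    addr = ipaddress.ip_address(ip)
    return [
        (prefix, origin)
        for prefix, origin in announcements
        if addr in ipaddress.ip_network(prefix)
    ]


def allowlist_permits(ip, allowlist):
    """A typical "lock by IP" check: it looks only at the address itself."""
    return ip in allowlist


server = "46.166.163.175"
print(origins_covering(server, ANNOUNCEMENTS))
print(allowlist_permits(server, {server}))
```

The allowlist check passes no matter which network is currently announcing the prefix, so whoever receives the traffic for that /24 is treated as the trusted server.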

OpenStreetMap adds route planning

OpenStreetMap has announced support for route planning: "Routing on OpenStreetMap.org".

But when I tested it with a map of Taiwan, it didn't seem to work very well XD

Route 53's major update

A major update to Amazon Route 53: "Route 53 Update - Domain Name Registration, Geo Routing, and a Price Reduction".

First, you can now register domains. Besides the web console, you can also register through the API: "Actions on Domain Registrations".

It looks like privacy protection is provided in partnership with Gandi:

Turns privacy protection on or off for the domain, determining whether WHOIS queries return contact information specified in the registrar record. If privacy protection is enabled, the query returns contact information for our registrar partner, Gandi, instead of the contact information that is specified in the registrar record.

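I didn't see an API for listing the TLDs that can be registered, but the web console shows that there are actually quite a lot... (though I don't see .tw at the moment)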

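The other big feature is Geo Routing, which lets you select by continent or by country/region. However, the "US mainland" (its overseas parts are in separate regions) and "China", two huge networks, each have only a single region rather than being subdivided by state or province... (quite a few DNS services offered by CDNs list the US state by state for configuration...)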

At least this fills the gap, so you can use Route53 together with a multi-CDN setup without building it yourself...
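As a rough picture of how geo routing can drive a multi-CDN setup, here is a Python sketch that picks the most specific matching record. It is not the Route 53 API: the record table, the hostnames, and the `pick_cdn` helper are hypothetical, and the country, then continent, then default order is the assumed resolution rule.

```python
# Hypothetical geo records: (scope, code) -> CDN hostname to answer with.
RECORDS = {
    ("country", "TW"): "cdn-a.example.net",
    ("continent", "AS"): "cdn-b.example.net",
    ("default", "*"): "cdn-c.example.net",
}


def pick_cdn(country, continent, records):
    """Return the record for the most specific location that has one."""
    for key in (("country", country), ("continent", continent), ("default", "*")):
        if key in records:
            return records[key]
    raise LookupError("no matching record and no default")


print(pick_cdn("TW", "AS", RECORDS))  # country-level record
print(pick_cdn("JP", "AS", RECORDS))  # falls back to the continent
print(pick_cdn("FR", "EU", RECORDS))  # falls back to the default
```

With only one region each for the US mainland and China, the country level is the finest granularity available for those two, which is the limitation noted above.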

Finally, query prices have been cut by 20%:

Last, but certainly not least, I am happy to tell you that we have reduced the prices for Standard and LBR (Location-Based Routing) queries by 20%.

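It's time to consider dropping Zerigo, since the biggest cost here is actually expense reporting and account management rather than the rental fee itself :o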

The state of the 192.88.99.1 6to4 Anycast

A record of how the 6to4 anycast address 192.88.99.1 behaves. It still doesn't look very good XD

HiNet: a relay router in Japan.
SEEDNet: via Cogent to the University of Minnesota.
TFN: HE.
NCTU: via TWGate to Australia.
HCRC: a node inside the academic network, but I don't know where XD
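For background on why the relay location matters: a 6to4 host derives its IPv6 prefix from its public IPv4 address (2002::/16 followed by the 32 address bits, giving a /48), and its traffic to the native IPv6 internet goes through whichever relay answers on 192.88.99.1. A minimal Python sketch of the address mapping, using a documentation-range address as the example host:

```python
import ipaddress


def sixtofour_prefix(ipv4):
    """Build the 6to4 /48 prefix: 16 bits of 0x2002 followed by the 32-bit IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    prefix_int = (0x2002 << 112) | (v4 << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


prefix = sixtofour_prefix("192.0.2.1")  # example host address, not a real site
print(prefix)

# The standard library can recover the embedded IPv4 address again.
print(ipaddress.IPv6Address("2002:c000:201::1").sixtofour)
```

Since every packet detours through the relay, a relay in Australia or Minnesota adds a full intercontinental round trip for users in Taiwan.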