可以用 IAM 控管的 AWS Instance Connect (透過 SSH 機制)

除了本來的 Systems Manager 可以在 EC2 的機器上開 shell 管理外,現在 AWS 推出了 EC2 Instance Connect,可以直接綁 IAM 的權限管理:「Introducing Amazon EC2 Instance Connect」。

With EC2 Instance Connect, you can control SSH access to your instances using AWS Identity and Access Management (IAM) policies as well as audit connection requests with AWS CloudTrail events. In addition, you can leverage your existing SSH keys or further enhance your security posture by generating one-time use SSH keys each time an authorized user connects. Instance Connect works with any SSH client, or you can easily connect to your instances from a new browser-based SSH experience in the EC2 console.

除了記錄外,也包含了一些安全機制,像是可以選擇一次性的帳號... 跟先前的 Systems Manager 比起來,主要是能用習慣的 terminal software 還是比較爽?

AWS Direct Connect 在台北又增加可以接的機房了...

先前 AWS Direct Connect 在今年二月的時候公佈了台北第一個開放的接點,是方的麗源機房 (Chief Telecom LY, Taipei, Taiwan,參考「AWS 公開了在台北的 Direct Connect 接口」),現在公佈第二個點了:「New AWS Direct Connect locations in Paris and Taipei」。

第二個點在中華電信的機房 (Chunghwa Telecom, Taipei, Taiwan),但這樣只這樣標,不知道實際上是哪個,會不會是愛國東路的國分機房?那邊有不少非中華電信網路的客戶,只是租那邊的機房在用...

不過有需求的可以問一下 AWS,應該就會有答案了,畢竟總是得讓客戶知道要接到哪邊...

AWS 公開了在台北的 Direct Connect 接口

也是個很久前就聽到傳言的消息... AWS 剛剛公佈了在台北的 Direct Connect 接口,用戶可以在台北內租用線路進機房就連上 Direct Connect:「New AWS Direct Connect sites land in Paris and Taipei」。

AWS Direct Connect also launched its first site in Taiwan at Chief Telecom LY, Taipei. In the Management Console, Taipei is located in the Asia Pacific (Tokyo) Region. With global access enabled for AWS Direct Connect, these sites can reach AWS resources in any global AWS region using global public VIFs and Direct Connect Gateway.

以往需要透過像 GCX 這樣的公司租用國際頻寬,再從 GCX 在台北的機房拉到自家機房 (通常是市內專線),現在只需要在台北對接的這段就可以了。

台北的接點在 AWS 上寫的是 Chief Telecom LY, Taipei, Taiwan,查了一下是是方電訊的「是方麗源大樓 (台北市內湖區陽光街250號)」,也就是之前有發生過火災而大斷線的那棟。對接的 AWS Home Region 是 Asia Pacific (Tokyo),所以使用 ap-northeast-1 的人可以規劃...

AWS Direct Connect 可以多條線路綁在一起用了

AWS Direct Connect 支援 Link Aggregation Group (LAG),將多條線路綁在一起了:「AWS Direct Connect enables Link Aggregation Group for additional AWS regions」。

We are excited to announce support for 1G and 10G Link Aggregation Groups (LAG). Customers in AWS GovCloud (US), Europe (Frankfurt), Europe (London), Europe (Ireland), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney) regions can start using LAG to link existing connections on the same AWS device, or request new connections.

主要還是對於超大量的用戶會變方便 (都是使用 10Gbps 線路接上去的使用者),尤其是 load balancing 這塊會好做不少。

多條 1Gbps 的當然也是有幫助,不過剛好只會落在一個微妙的範圍 (到 8Gbps 的時候,直接換 10Gbps 成本說不定其實差不多...)。

Netflix 的 Open Connect 架構

在「研究人员绘制Netflix CDN服务器位置图」這邊看到在討論 NetflixOpen Connect 架構,也就是 Netflix 自己的 CDN 架構。引用的論文是「Open Connect Everywhere: A Glimpse at the Internet Ecosystem through the Lens of the Netflix CDN」這篇。

翻了一下論文,看起來還是頗傳統的方法,從一堆伺服器去 query 找出來:

Indeed, Netflix server domain names are too specific, i.e., including details on the physical location of the server. This makes it unlikely that redirection takes place based on these domain names. Additionally, we ran a distributed DNS lookup campaign from Planetlab nodes to verify our assumption that Netflix DNS names always resolve to a single IP address, independently of the client location.

另外也有用窮舉法去查:

We implemented a crawler, which generated DNS names following the pattern outlined above and attempted to resolve those generated domains names to IP addresses. The crawler was fed with lists of known countries, airport codes and ISPs. We curated these lists manually, consulting publicly available lists of countries and airport codes. For the ISPs, we conducted an extensive search on the Internet, including but not limited to the Netflix ISP Speed Index, Wikipedia, various ISP rankings or recommendation lists and discussions in web forums.

所以總共列了 243 個國家地區,3468 個機場編號 (IATA) 與 1728 個 ISP 去搜尋:

In total, our curated lists contained 243 countries, 3,468 airport codes and 1,728 ISP names. In total we tested 16 different network connection types.

不過也有發現到錯字:

The DNS names actually include a typo, the Carrasco International Airport in Uruguay is abbreviated MDV instead of MVD. We used the ISP names found to verify that what was really meant is the airport in Uruguay.

然後找出了:

不過事實上因為 Netflix 已經規劃全面走 HTTPS:「Netflix 對 sendfile() 在 TLS 情況下的加速」、「Protecting Netflix Viewing Privacy at Scale」,所以在 Certificate Transparency%.nflxvideo.net 會有完整的列表,看起來遠超過這篇論文所給的數字:

gslin@home [~/tmp] [14:55/W5] grep isp.nflxvideo nflxvideo.html | sort | uniq | wc
   1311    1311   65827
gslin@home [~/tmp] [14:55/W5] grep ix.nflxvideo nflxvideo.html | sort | uniq | wc
     77      77    3230

來寫信給論文作者提看看好了...

Netflix 對 sendfile() 在 TLS 情況下的加速

Netflix 對於寫了一篇關於隱私保護的技術細節:「Protecting Netflix Viewing Privacy at Scale」。

其中講到 2012 年的 Netflix Open Connect 中的 Open Connect Appliance (OCA,放伺服器到 ISP 機房的計畫) 只有單台伺服器 8Gbps,到現在 2016 可以達到 90Gbps:

As we mentioned in a recent company blog post, since the beginning of the Open Connect program we have significantly increased the efficiency of our OCAs - from delivering 8 Gbps of throughput from a single server in 2012 to over 90 Gbps from a single server in 2016.

早期的 Netflix 走 sendfile() 將影片丟出去,這在 kernel space 處理,所以很有效率:

當影片本身改走 HTTPS (TLS) 時,其中一個遇到的效能問題是導致 sendfile() 無法使用,而必須在 userland space 加密後改走回傳統的 write() 架構,這對於效能影響很大:

所以他們就讓 kernel 支援 AES 系列加密 (包括 AES-GCM 與 AES-CBC),效能的提昇大約是 30%:

Our changes in both the BoringSSL and ISA-L test situations significantly increased both CPU utilization and bandwidth over baseline - increasing performance by up to 30%, depending on the OCA hardware version.

文章開頭也有提到選 AES-GCM 與 AES-CBC 的一些來龍去脈,主要是 AES-GCM 的安全強度比較好,另外考慮到舊的 client 不支援 AES-GCM 時會使用 AES-CBC:

We evaluated available and applicable ciphers and decided to primarily use the Advanced Encryption Standard (AES) cipher in Galois/Counter Mode (GCM), available starting in TLS 1.2. We chose AES-CGM over the Cipher Block Chaining (CBC) method, which comes at a higher computational cost. The AES-GCM cipher algorithm encrypts and authenticates the message simultaneously - as opposed to AES-CBC, which requires an additional pass over the data to generate keyed-hash message authentication code (HMAC). CBC can still be used as a fallback for clients that cannot support the preferred method.

另外 OCA 機器本身也都夠新,支援 AES-NI 指令集,效能上不是太大的問題:

All revisions of Open Connect Appliances also have Intel CPUs that support AES-NI, the extension to the x86 instruction set designed to improve encryption and decryption performance. We needed to determine the best implementation of AES-GCM with the AES-NI instruction set, so we investigated alternatives to OpenSSL, including BoringSSL and the Intel Intelligent Storage Acceleration Library (ISA-L).

不過在「Netflix Open Connect Appliance Deployment Guide」(26 July 2016 版) 這份文件裡看起來還是用多條 10Gbps 透過 LACP 接上去:

You must be able to provision 2-4 x 10 Gbps ethernet ports in a LACP LAG per OCA. The exact quantity depends on the OCA type.

可能是下一版準備要上 40Gbps 或 100Gbps 的準備...?

HTML 的 preconnect

在「Preconnect」這邊看到 Preconnect 這個功能,目前在 Mozilla FirefoxGoogle Chrome 以及 Opera 都有支援,用法也很簡單:

<link rel="preconnect" href="//example.com">
<link rel="preconnect" href="//cdn.example.com" crossorigin>

感覺如果要用的話,可以先送出 head 的部分,打 flush 讓瀏覽器先收到後再送出其他部分?不過對 MVC 架構來說好像變複雜了,不知道有什麼設計比較好...

另外一個是 Prerender,目前是 IE11+、Google Chrome 以及 Opera 有支援,看起來也頗有趣的...

AWS 北京也支援 Direct Connect 了...

看到 AWS 官方的公告:「Announcing AWS Direct Connect for the China (Beijing) Region」。

這代表可以拉線路進 AWS 在北京的機房直連,另外可以參考「APN Partners Supporting AWS Direct Connect」的表格,可以知道要找哪些人,目前北京區看起來只有 Sinnet

Instagram 從 AWS 搬到 Facebook 機房

InstagramInstagram Engineering Blog 上宣佈的消息:「Migrating From AWS to FB」。

整個 migration 的過程是採取不停機轉移,所以 effort 比直接停機轉移高很多:

The main blocker to this easy migration was that Facebook’s private IP space conflicts with that of EC2. We had but one route: migrate to Amazon’s Virtual Private Cloud (VPC) first, followed by a subsequent migration to Facebook using Amazon Direct Connect. Amazon’s VPC offered the addressing flexibility necessary to avoid conflicts with Facebook’s private network.

先把整個系統轉移到 Amazon VPC 裡,然後再拉 AWS Direct Connect 串起來,接下來才是慢慢把 instance 轉移到 Facebook 的機房內。

中間也有一些工作:

To provide portability for our provisioning tools, all of the Instagram-specific software now runs inside of a Linux Container (LXC) on the servers in Facebook’s data centers.

所以已經導入 LXC 了...

Netflix 與 Comcast 的恩怨

其實就是商業公司之間的勾心鬥角,在包裝後搬到檯面上 :p

Netflix 在美國固網裡吃的流量比 YouTube 還多,可想而知當然就變成各 ISP 找麻煩的對象...


出自 Sandvine 的「Global Internet Phenomena Report 2H 2013」。

Netflix 有多種方式將影片傳遞給使用者。除了早期自建機房外,後來跟不少 CDN 有業務往來 (包括了 AkamaiLimelightLevel3),另外也有 Netflix Open Connect Content Delivery Network 計畫,直接在 ISP 內部機房放設備提供服務。

使用 CDN 的作法成本太高,而 ISP 又不一定會接受 Open Connect 方案 (因為不一定收的到錢),在這種情況下,如果走 transit 線路的速度通常都不會太好。而 Netflix 與 Comcast 之間的狀況就是如此:

在付給 Comcast 錢後速度都都解決了...

除了付錢解決外,上個禮拜 Netflix 就丟出一篇說明的文章發難了:「The Case Against ISP Tolls」,這篇文章除了提到上面的事情外,另外還極力反對 Comcast 與 Time Warner Cable 的併購案 XDDD

然後最近又炒熱的網路中立問題,看起來也這件案子應該會很熱鬧 XDDD