Home » Computer » Network » Archive by category "CDN" (Page 4)

Cloudflare 推出 SSL 的 SaaS 服務

Cloudflare 把之前需要人力介入的操作,包成 API 直接提供給大家用:「Introducing SSL for SaaS」。

要解決的是「用戶自己的 domain 掛到我的服務上,而且需要 HTTPS」,而 Cloudflare 提供 API 把上面三部跑完,下面的就是 Cloudflare 自己處理 (包括取得用戶的同意):

Cloudbleed:Cloudflare 這次的安全問題

Cloudflare 把完整的時間軸與影響範圍都列出來了:「Incident report on memory leak caused by Cloudflare parser bug」。

出自於 2/18 時 GoogleTavis Ormandy 直接在 Twitter 上找 Cloudflare 的人:

Google 的 Project Zero 上的資料:「cloudflare: Cloudflare Reverse Proxies are Dumping Uninitialized Memory」。

起因在於 bug 造成有時候會送出不應該送的東西,可能包含了敏感資料:

It turned out that in some unusual circumstances, which I’ll detail below, our edge servers were running past the end of a buffer and returning memory that contained private information such as HTTP cookies, authentication tokens, HTTP POST bodies, and other sensitive data.

不過這邊不包括 SSL 的 key,主要是因為隔離開了:

For the avoidance of doubt, Cloudflare customer SSL private keys were not leaked. Cloudflare has always terminated SSL connections through an isolated instance of NGINX that was not affected by this bug.

不過由於這些敏感資料甚至還被 Google 收進 search engine,算是相當的嚴重,所以不只是 Cloudflare 得修好這個問題,還得跟眾多的 search engine 合作將這些資料移除:

Because of the seriousness of such a bug, a cross-functional team from software engineering, infosec and operations formed in San Francisco and London to fully understand the underlying cause, to understand the effect of the memory leakage, and to work with Google and other search engines to remove any cached HTTP responses.

bug 影響的時間從 2016/09/22 開始:

2016-09-22 Automatic HTTP Rewrites enabled
2017-01-30 Server-Side Excludes migrated to new parser
2017-02-13 Email Obfuscation partially migrated to new parser
2017-02-18 Google reports problem to Cloudflare and leak is stopped

而以 2/13 到 2/18 的流量反推估算,大約是 0.00003% 的 request 會可能產生這樣的問題:

The greatest period of impact was from February 13 and February 18 with around 1 in every 3,300,000 HTTP requests through Cloudflare potentially resulting in memory leakage (that’s about 0.00003% of requests).

不過不得不說 Tavis Ormandy 真的很硬,在沒有 source code 以及 Cloudflare 幫助的情況下直接打出可重製的步驟:

I worked with cloudflare over the weekend to help clean up where I could. I've verified that the original reproduction steps I sent cloudflare no longer work.


2017-02-18 0011 Tweet from Tavis Ormandy asking for Cloudflare contact information
2017-02-18 0032 Cloudflare receives details of bug from Google
2017-02-18 0040 Cross functional team assembles in San Francisco
2017-02-18 0119 Email Obfuscation disabled worldwide
2017-02-18 0122 London team joins
2017-02-18 0424 Automatic HTTPS Rewrites disabled worldwide
2017-02-18 0722 Patch implementing kill switch for cf-html parser deployed worldwide
2017-02-20 2159 SAFE_CHAR fix deployed globally
2017-02-21 1803 Automatic HTTPS Rewrites, Server-Side Excludes and Email Obfuscation re-enabled worldwide

另外在「List of Sites possibly affected by Cloudflare's #Cloudbleed HTTPS Traffic Leak」這邊有人整理出受影響的大站台有哪些 (小站台就沒列上去了)。

Cloudflare 因為閏秒炸掉...

Cloudflare 這次閏秒炸掉:「How and why the leap second affected Cloudflare DNS」,影響範圍包括了 DNS query 與 HTTP request:

At peak approximately 0.2% of DNS queries to Cloudflare were affected and less than 1% of all HTTP requests to Cloudflare encountered an error.

主要的原因在於 Gotime.Now() 不保證遞增:

RRDNS is written in Go and uses Go’s time.Now() function to get the time. Unfortunately, this function does not guarantee monotonicity. Go currently doesn’t offer a monotonic time source (see issue 12914 for discussion).


In this patch we allowed RRDNS to forget about current upstream performance, and let it normalize again if time skipped backwards.

應該是因為 Cloudflare 這段程式還沒遇過 leap second 造成的...

CloudFront 的 Regional Edge Caches

Amazon CloudFront 前陣子宣佈了 two-tier 架構:「Announcing Regional Edge Caches for Amazon CloudFront」。

一般的 CDN 是 edge 收到後就打到 origin,這會使得 origin 的量比較大。而 two-tier 架構則是在中間疊一層降低對 origin 的量。這種架構對於直播時的 pattern 幫助很大:由於量很大,會需要用大量的 edge server 支撐,而 edge server 一多就對 origin 產生巨大的壓力。

一般直播 95% 的 hitrate 表示外面 20Gbps 的流量就會造成 origin 1Gbps 的流量,通常用 two-tier 可以拉到 98%+ (CDN vendor 有調整過可以到 99%+)。

這種技術在 Akamai 叫 Tiered Distribution,在 EdgeCast 叫 SuperPoP,而現在 CloudFront 也推出了,叫做 Regional Edge Cache:

The nine new Regional Edge Cache locations are in Northern Virginia, Oregon, São Paulo, Frankfurt, Singapore, Seoul, Tokyo, Mumbai, and Sydney.

edge 會先到這幾個 regional edge 再到 origin:

These locations sit between your origin webserver and the 68 global edge locations that serve traffic directly to your viewers.


Regional Edge Caches are turned on by default for your CloudFront distributions; you do not need to make any changes to your distributions to take advantage of this feature. There are also no additional charges to use this feature.

不過這個架構對於 latency 應該不會太好,沒得關閉有點奇怪...

在 CloudFront 的 edge 上跑 Lambda

所以 Amazon CloudFront 讓使用者在 edge 上跑程式了 (雖然目前是 limited preview):「Lambda@Edge – Preview」。

分成 Viewer Request、Origin Request、Origin Response 以及 Viewer Response 四個階段可以插入修改。另外有些限制:

Because your JavaScript code will be part of the request/response path, it must be lean, mean, and self-contained. It cannot make calls to other web services and it cannot access other AWS resources. It must run within 128 MB of memory, and complete within 50 ms.

要在 128MB 內搞定,而且不能呼叫其他資源。不過這樣已經可以做很多事了... 基本上就是一台 turing machine 了 :o