Let's Encrypt Supports IDN

Let's Encrypt has announced IDN support in "Introducing Internationalized Domain Name (IDN) Support", which means the range of domains you can request certificates for just got wider:

This means that our users around the world can now get free Let’s Encrypt certificates for domains containing characters outside of the ASCII set, which is built primarily for the English language.
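
For context, an IDN is carried in DNS (and in certificates) in its ASCII "xn--" form. Here is a minimal sketch of that conversion using Python's built-in idna codec, which implements the older IDNA 2003 rules. The hostname is a made-up example, and this is not necessarily how Let's Encrypt or any ACME client does the conversion.

```python
# Sketch: convert between the Unicode and ASCII ("xn--") forms of a hostname.
# Assumption: "bücher.example" is a made-up hostname, and Python's built-in
# "idna" codec (IDNA 2003) is used purely for illustration.

def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII-compatible form of a Unicode hostname."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(hostname: str) -> str:
    """Return the Unicode form of an ASCII-compatible hostname."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_name = to_ascii_hostname("bücher.example")
    print(ascii_name)
    print(to_unicode_hostname(ascii_name))
```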

Under "Upcoming Features", the next step appears to be ECDSA Intermediates:

Let’s Encrypt only signs end-entity certificates with RSA intermediates. We will add the ability to have end-entity certs signed by an ECDSA intermediate.

I wonder what other features will come after that...

Progress on QUIC

I came across news about QUIC in "New Work in Seoul":

The QUIC working group has just been chartered, and will meet for the first time in Seoul. This working group is taking Google’s pre-standardization QUIC protocol that has been deployed in production for several years, and will use it as a starting point to develop a UDP-based, stream-multiplexing, encrypted transport protocol with standardized congestion control, TLS 1.3 by default, a mapping for HTTP/2 semantics over QUIC, and multipath extensions. This is the IETF’s first standardized always-encrypted transport protocol, so careful consideration of applicability and operational capabilities will be key for success.

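The IETF Datatracker now has an entry as well: "QUIC (quic)". The Milestones at the bottom show that the first-phase goal is to settle the core protocol by next February, with further pieces added on top afterwards.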

Playing with ggplot in Python

I ran into ggplot for Python again in "A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair)" and assumed it would be fairly easy to install, but in practice it turned out to be a bit tricky XD

My usual environment is Python 3.5.2 under pyenv. ggplot needs the _tkinter module, which should be built into Python 3... as long as tk-dev was installed before Python was built @_@

So after spending quite a while tracking this down, I installed tk-dev first and then reinstalled Python 3.5.2:

$ sudo apt-get install tk-dev
$ pyenv install -f 3.5.2
$ pip install -U ggplot
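
To confirm the rebuild actually picked up Tk, a quick check like the following can help. It is only a sketch: the helper name has_tkinter_extension is made up for illustration, and it only tests whether the interpreter can locate the _tkinter extension module.

```python
# Sketch: check whether this Python build includes the _tkinter extension.
# Assumption: has_tkinter_extension is a hypothetical helper name, not part
# of ggplot or pyenv.
import importlib.util


def has_tkinter_extension() -> bool:
    """Return True if the _tkinter extension module can be located."""
    return importlib.util.find_spec("_tkinter") is not None


if __name__ == "__main__":
    if has_tkinter_extension():
        print("_tkinter found")
    else:
        print("_tkinter missing: install tk-dev, then rebuild Python")
```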

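After installing, I found that the usage commonly suggested online did not seem to work. After more poking around, it turns out the API has become object-oriented, so you now save the file like this: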

p = ggplot(...) + ...
p.save('a.png')

Also, the data has to be supplied as a DataFrame. In short, there are quite a few fiddly details you need to learn before you know how to use it @_@

Anyway, the code is in population-taiwan.py, and the population data was pulled from the Chinese Wikipedia article 「臺灣人口普查」 (Taiwan census). The resulting image looks like this:

Just a small first try... By the way, theme_xkcd() looks pretty good XDDD

Pinterest's Work on InnoDB Compression

Pinterest stores all kinds of data in InnoDB and uses the InnoDB Compression feature. They put a lot of effort into working with Percona to improve InnoDB Compression performance: "Evolving MySQL Compression - Part 1".

The article is fairly long; the key point is that they store a large amount of JSON in MySQL:

A Pin is stored as a 1.2 KB JSON blob in sharded MySQL databases.

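They found that the predefined dictionary supported by newer zlib versions improves compression (savings go from ~50% to ~66%). Because the dictionary pre-fills ("warms up") the lookback window with common strings, the gain appeared to come at relatively little cost: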

Zlib version 1.2.7.1 was released in early 2013 and added the ability to use a predefined “dictionary” to prefill the lookback window for LZ77. This seemed promising since we could “warm up” the lookback window with field names and other common strings. We ran a few tests using the Python Zlib library with a naive predefined dictionary consisting of an arbitrary Pin JSON blob. The compression savings increased from ~50% to ~66% at what appeared to be relatively little cost.
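
The experiment described in the quote can be roughly reproduced with Python's zlib module, which accepts a predefined dictionary through the zdict parameter. This is only a sketch: the field names and values below are invented stand-ins for a Pin blob, and the "naive dictionary" is simply another blob of the same shape.

```python
# Sketch: compress a small JSON blob with and without a predefined dictionary.
# Assumption: the JSON fields below are made up; they are not Pinterest's schema.
import json
import zlib


def compress(data: bytes, zdict: bytes = b"") -> bytes:
    """Deflate data, optionally pre-filling the lookback window with zdict."""
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(data) + comp.flush()


def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate blob; the same zdict used for compression must be supplied."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


def make_blob(pin_id: int, text: str) -> bytes:
    """Build a hypothetical pin-like JSON blob."""
    record = {
        "id": pin_id,
        "board_id": pin_id * 7,
        "description": text,
        "link": "http://example.com/item/%d" % pin_id,
        "image_url": "http://example.com/image/%d.png" % pin_id,
    }
    return json.dumps(record, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    dictionary = make_blob(1, "an arbitrary sample blob")
    data = make_blob(2, "the blob we actually want to store")
    plain = compress(data)
    primed = compress(data, dictionary)
    print(len(data), len(plain), len(primed))
    assert decompress(primed, dictionary) == data
```

The dictionary has to be available at read time as well, which is presumably why it becomes part of the column definition in the feature described below.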

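They also ran a read-only benchmark (which is the case that matters most, after all). The image is a bit blurry, but you can tell the y-axis is Queries/sec. The x-axis is explained in the accompanying text: yellow is TokuDB, red is the original InnoDB Compression, and the remaining lines are the results for different dictionaries: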

Below is a graph from our presentation which showed a read-only version of our production workload at concurrency of 256, 128, 32, 16, 8, 4 and 1 clients. TokuDB is in yellow, InnoDB page compression is in red and the other lines are column compression with a variety of dictionaries.

Overall throughput is much higher than before, and the gap becomes especially large when the number of concurrent queries is high.

This feature will be included in a future Percona release, which should help anyone who stores JSON or XML in MySQL:

We worked with Percona to create a specification for column compression with an optional predefined dictionary and then contracted with Percona to build the feature.

nginx 1.10.2

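Earlier, in 「谈谈 Nginx 的 HTTP/2 POST Bug」 (on nginx's HTTP/2 POST bug), I mentioned an nginx bug: the connection fails when the first request on an HTTP/2 connection is a POST. This was fixed in mainline 1.11.0, but the stable branch never got a new release with the fix backported.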

I just saw that 1.10.2 fixes this by bringing http2_body_preread_size over from mainline: "[nginx-announce] nginx-1.10.2".

*) Change: HTTP/2 clients can now start sending request body
   immediately; the "http2_body_preread_size" directive controls size of
   the buffer used before nginx will start reading client request body.

I also just noticed that Ondřej Surý maintains PPAs for both nginx (stable) and nginx-mainline (mainline), so switching straight to mainline might be worth considering too? That would be another option...

Docker's Permission Controls

The Red Hat Enterprise Linux Blog has a write-up on the permission controls Docker currently supports: "Secure Your Containers with this One Weird Trick". There are currently around 38 capabilities that can be controlled:

Originally the kernel allocated a 32-bit bitmask to define these capabilities. A few years ago it was expanded to 64. There are currently around 38 capabilities defined.
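
To make the bitmask idea concrete, here is a small sketch that decodes a 64-bit capability mask into names. The table lists only a handful of capabilities, with bit positions as I recall them from linux/capability.h; it is an illustration, not a complete or authoritative list, and decode_capabilities is a made-up helper name.

```python
# Sketch: decode a Linux capability bitmask into capability names.
# Assumption: only a few capabilities are listed, and the bit positions
# should be double-checked against linux/capability.h before relying on them.
CAPABILITY_BITS = {
    0: "CAP_CHOWN",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
}


def decode_capabilities(mask: int) -> list:
    """Return the names of the bits set in mask, lowest bit first."""
    names = []
    for bit in range(64):
        if mask & (1 << bit):
            # Bits missing from the partial table get a placeholder name.
            names.append(CAPABILITY_BITS.get(bit, "bit_%d" % bit))
    return names


if __name__ == "__main__":
    mask = (1 << 0) | (1 << 13)
    print(decode_capabilities(mask))
```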

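This is quite handy for running certain applications. For example, in the earlier post 「用 Docker 跑 Skype 講電話」 (making Skype calls in Docker), some privileges could be restricted further :o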

AWS Opens a Second US East Region

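As AWS had planned, the second US East region (us-east-2, in Ohio) is now open: "Now Open – AWS US East (Ohio) Region". I took a look at EC2 pricing and it is at the same level as us-east-1; other services should be about the same...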

Also, because it is in a different state (while still being very close to us-east-1), AWS recommends it for off-site, cross-region setups:

With just 12 ms of round-trip latency between US East (Ohio) and US East (Northern Virginia), you can make good use of unique AWS features such as S3 Cross-Region Replication, Cross-Region Read Replicas for Amazon Aurora, Cross-Region Read Replicas for MySQL, and Cross-Region Read Replicas for PostgreSQL.

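One notable detail is that data transfer between us-east-{1,2} is billed at the Inter-AZ rate rather than the cross-region rate. Presumably they want to give people an incentive to move more things over, since us-east-1 is just too big and its "stability" is famous XDDD: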

Data transfer between the two Regions is priced at the Inter-AZ price ($0.01 per GB), making your cross-region use cases even more economical.

A PPA for ttf-fireflysung on Ubuntu

At the moment, 1.3.0 can only be found in "Index of /fonts/FireFly" (dated 2012/02/04) and "Index of /apt/firefly-font" (dated 2015/03/20), while chinese/ttf-fireflyttf in FreeBSD Ports carries one further revision (1.3.0p1) that fixes some glyphs:

Characters with the 『角』 radical or component were changed to the Ministry of Education standard form.
Edward G.J. Lee

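I looked for a ready-made package for Ubuntu, and since I did not see one with the corresponding version, I decided to build one myself: "PPA for ttf-fireflysung". It also supports 16.10 Yakkety Yak...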

This version, derived from the font originally donated by Arphic (文鼎), is still quite nice to use...

MySQL 8.0 Will Implement "Real" Descending Indexes

In "MySQL 8.0 Labs – Descending Indexes in MySQL", I saw that MySQL plans to implement real descending indexes in 8.0. In 5.7 and earlier, "14.1.14 CREATE INDEX Syntax" shows that this keyword is f~a~k~e XDDD

An index_col_name specification can end with ASC or DESC. These keywords are permitted for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.

So once 8.0 creates an index like a_desc_b_asc (a DESC, b ASC), you can see how performance differs across different ORDER BY clauses (ten million rows):

Some of the speedups make sense, but for some of the results I am not sure what causes them...

Anyway, for the two queries that got slower, he offers a fix that is not really a fix: add the matching indexes XDDD:

If user wants to avoid filesorts for Query 5 and Query 6, he/she can alter the table to add a key (a ASC, b ASC) . Further to this, if the user wants to avoid backward index scans too, he/she can add both ( a ASC, b DESC) and (a DESC, b DESC).
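
The reasoning in the quote can be modelled with plain sorting: an index stored as (a DESC, b ASC) can be read forwards or backwards, and any other ordering needs a separate sort. The sketch below is a toy model only; the helper names and the tiny data set are made up, and it says nothing about how MySQL's optimizer actually chooses a plan.

```python
# Sketch: which ORDER BY clauses can an index on (a DESC, b ASC) serve?
# Assumption: a toy model using list sorting; rows hold distinct integer
# (a, b) pairs, and orders are written as (a_desc, b_desc) flags.

def sort_rows(rows, a_desc, b_desc):
    """Sort rows by a then b, each either ascending or descending."""
    return sorted(
        rows,
        key=lambda r: (-r[0] if a_desc else r[0], -r[1] if b_desc else r[1]),
    )


def scan_for(rows, index, order_by):
    """Say how an index with the given directions could serve order_by."""
    index_rows = sort_rows(rows, *index)
    wanted = sort_rows(rows, *order_by)
    if wanted == index_rows:
        return "forward scan"
    if wanted == index_rows[::-1]:
        return "backward scan"
    return "needs sort"


if __name__ == "__main__":
    rows = [(1, 1), (1, 2), (2, 1), (2, 2)]
    index = (True, False)  # a DESC, b ASC
    for order_by in [(True, False), (False, True), (False, False), (True, True)]:
        print(order_by, scan_for(rows, index, order_by))
```

In this model, only the exact order and its mirror image come for free, which matches the suggestion to add further indexes for the remaining combinations.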

That makes them fast, but it increases the write overhead... XD

Either way, this feature has finally arrived...