Home » 2018 » February (Page 3)

不同性質的應用程式對 KPTI (Meltdown 修正) 的效能影響

NetflixBrendan Gregg 整理了他測試 KPTI 對效能的影響:「KPTI/KAISER Meltdown Initial Performance Regressions」。

與其他人只是概括的測試,他主要是想要針對可量測的數字對應出可能的 overhead,這樣一來還沒上 patch 的人就可以利用這些量測數字猜測可能的效能衝擊。

他把結論放在前面:

To understand the KPTI overhead, there are at least five factors at play. In summary:

  • Syscall rate: there are overheads relative to the syscall rate, although high rates are needed for this to be noticable. At 50k syscalls/sec per CPU the overhead may be 2%, and climbs as the syscall rate increases. At my employer (Netflix), high rates are unusual in cloud, with some exceptions (databases).
  • Context switches: these add overheads similar to the syscall rate, and I think the context switch rate can simply be added to the syscall rate for the following estimations.
  • Page fault rate: adds a little more overhead as well, for high rates.
  • Working set size (hot data): more than 10 Mbytes will cost additional overhead due to TLB flushing. This can turn a 1% overhead (syscall cycles alone) into a 7% overhead. This overhead can be reduced by A) pcid, available in Linux 4.14, and B) Huge pages.
  • Cache access pattern: the overheads are exacerbated by certain access patterns that switch from caching well to caching a little less well. Worst case, this can add an additional 10% overhead, taking (say) the 7% overhead to 17%.

重點在於給了量測的方式,以第一個 Syscall rate 來說好了,他用 sudo perf stat -e raw_syscalls:sys_enter -a -I 1000 測試而得到程式的 syscall 數量,然後得到下面的表格,其中 X 軸是每秒千次呼叫數,Y 軸是效能損失:

用這樣的方式提供給整個組織 (i.e. Netflix) 內評估衝擊。

AWS 公開了在台北的 Direct Connect 接口

也是個很久前就聽到傳言的消息... AWS 剛剛公佈了在台北的 Direct Connect 接口,用戶可以在台北內租用線路進機房就連上 Direct Connect:「New AWS Direct Connect sites land in Paris and Taipei」。

AWS Direct Connect also launched its first site in Taiwan at Chief Telecom LY, Taipei. In the Management Console, Taipei is located in the Asia Pacific (Tokyo) Region. With global access enabled for AWS Direct Connect, these sites can reach AWS resources in any global AWS region using global public VIFs and Direct Connect Gateway.

以往需要透過像 GCX 這樣的公司租用國際頻寬,再從 GCX 在台北的機房拉到自家機房 (通常是市內專線),現在只需要在台北對接的這段就可以了。

台北的接點在 AWS 上寫的是 Chief Telecom LY, Taipei, Taiwan,查了一下是是方電訊的「是方麗源大樓 (台北市內湖區陽光街250號)」,也就是之前有發生過火災而大斷線的那棟。對接的 AWS Home Region 是 Asia Pacific (Tokyo),所以使用 ap-northeast-1 的人可以規劃...

AWS 推出 AWS Instance Scheduler,定時幫你開關機的服務...

第一眼在「Introducing the AWS Instance Scheduler」看到「AWS Instance Scheduler」的描述時,跟之前推出的 Scheduled Reserved Instances 搞混...

The AWS Instance Scheduler is a solution that enables customers to easily configure custom start and stop schedules for their Amazon EC2 and Amazon RDS instances. The solution is easy to deploy and can help reduce operational costs for both development and production environments. Customers who use this solution to run instances during regular business hours can save up to 70% compared to running those instances 24 hours a day.

以這張圖來說就更清楚,AWS Instance Scheduler 就是指定時間,定時幫你開關機的服務:

而我搞混的 Scheduled Reserved Instances 是買某個時段的 RI,是作帳議題。不過兩個看起來就很適合搭在一起用...

歐盟通過終結日光節約時間

看到歐盟通過終結日光節約時間的新聞:「Latest: European Parliament approves proposal to end bi-annual clock change」。

Fine Gael MEP Sean Kelly, who has been campaigning for the change, said: "I'm very pleased that after years of discussions at Committee level in the European Parliament, of which I'm the only Irish member, that out proposal was debated and voted on today in Parliament, and that Parliament accepted our proposal to ask the European Commission to come forward with a recommendation that we would end the bi-annual clock change."

其中藍色是目前還有在實施的地區,其他都是已經終止的:


取自「File:DaylightSaving-World-Subdivisions.png

主要是因為日光節約時間對於現代社會的好處愈來愈少的關係吧... 早期在歐美國家很盛行,現在歐洲決定廢止這個制度,應該會讓美國再次討論起來。

Amazon Route 53 的 Auto Naming API 可以指到 CNAME 位置了

Amazon Route 53 的 Auto Naming API 可以拿來跑 Service Discovery (參考先前的「用 Amazon Route 53 做 Service Discovery」這篇),當時是 A/AAAA/SRV record,現在則可以註冊 CNAME 了:「Amazon Route 53 Auto Naming Announces Support for CNAME Record Type and Alias to ELB」。

最直接的影響就是 ELB 的部份了,透過 ELB 處理前端的話,覆載平衡以及數量限制的問題就會減輕很多 (之前是靠 Round-robin DNS 打散,而且限制一次最多回應五個 record):

Beginning today, you can use the Amazon Route 53 Auto Naming APIs to create CNAME records when you register instances of your microservices, and your microservices can discover the CNAMEs by querying DNS for the service name. Additionally, you can use the Amazon Route 53 Auto Naming APIs to create Route 53 alias records that route traffic to Amazon Elastic Load Balancers (ELBs).

Amazon Aurora (MySQL) 推出相容於 MySQL 5.7 的版本

Amazon Aurora (MySQL) 推出相容於 MySQL 5.7 的版本了:「Amazon Aurora is Compatible with MySQL 5.7」。

不過網站上的介紹還沒更新:

Amazon Aurora is a relational database service that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. The MySQL-compatible edition of Aurora delivers up to 5X the throughput of standard MySQL running on the same hardware, and is designed to be compatible with MySQL 5.6, enabling existing MySQL applications and tools to run without requiring modification.

5.7 其中一個賣點在於支援 Spatial index (透過 R-tree),不過 AWS 的版本看起來是自己用 B-tree 加上 Z-order curve 實做:「Amazon Aurora under the hood: indexing geospatial data using Z-order curves」。

我覺得看看就好啦,拿 244GB RAM 的 r3.8xlarge 跑 1GB 的 data set,當大家是傻逼嗎...

Percona Server 引入 MyRocks

看到「MyRocks Engine: Things to Know Before You Start」這篇,才發現原來一月的時候 Percona Server 就已經將 MyRocks GA (General Availability) 了:「Percona Server for MySQL 5.7.20-19 Is Now Available」。

New Features:
Now MyRocks Storage Engine has General Availability status.

在二月這篇文章裡面有提到一些重點,像是安裝方式:

Now if you use Percona repositories, you can simply install MyRocks plugin and enable it with ps-admin --enable-rocksdb.

另外文章裡也提到了重要的差異 (在「What other differences should you be aware of?」這段),像是他並不是每個 table 都一個檔案,而是像早期 InnoDB 的作法,整個一包:

Let’s look at the directory layout. Right now, all tables and all databases are stored in a hidden .rocksdb directory inside mysqldir. The name and location can be changed, but still all tables from all databases are stored in just a series of .sst files. There is no per-table / per-database separation.

另外提到可以看到 Engine 的代碼是 ROCKSDB (從 ENGINE=ROCKSDB 那段看到的)。然後是 Isolation level 的支援度,只有 READ-COMMITTEDSERIALIZABLE,沒有 REPEATABLE-READ

Keep in mind that at this time MyRocks supports only READ-COMMITTED and SERIALIZABLE isolation levels. There is no REPEATABLE-READ isolation level and no gap locking like in InnoDB. In theory, RocksDB should support SNAPSHOT isolation level. However, there is no notion of SNAPSHOT isolation in MySQL so we have not implemented the special syntax to support this level. Please let us know if you would be interested in this.

然後 bulk load 在資料量超過記憶體大小時會有已知的 crash 問題:

For bulk loads, you may face problems trying to load large amounts of data into MyRocks (and unfortunately this might be the very first operation when you start playing with MyRocks as you try to LOAD DATA, INSERT INTO myrocks_table SELECT * FROM innodb_table or ALTER TABLE innodb_table ENGINE=ROCKSDB). If your table is big enough and you do not have enough memory, RocksDB crashes. As a workaround, you should set rocksdb_bulk_load=1 for the session where you load data.

然後目前沒有像 XtraBackup 的工具可以用,現階段如果要備份的話得透過傳統的方式來做 (mysqldump 或是 filesystem snapshot):

Right now there is no hot backup software like Percona XtraBackup to perform a hot backup of MyRocks tables (we are looking into this). At this time you can use mysqldump for logical backups, or use filesystem-level snapshots like LVM or ZFS.

想來找機會測試看看兩者差異...

Archives