
How Instagram solved Cassandra's performance problems

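When it comes to fixing Cassandra's performance problems, ScyllaDB is probably the best-known answer: it rewrote everything in C++ and got a large performance improvement. The Instagram team instead replaced the underlying storage layer with RocksDB (this company really loves its own RocksDB...): "Open-sourcing a 10x reduction in Apache Cassandra tail latency".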

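The main reason is that they found Cassandra's data-handling paths run into JVM GC problems, and that this GC overhead is the main cause of Cassandra's poor performance: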

Apache Cassandra is a distributed database with it’s own LSM tree-based storage engine written in Java. We found that the components in the storage engine, like memtable, compaction, read/write path, etc., created a lot of objects in the Java heap and generated a lot of overhead to JVM.

After the switch, their tests showed a large performance improvement, along with a large drop in GC stalls:

In one of our production clusters, the P99 read latency dropped from 60ms to 20ms. We also observed that the GC stalls on that cluster dropped from 2.5% to 0.3%, which was a 10X reduction!

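Comparing the two approaches: ScyllaDB rewrote everything in C++ (keeping the data structures), which gets rid of the JVM GC problem outright. Rocksandra instead profiled first and replaced only the hot spots (apparently the data-handling code, swapped directly for RocksDB), and abstracted some interfaces along the way... Two different solutions, both of which address the JVM GC problem.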

The migration path from Cassandra to ScyllaDB seems different from before...

In "New Docs: Four Phases to Migrate from Apache Cassandra to Scylla" I saw ScyllaDB's official guide for migrating from Cassandra to ScyllaDB, and it looks quite different from what I remember...

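In the past, a ScyllaDB node could join an existing Cassandra cluster directly (I could not find the documentation offhand, but "can not add node with cassandra ami · Issue #107 · scylladb/scylla-cluster-tests" shows traces of it). The method given now is the kind you use between incompatible databases (like migrating from MySQL to PostgreSQL). Is that a hint that joining directly is no longer possible?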

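That said, judging from the wiki page on GitHub, the underlying data format and protocol should still be compatible, which is what makes an offline migration by copying data directly possible: "Migrating Cassandra data to Scylla".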

It is also possible that this article was just casually written by a content writer and fails to show off ScyllaDB's advantages...

About John Hammink
John Hammink is a writer and content creator at ScyllaDB. With more than 20 years in technology, he's also a touring/studio musician, digital artist and speaker.

CUPS moves from GPLv2 to the Apache License, Version 2.0

CUPS is the printing software used on macOS and many other Unix-like environments.

I saw the relicensing news in "CUPS relicensed to Apache v2"; the official announcement is in "CUPS License Change Coming":

Apple is excited to announce that starting with CUPS 2.3 we will be providing CUPS under the terms of the Apache License, Version 2.0.

As it happens, GPLv2 and the Apache License, Version 2.0 are incompatible with each other, so a jump like this is rather amusing...

Apache's Optionsbleed

Apache has had its own Heartbleed-style incident: "Apache bug leaks contents of server memory for all to see—Patch now", with the original write-up at "Optionsbleed - HTTP OPTIONS method can leak Apache's server memory".

This has been assigned CVE-2017-9798, and the affected versions include:

This affects the Apache HTTP Server through 2.2.34 and 2.4.x through 2.4.27.

The problem is in how OPTIONS requests are handled:

Optionsbleed is a use after free error in Apache HTTP that causes a corrupted Allow header to be constructed in response to HTTP OPTIONS requests. This can leak pieces of arbitrary memory from the server process that may contain secrets. The memory pieces change after multiple requests, so for a vulnerable host an arbitrary number of memory chunks can be leaked.
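As a rough illustration of the description above, here is a minimal Python sketch that sends an HTTP OPTIONS request and looks for malformed entries in the Allow header. The function names and the heuristic are my own assumptions, not part of the advisory: it only flags entries that are empty or contain characters that are not valid in an HTTP method name, so a clean result does not prove a host is unaffected, and since the leaked memory changes between requests a single request may show nothing.

```python
import http.client
import re

# HTTP method names must be "tokens" (RFC 7230). Anything else in an Allow
# header is treated here as a sign of corruption. This is a heuristic only.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def suspicious_allow_entries(allow_header):
    """Return the comma-separated Allow entries that are not valid tokens."""
    entries = [entry.strip() for entry in allow_header.split(",")]
    return [entry for entry in entries if not _TOKEN.fullmatch(entry)]


def fetch_allow_header(host, port=80, path="/", timeout=10):
    """Send one OPTIONS request and return the Allow header (or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.getheader("Allow")
    finally:
        conn.close()
```

To try it against a server you operate, call `fetch_allow_header("your-host.example")` a number of times and pass each non-empty result to `suspicious_allow_entries`; the host name here is a placeholder.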

So... just update @_@

The React patent license issue

After the ASF (Apache Software Foundation) banned Facebook's BSD+PATENTS license across the board ("Apache Foundation bans software under the Facebook BSD+Patents license"), the whole thing started heating up...

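In short, Facebook did this on purpose and has no plans to withdraw this aggressive licensing model. See the official explanation in "Explaining React's license" and one reader's interpretation: "If you’re a startup, you should not use React (reflecting on the BSD + patents license)".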

Opinions inside Facebook are not uniform either. For example, Yarn has no PATENTS file only because people fought for it.


ScyllaDB 2.0 to introduce Materialized Views from Cassandra 3.0+

ScyllaDB's website was redesigned recently... (it feels unfamiliar)

ScyllaDB 2.0 plans to introduce Materialized Views (a feature from Apache Cassandra 3.0+): "Materialized Views preview in Scylla 2.0".

A Materialized View is typically implemented by storing a separate copy of the data, so you can add things like an index on it to make reads faster...
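To make the "separate copy" idea concrete, here is a toy in-memory sketch in Python. It is not how Cassandra or ScyllaDB implement the feature; the class and method names are hypothetical. It only shows the general trade: every write also updates a second copy keyed by another column, so reads by that column become a direct lookup.

```python
class TableWithView:
    """Toy base table plus one materialized view keyed by another column."""

    def __init__(self, view_key):
        self.view_key = view_key
        self.base = {}  # primary key -> row
        self.view = {}  # view key value -> {primary key: copy of row}

    def upsert(self, pk, row):
        old = self.base.get(pk)
        if old is not None:
            # Drop the stale view entry before writing the new one.
            bucket = self.view[old[self.view_key]]
            del bucket[pk]
            if not bucket:
                del self.view[old[self.view_key]]
        self.base[pk] = dict(row)
        # The view keeps its own copy, which is the extra cost paid on write.
        self.view.setdefault(row[self.view_key], {})[pk] = dict(row)

    def by_view_key(self, value):
        """Read through the view: no scan over the base table is needed."""
        return list(self.view.get(value, {}).values())


table = TableWithView(view_key="country")
table.upsert(1, {"name": "alice", "country": "TW"})
table.upsert(2, {"name": "bob", "country": "JP"})
table.upsert(1, {"name": "alice", "country": "JP"})  # row moves within the view
```

Each `upsert` touches both structures, which is why a view speeds up reads at the price of additional write work.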

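But isn't Cassandra's selling point slow reads in exchange for fast writes to begin with? If you need read speed, you can layer a cache on top or use some other approach, so it was a bit odd for Cassandra to build this feature in the first place... XDDD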

Apache Foundation bans software under the Facebook BSD+Patents license

The "RocksDB Integrations" thread discusses the Facebook BSD+Patents License used by RocksDB, which is a Facebook project.

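However, RocksDB has recently been changing its license (from Facebook BSD+Patents to the Apache License, Version 2.0), which removes the restrictions in PATENTS. Anyone who needs the old text can still find it in the old PATENTS file.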

Chris Mattmann has formally issued a decision banning the Facebook BSD+Patents License. (See the full message at the end.)

The discussion also mentions that Facebook put these restrictions in deliberately:

Note also Roy's comment that he has discussed the matter with FB's counsel and the word is that the FB license is intentionally incompatible. It is hard to make the argument that it is compatible after hearing that. Pragmatically speaking, regardless of any semantic shaving being done, having a statement like that from the source of the license is very daunting. If they think it is incompatible, we need to not try to wheedle and convince ourselves it is not.

This license will probably face more challenges from here on...


As some of you may know, recently the Facebook BSD+patents license has been
moved to Category X (https://www.apache.org/legal/resolved#category-x).
Please see LEGAL-303 [1] for a discussion of this. The license is also referred
to as the ROCKSDB license, even though Facebook BSD+patents is its more
industry standard name.

This has impacted some projects, to date based on LEGAL-303
and the detective work of Todd Lipcon:

Samza, Flink, Marmotta, Kafka and Bahir

(perhaps more)

Please take notice of the following policy:

o No new project, sub-project or codebase, which has not
  used Facebook BSD+patents licensed jars (or similar), are allowed to use
  them. In other words, if you haven't been using them, you
  aren't allowed to start. It is Cat-X.

o If you have been using it, and have done so in a *release*,
  you have a temporary exclusion from the Cat-X classification thru
  August 31, 2017. At that point in time, ANY and ALL usage
  of these Facebook BSD+patents licensed artifacts are DISALLOWED. You must
  either find a suitably licensed replacement, or do without.
  There will be NO exceptions.

o Any situation not covered by the above is an implicit
  DISALLOWAL of usage.

Also please note that in the 2nd situation (where a temporary
exclusion has been granted), you MUST ensure that NOTICE explicitly
notifies the end-user that a Facebook BSD+patents licensed artifact exists. They
may not be aware of it up to now, and that MUST be addressed.

If there are any questions, please ask on the legal-discuss@a.o


Chris Mattmann
VP Legal Affairs

[1] https://issues.apache.org/jira/browse/LEGAL-303

ScyllaDB 1.7 now supports Counters

In "Scylla release: version 1.7" I saw that ScyllaDB now supports Counters (though since it is brand new, it is still marked Experimental):

Scylla now supports Counters as a native type. A counter column is a column whose value is a 64-bit signed integer and on which two operations are supported: incrementing and decrementing.

This is actually one of Cassandra's strengths: a data type specialized for counter workloads.
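For readers who have not used counters, here is a small Python sketch that builds the kind of CQL statements involved. The CQL shape (a `counter` column updated with `c = c + n` or `c = c - n`) is standard Cassandra CQL; the helper names, the table name, and the column names are hypothetical, and the helpers assume trusted identifiers since they interpolate them directly into the statement text.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def create_counter_table_cql(table, key_column, counter_column):
    """Build a CREATE TABLE statement with one text key and one counter."""
    return (
        f"CREATE TABLE {table} "
        f"({key_column} text PRIMARY KEY, {counter_column} counter)"
    )


def counter_update_cql(table, key_column, counter_column, delta):
    """Build an increment (delta >= 0) or decrement (delta < 0) statement.

    Counters hold 64-bit signed integers, so the delta is range-checked.
    The key value is left as a `?` bind marker for the driver to fill in.
    """
    if not INT64_MIN <= delta <= INT64_MAX:
        raise ValueError("delta does not fit in a 64-bit signed integer")
    operator = "+" if delta >= 0 else "-"
    return (
        f"UPDATE {table} SET {counter_column} = {counter_column} "
        f"{operator} {abs(delta)} WHERE {key_column} = ?"
    )


increment = counter_update_cql("page_views", "url", "views", 1)
decrement = counter_update_cql("page_views", "url", "views", -3)
```

There is no statement for assigning an absolute value here, which matches the quoted description: incrementing and decrementing are the only two operations.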

OpenSSL to switch to the Apache 2.0 License

OpenSSL recently announced plans to change its license to the Apache License, Version 2.0: "Licensing Update".

The main reason is to be compatible with most existing open source projects:

OpenSSL Re-licensing to Apache License v. 2.0 To Encourage Broader Use with Other FOSS Projects and Products

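But this is really strange. The biggest problem with the old license was its incompatibility with GPLv2, and the planned AL 2.0 is still incompatible with it. What on earth?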