Home » Posts tagged "cluster"

Amazon Elasticsearch 支援 I3 instance (i.e. 1.5 PB Disk) 了

Amazon Elasticsearch 支援 I3 instance 了:「Run Petabyte-Scale Clusters on Amazon Elasticsearch Service Using I3 instances」。

Amazon Elasticsearch Service now supports I3 instances, allowing you to store up to 1.5 petabytes of data in a single Elasticsearch cluster for large log analytics workloads.

i3.16xlarge 單台是 15.2 TB 的硬碟空間,100 台就會是 1.5 PB,不知道跑起來會多慢 XDDD

Amazon Elasticsearch Service – Amazon Web Services (AWS) | FAQs 這邊還沒修正 XD:

You can request a service limit increase up to 100 instances per domain by creating a case with the AWS Support Center. With 100 instances, you can allocate about 150 TB of EBS storage to a single domain.

Amazon EKS 與 AWS Fargate

在今年的 AWS re:Invent 2017 上宣佈 Amazon ECS 也支援 Kubernetes,也就是 Amazon EKS:「Amazon Elastic Container Service for Kubernetes」,一個用的人夠多就支援的概念...

目前這個服務還在 Preview,所以要申請才能用:

Amazon EKS is available in Preview. We look forward to hearing your feedback.

另外一個在 AWS re:Invent 2017 上宣佈的是 AWS Fargate,讓你連 Amazon ECS 或是 Amazon EKS 都不用管的服務,直接按照 container 的大小收費:「Introducing AWS Fargate – Run Containers without Managing Infrastructure」、「AWS Fargate: A Product Overview」。

第一個有疑慮的點是,是否會跟其他人共用相同的 host,也就是 isolation 的程度。這點在 AWS 的人在 Hacker News 上的這邊有回覆,在不同的 cluster 上不會使用同樣的底層:

NathanKP 4 days ago [-]
Fargate isolation is at the cluster level. Apps running in the same cluster may share the underlying infrastructure, apps running in different clusters won't.

另外也提到每個 cluster 都是使用者自己產生的:

NathanKP 3 days ago [-]
A customer creates a cluster on their account. You as a customer can create one or more Fargate clusters on your account to launch your containers in.

不是很正面的回覆,而且不是在官方的 forum 回的,安全性就要大家自己判斷了...

另外也有有提到與 Amazon EC2 相比,價錢當然會比較貴,但可以預期會降低 engineer 的時間成本:

NathanKP 4 days ago [-]
AWS employee here. Just want to say that we actually had a typo in the per second pricing on launch. The actual pricing is:
$0.0506 per CPU per hour
$0.0127 per GB of memory per hour
Fargate is definitely more expensive than running and operating an EC2 instance yourself, but for many companies the amount that is saved by needing to spend less engineer time on devops will make it worth it right now, and as we iterate I expect this balance to continue to tip. AWS has dropped prices more than 60 times since we started out.

目前只能接 Amazon ECS,預定 2018 可以接 Amazon EKS:

I will tell you that we plan to support launching containers on Fargate using Amazon EKS in 2018.

而目前這個版本 (可以接 Amazon ECS 的版本) 在 us-east-1 已經開放了:

Fargate is available today in the US East (Northern Virginia) region.

Percona 分析在 AWS 上跑 Percona XtraDB Cluster 的效能 (I/O bound)

Percona 的人分析了在 Amazon EC2 上跑 Percona XtraDB Cluster (PXC) 效能 (I/O bound):「Best Practices for Percona XtraDB Cluster on AWS」。


直接跳到結論的地方。如果資料可以掉,用 i3 本地 storage 的效能是最好的,如果要資料不能掉,用 EBS 的 Provisioned IOPS SSD (io1) 的效能會比 General Purpose (gp2) 好很多。

另外 instance type 的選擇上,避免用 {i3,r4}.large,因為測試出來發現 {i3,r4}.xlarge 的效能好不只一倍。

不過 Aurora 的 Multi-master 已經在 Preview 了啊,如果 Percona 的人拿到帳號的話,應該會有單位成本的效能比較可以看...

Amazon EC2 可以設定 Spread Placement Group 要求打散機器

Amazon EC2 推出的新設計 Spread Placement Group,用來打散 instance:「Introducing Spread Placement Groups for Amazon EC2」。

本來的 Cluster Placement Group 是將機器集中在一起,對於 latency 極為敏感的應用會有幫助 (像是下面有提到 HPC 應用);這次推出的 Spread Placement Group 則是要求跑在不同機器上,打散降低風險:

Spread placement groups help reduce the likelihood of failures within clusters or groups of instances. Amazon EC2 has had cluster placement groups, which enable applications to get the low-latency network performance necessary for tightly-coupled node-to-node communication typical of many HPC applications. Now with spread placement groups, member instances will be placed on distinct hardware, reducing the impact of hardware failures on your applications.


Spread placement groups are available in all AWS regions. To get started, visit the AWS Management Console, AWS Command Line Interface (CLI), and AWS SDKs.

Amazon Aurora 的 Serverless 與 Multi-master

Amazon Aurora 推出了兩包玩意,第一包是 Serverless,讓需要人介入的情況更少:「In The Works – Amazon Aurora Serverless」。

在 Serverless 的第一個重點是支援以秒計費:

Today we are launching a preview (sign up now) of Amazon Aurora Serverless. Designed for workloads that are highly variable and subject to rapid change, this new configuration allows you to pay for the database resources you use, on a second-by-second basis.

然後是極為快速的 auto-scaling:

The endpoint is a simple proxy that routes your queries to a rapidly scaled fleet of database resources. This allows your connections to remain intact even as scaling operations take place behind the scenes. Scaling is rapid, with new resources coming online within 5 seconds

這兩個組合起來,讓使用端可以除了在 Amazon EC2 上可以快速 scale 外,後端的資料庫也能 scale 了...

第二個是 Multi-master 架構:「Sign Up for the Preview of Amazon Aurora Multi-Master」。

Amazon Aurora Multi-Master allows you to create multiple read/write master instances across multiple Availability Zones. This enables applications to read and write data to multiple database instances in a cluster, just as you can read across Read Replicas today.

(話說我一直都誤以為 Aurora 是 R/W master...)

Anyway,這個功能不知道怎麼疊上去的... 不笑得會不會有嚴重的 distributed lock issue,反而推薦大家平常都寫到同一台 (像是 PXC 就會這樣)。

從 Cassandra 到 ScyllaDB 的轉移方式好像跟以前不太一樣了...

在「New Docs: Four Phases to Migrate from Apache Cassandra to Scylla」這邊看到 ScyllaDB 官方提供 Cassandra 轉移到 ScyllaDB 的說明,跟以前好像差蠻多的...

以前 ScyllaDB 可以直接加入到 Cassandra 的 cluster (一時間沒找到資料,但在「can not add node with cassandra ami · Issue #107 · scylladb/scylla-cluster-tests」可以看到當時的痕跡),現在給的方法是在資料庫不相容時的轉移方式 (像是從 MySQL 轉換到 PostgreSQL 這種),是暗示已經沒辦法這樣做了嗎?

不過從 GitHub 上的 wiki page 看起來,底層資料與 protocol 應該還是相容的,才能做直接複製資料的 offline migration:「Migrating Cassandra data to Scylla」。

也有可能這篇只是寫手隨意寫的文章,沒有把 ScyllaDB 的優勢展現出來...

About John Hammink
John Hammink is a writer and content creator at ScyllaDB. With more than 20 years in technology, he's also a touring/studio musician, digital artist and speaker.

Reddit 在處理 Page View 的方式

Reddit 說明了他們如何處理 pageview:「View Counting at Reddit」。

以 Reddit 的規模有提到兩個重點,第一個在善用 RedisHyperLogLog 這個資料結構,當量大的時候其實可以允許有微小的誤差:

The amount of memory varies per implementation, but in the case of this implementation, we could count over 1 million IDs using just 12 kilobytes of space, which would be 0.15% of the original space usage!

維基百科上有說明當資料量在 109 這個等級時,用 1.5KB 的記憶體只有 2% 的誤差值:

The HyperLogLog algorithm is able to estimate cardinalities of > 109 with a typical error rate of 2%, using 1.5 kB of memory.

第二個則是寫入允許短時間的誤差 (pageview 不會即時反應),透過批次處理降低對 Cassandra cluster 的負荷:

Writes to Cassandra are batched in 10-second groups per post in order to avoid overloading the cluster.

可以注意到把 Redis 當作 cache 層而非 storage 層。

主要原因應該跟 Redis 定位是 data structure server 而非 data structure storage 有關 (可以從對 Durability 的作法看出來),而使用 Cassandra 存 key-value 非常容易 scale,但讀取很慢。剛好兩個相輔相成。

Percona XtraDB Cluster 解釋最近的版本為什麼可以再度提昇效能

Percona XtraDB Cluster 最近的一個版本宣稱效能大幅提昇,這篇試著解釋原因:「How We Made Percona XtraDB Cluster Scale」。


With these three main optimizations, and some small tweaks, we have tuned Percona XtraDB Cluster to scale better and made it fast enough for the growing demands of your applications. All of this is available with the recently released Percona XtraDB Cluster 5.7.17-29.20. Give it a try and watch your application scale in a multi-master environment, making Percona XtraDB Cluster the best fit for your HA workloads.