Home » Posts tagged "dynamodb"

Amazon DynamoDB 的 Point-In-Time Recovery

Amazon DynamoDB 在 3/26 發出來的功能,以秒為單位的備份與還原機制:「New – Amazon DynamoDB Continuous Backups and Point-In-Time Recovery (PITR)」。

先打開這個功能:

打開後就會開始記錄,最多可以還原 35 天內的任何一個時間點的資料:

DynamoDB can back up your data with per-second granularity and restore to any single second from the time PITR was enabled up to the prior 35 days.

這時候就算改變資料或是刪除資料,實際上在系統內都是 Copy-on-write 操作,所以需要另外的空間,這部份會另外計價:

Pricing for continuous backups is detailed on the DynamoDB Pricing Pages. Pricing varies by region and is based on the current size of the table and indexes. For example, in US East (N. Virginia) you pay $0.20 per GB based on the size of the data and all local secondary indexes.

有這樣的功能通常是一開始設計時就有考慮 (讓底層的資料結構可以很方便的達成這樣的效果),現在只是把功能實作出來... 像 MySQL 之類的軟體就沒辦法弄成這樣 XDDD

最後有提到支援的地區,是用條列的而不是說所有有 Amazon DynamoDB 的區域都支援:

PITR is available in the US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (Sao Paulo) Regions starting today.

比對一下,應該是巴黎與美國政府用的區域沒進去... 一個是去年年底開幕的區域,另一個是本來上新功能就偏慢的區域。

DynamoDB 可以透過 KMS 加密了...

AWSDynamoDB 可以透過 KMS 加密了:「New – Encryption at Rest for DynamoDB」。

You simply enable encryption when you create a new table and DynamoDB takes care of the rest. Your data (tables, local secondary indexes, and global secondary indexes) will be encrypted using AES-256 and a service-default AWS Key Management Service (KMS) key.

看起來不是自己的 KMS key,而是 service 本身提供的,這樣看起來是在 i/o level 加密,所以還不是 searchable encryption 的能力...

Amazon DynamoDB 跨區 Replication 以及備份

Amazon DynamoDB 實做了全球性的 replication,以及備份功能:「Amazon DynamoDB Update – Global Tables and On-Demand Backup」。

跨區 replication 的功能讓每個 region 都可以存取當地機房的 DynamoDB:

Global Tables – You can now create tables that are automatically replicated across two or more AWS Regions, with full support for multi-master writes, with a couple of clicks. This gives you the ability to build fast, massively scaled applications for a global user base without having to manage the replication process.

這有點類似 GoogleCloud Spanner 在前陣子也推出全球性服務,但 DynamoDB 提供的比較偏向 NoSQL 而不是 RDBMS。

另外一個限制是跨區同步是 async,會有 replication lag 的問題:

Updates are propagated to other Regions asynchronously via DynamoDB Streams and are typically complete within one second (you can track this using the new ReplicationLatency and PendingReplicationCount metrics).

不過如果是這樣的機制,conflict 的問題不知道怎麼解決... 文章裡面沒看到。

然後目前支援的區域還是有限:

Global Tables are available in the US East (Ohio), US East (N. Virginia), US West (Oregon), EU (Ireland), and EU (Frankfurt) Regions today, with more Regions in the works for 2018.

另外一個是備份與還原機制,有這樣的功能對很多計畫方便不少:

On-Demand Backup – You can now create full backups of your DynamoDB tables with a single click, and with zero impact on performance or availability. Your application remains online and runs at full speed. Backups are suitable for long-term retention and archival, and can help you to comply with regulatory requirements.

而備份還原機制是陸陸續續開放的,區域也有限:

We are rolling this new feature out on an account-by-account basis as quickly as possible, with initial availability in the US East (Northern Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions.

Amazon DynamoDB 也可以接在 VPC 內直接通了 (Public Preview)

AWS 宣布 Amazon DynamoDB 也可以透過 VPC Endpoints 接通了 (雖然是 public preview,表示要申請...):「Announcing VPC Endpoints for Amazon DynamoDB, Now in Public Preview」。

但看起來只有特定的區域才有,大家比較常用的 us-east-1、us-west-2 都不在本次範圍內:

  • Asia Pacific (Seoul)
  • Asia Pacific (Singapore)
  • Asia Pacific (Sydney)
  • Asia Pacific (Tokyo)
  • EU (Frankfurt)
  • South America (Sao Paulo)
  • US East (Ohio)
  • US West (N. California)

這樣讓內部完全無法連外的 private subnet 又多了一些應用可以跑。

Amazon DynamoDB Accelerator (DAX)

DynamoDB 推出的新架構,在系統上幫忙處理 cache:「Amazon DynamoDB Accelerator (DAX) – In-Memory Caching for Read-Intensive Workloads」。

DAX 跟現有的 DynamoDB API 相容:

DAX is a fully managed caching service that sits (logically) in front of your DynamoDB tables. It operates in write-through mode, and is API-compatible with DynamoDB.

因為 cache 的緣故,會是 eventually-consistent 架構:

Responses are returned from the cache in microseconds, making DAX a great fit for eventually-consistent read-intensive workloads.

然後是 r3 系列的機器組成的,限制在十台 (冒出大大的問號):

Each DAX cluster can contain 1 to 10 nodes; you can add nodes in order to increase overall read throughput. The cache size (also known as the working set) is based on the node size (dax.r3.large to dax.r3.8xlarge) that you choose when you create the cluster. Clusters run within a VPC, with nodes spread across Availability Zones.

不是很清楚這樣的好處 (比起自己用 memcached 或是其他類似的 cache 架構),也許過幾天想通了會開竅... :o

DynamoDB 推出 TTL 功能

DynamoDB 推出了依照時間自動刪除的功能:「New – Manage DynamoDB Items Using Time to Live (TTL)」。

You can enable this feature on a table-by-table basis, specifying an item attribute that contains the expiration time for the item.

這個功能比較特別的是,刪除的 scan 是不收取費用的:

There is no charge for the internal scan operation or for the deletion. You will pay for storage until the item is actually deleted.

用 Lambda 做 DynamoDB 的 Auto Scaling

AWS Lambda 可以跑 cron job 後應該就不怎麼意外出現了:「Autoscale DynamoDB provisioned capacity using Lambda」。

不像 EC2Auto Scaling,或是 ELB 自己會成長或縮小,DynamoDB 跟其他 AWS 服務不同,雖然可以 scale,但需要自己手動設定 capacity 伸縮。

於是就有人寫了程式 (也就是這個專案),判斷目前的 r/w 用量來決定策略... 有點像是我在處理自家 bandwidth 的搞法,達到某個警戒值就自動增加導去 CDN 的量,或是降低回來 :o

Archives