Amazon S3 Launches a Storage Class That Classifies Objects Automatically After Analysis

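Amazon S3 has launched a new Storage Class that analyzes access patterns algorithmically behind the scenes (so it takes a while to kick in) and then decides whether an object belongs in Standard or Standard IA: "Announcing S3 Intelligent-Tiering — a New Amazon S3 Storage Class".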

It mixes Standard and Standard IA:

S3 Intelligent-Tiering stores objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access.

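Objects that go 30 consecutive days without being accessed are moved to Standard IA. If they are accessed again they are moved back, and the moves themselves are free of charge: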

For a small monthly monitoring and automation fee per object, S3 Intelligent-Tiering monitors access patterns and moves objects that have not been accessed for 30 consecutive days to the infrequent access tier. There are no retrieval fees in S3 Intelligent-Tiering. If an object in the infrequent access tier is accessed later, it is automatically moved back to the frequent access tier. No additional tiering fees apply when objects are moved between access tiers within the S3 Intelligent-Tiering storage class.

The pricing shows that the algorithm itself costs money. Working it out, 1M objects comes to USD$2.5/month, which seems acceptable...

Monitoring and Automation, All storage / Month $0.0025 per 1,000 objects
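As a quick check of that arithmetic, here is a minimal sketch that applies the quoted rate of $0.0025 per 1,000 objects. The function name is made up for illustration, and the result covers only the monitoring fee, not storage or requests.

```python
from decimal import Decimal

# Rate quoted above: $0.0025 per 1,000 objects per month.
MONITORING_RATE_PER_1000 = Decimal("0.0025")

def monthly_monitoring_fee(object_count):
    """Return the monthly monitoring fee in USD (hypothetical helper)."""
    return MONITORING_RATE_PER_1000 * Decimal(object_count) / Decimal(1000)

# 1M objects, the case worked out in the text.
print(f"USD${monthly_monitoring_fee(1_000_000):.2f}/month")
```

Decimal is used instead of float so that small per-object rates do not pick up rounding noise.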

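There are quite a few patterns to watch out for, though. For example, it says here that files under 128KB are never moved to IA, but it is not clear whether they still incur the Monitoring fee?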

S3 Intelligent-Tiering has a minimum eligible object size of 128KB for auto-tiering. Smaller objects may be stored but will always be charged at the Frequent Access tier rates.
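The rules quoted so far (a 30-day idle threshold and a 128KB minimum size) can be modeled roughly as below. This is only a sketch of the documented behavior, not how S3 implements it: the function and constant names are invented for illustration, and treating the thresholds as inclusive is an assumption.

```python
MIN_TIERING_SIZE = 128 * 1024   # 128KB minimum eligible object size
IDLE_DAYS_THRESHOLD = 30        # consecutive days without access

def expected_tier(size_bytes, days_since_last_access):
    """Guess which access tier an object is billed in (illustrative model only)."""
    if size_bytes < MIN_TIERING_SIZE:
        # Small objects are always charged at Frequent Access rates.
        return "frequent"
    if days_since_last_access >= IDLE_DAYS_THRESHOLD:
        return "infrequent"
    # A recent access keeps (or moves) the object in the frequent tier.
    return "frequent"

print(expected_tier(1024, 90))
print(expected_tier(1024 * 1024, 45))
print(expected_tier(1024 * 1024, 3))
```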

It is also unclear whether the 30 days mentioned here for S3 Intelligent-Tiering counts Standard and Standard IA together, or each one separately:

S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA storage are charged for a minimum storage duration of 30 days.

Worth waiting and watching for now...

Amazon S3 Gets SFTP

Amazon has actually launched an SFTP service for Amazon S3: "Introducing AWS Transfer for SFTP, a Fully Managed SFTP Service for Amazon S3".

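The service sits under the AWS Transfer name (AWS Transfer for SFTP). It can save legacy systems some work, but fixed IPs are not supported yet, so such systems cannot be moved over directly (legacy systems like this are often guarded by IP firewalls):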

Q: Can I use fixed IP addresses to access the SFTP server endpoint?
A: No. Fixed IP addresses that are usually used for firewall whitelisting purposes are currently not supported.

As for compliance certification, the commonly required ones appear to be covered already:

Q: Which compliance programs does AWS SFTP support?
A: AWS SFTP is PCI-DSS and GDPR compliant, and HIPAA eligible.

But how are you supposed to obtain the SSH host key for the first connection... is SSHFP supported?

Quite a few regions are supported, and the regions commonly used from Taiwan are all in the first-wave list:

AWS SFTP is available in AWS Regions worldwide including US East (N. Virginia, Ohio), US West (Oregon, N. California), Canada (Central), Europe (Ireland, Paris, Frankfurt, London), and Asia Pacific (Tokyo, Singapore, Sydney, Seoul).

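Finally, the pricing: uploads and downloads are both charged separately (USD$0.04/GB), and the service endpoint itself is also charged (USD$0.3/hour, roughly $216 a month). Compared with running it yourself that does not look cheap; for now the main selling point seems to be the integration with IAM and other mechanisms... But that is what selling a service means, so it comes down to your own trade-offs :o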
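The monthly figure is easy to reproduce. The sketch below assumes a 30-day month and a single endpoint running around the clock; the helper name and the 500 GB traffic figure are made up for illustration.

```python
ENDPOINT_RATE_PER_HOUR = 0.30   # USD per hour, from the pricing above
TRANSFER_RATE_PER_GB = 0.04     # USD per GB uploaded or downloaded

def monthly_sftp_cost(transfer_gb, hours=24 * 30):
    """Estimate one month's bill in USD for a single endpoint (hypothetical helper)."""
    return ENDPOINT_RATE_PER_HOUR * hours + TRANSFER_RATE_PER_GB * transfer_gb

# Endpoint only, then endpoint plus an assumed 500 GB of transfers.
print(f"{monthly_sftp_cost(0):.2f}")
print(f"{monthly_sftp_cost(500):.2f}")
```

The endpoint charge dominates unless the transfer volume is large, which is why it compares poorly with a small self-managed server.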

Backblaze Partners with Cloudflare to Waive Transfer Fees

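I already knew that many organizations choose CloudFront precisely because the leg from S3 to CloudFront carries no transfer fee. A CDN's hit rate is limited after all, and with any other CDN you have to pay for that leg.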

Now Backblaze has announced a partnership with Cloudflare that waives the fee for traffic from Backblaze to Cloudflare: "Backblaze and Cloudflare Partner to Provide Free Data Transfer".

Today we are announcing that beginning immediately, Backblaze B2 customers will be able to download data stored in B2 to Cloudflare for zero transfer fees.

Will AWS make a move of its own in response...

AWS Cuts EC2 and S3 Data Transfer Prices in Japan...

I saw Jeff Barr mention on Twitter that EC2 and S3 data transfer prices in the Japan region have been cut.

The accompanying article on the website is "AWS Data Transfer Price Reductions – Up to 34% (Japan) and 28% (Australia)". The cuts break down into several tiers:

  • Up to 10TB: from USD$0.14/GB to USD$0.114/GB (about 19%).
  • 10TB to 50TB: from USD$0.135/GB to USD$0.089/GB (about 34%).
  • 50TB to 150TB: from USD$0.13/GB to USD$0.086/GB (about 34%).
  • Over 150TB: from USD$0.12/GB to USD$0.084/GB (about 30%).
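The rounded percentages can be rechecked from the per-GB prices in the list. This is just a small sketch; the tier labels and function name are my own.

```python
# (tier, old USD/GB, new USD/GB) as listed above
TIERS = [
    ("up to 10TB", 0.14, 0.114),
    ("10TB to 50TB", 0.135, 0.089),
    ("50TB to 150TB", 0.13, 0.086),
    ("over 150TB", 0.12, 0.084),
]

def reduction_percent(old, new):
    """Percentage drop from the old price to the new price."""
    return (old - new) / old * 100

for name, old, new in TIERS:
    print(f"{name}: {reduction_percent(old, new):.1f}%")
```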

The new prices apply retroactively, starting from 2018/09/01:

Effective September 1, 2018 we are reducing prices for data transfer from Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Amazon CloudFront by up to 34% in Japan and 28% in Australia.

It also mentions a CloudFront price cut as well as the Australia portion. I am unlikely to run into those, so I will skip them...

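But since the article explicitly says the cut applies to EC2 and S3 traffic, that presumably means traffic going out through ELB is not part of this reduction? Unless you are serving straight from S3 with nothing in front of it, this probably makes no difference for most people?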

S3 Select Announces Support for Parquet and bzip2

S3 Select in Amazon S3 has announced support for the Parquet format: "Amazon S3 Announces New Features for S3 Select".

S3 Select already supported the CSV and JSON formats, which most engines can consume directly. This announcement adds support for JSON Arrays and for the Parquet format:

Today, Amazon S3 Select works on objects stored in CSV and JSON format. Based on customer feedback, we’re happy to announce S3 Select support for Apache Parquet format, JSON Arrays, and BZIP2 compression for CSV and JSON objects. We are also adding support for CloudWatch Metrics for S3 Select, which lets you monitor S3 Select usage for your applications.

The other item mentioned above is support for bzip2. I wonder whether there are plans to support other formats with better compression ratios...
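In boto3 terms, the new formats show up in the InputSerialization argument of select_object_content. The sketch below only builds the arguments and makes no network call; the helper name, bucket, key, and query are placeholders. Leaving CompressionType out for Parquet reflects my understanding that whole-object compression applies to CSV and JSON only.

```python
def build_select_kwargs(bucket, key, expression, input_format, compression="NONE"):
    """Build keyword arguments for boto3's S3 select_object_content (sketch)."""
    formats = {
        "csv": {"CSV": {"FileHeaderInfo": "USE"}},
        "json": {"JSON": {"Type": "LINES"}},
        "parquet": {"Parquet": {}},
    }
    input_serialization = dict(formats[input_format])
    if input_format != "parquet":
        # GZIP or BZIP2 describe compression of a whole CSV/JSON object.
        input_serialization["CompressionType"] = compression
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": input_serialization,
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
    }

# Placeholder bucket, key, and query for a bzip2-compressed CSV object.
kwargs = build_select_kwargs("example-bucket", "logs/2018-09.csv.bz2",
                             "SELECT * FROM S3Object s LIMIT 5", "csv", "BZIP2")
print(kwargs["InputSerialization"])
```

With credentials configured, the dictionary would be passed as `boto3.client("s3").select_object_content(**kwargs)`.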

Amazon S3 Offers Higher Request Rates...

AWS announced a performance increase for Amazon S3: "Amazon S3 Announces Increased Request Rate Performance".

Each S3 prefix can now reach 5500 RPS for reads and 3500 RPS for writes:

Amazon S3 now provides increased performance to support up to 3,500 requests per second to add data and 5,500 requests per second to retrieve data, which can save significant processing time for no additional charge. Each S3 prefix can support these request rates, making it simple to increase performance exponentially.

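The older guidance is in "Request Rate and Performance Considerations". It does not state explicit rates, but it does say that if you expect to exceed the thresholds of 800 RPS read and 300 RPS write, you should open a support case: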

However, if you expect a rapid increase in the request rate for a bucket to more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second, we recommend that you open a support case to prepare for the workload and avoid any temporary limits on your request rate.

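That said, if you have real volume, it is still better to follow the original prefix advice and spread the keys out. The CDN in front (CloudFront itself, or Cloudflare) can usually handle this with a simple URL rewrite, for example turning https://www.example.com/1531843366123.jpg, which uses a unix timestamp (ms), into https://www.example.com/6123/1531843366123.jpg. This lets the Amazon S3 backend spread the load by prefix and avoids a problem that gets hard to fix as the site keeps growing.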
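The rewrite in that example takes the last four characters of the file name (before the extension) and uses them as a directory prefix. Here is a minimal sketch of the transformation, written in Python only to show the logic; a real CDN would express it in its own rule or edge-function syntax. The function name and the `width` parameter are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def add_shard_prefix(url, width=4):
    """Insert the last `width` characters of the file stem as a path prefix."""
    parts = urlsplit(url)
    filename = posixpath.basename(parts.path)
    stem, _ext = posixpath.splitext(filename)
    shard = stem[-width:]
    new_path = posixpath.join(posixpath.dirname(parts.path), shard, filename)
    return urlunsplit(parts._replace(path=new_path))

print(add_shard_prefix("https://www.example.com/1531843366123.jpg"))
```

The low-order digits of a millisecond timestamp change fastest, so consecutive uploads land under different prefixes.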

Amazon S3 Launches a New Class: One Zone-IA

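Amazon S3 has RRS, meant for data that can be regenerated if it is lost (such as thumbnails), and it also has IA, meant for data that is accessed infrequently. The newly launched class combines the two, which brings the price down further: "Amazon S3 Update: New Storage Class and General Availability of S3 Select".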

New S3 One Zone-IA Storage Class – This new storage class is 20% less expensive than the existing Standard-IA storage class. It is designed to be used to store data that does not need the extra level of protection provided by geographic redundancy.

Amazon EC2 Gets More Available Bandwidth

Jeff Barr of AWS announced that bandwidth for ENA-enabled EC2 instances has been raised to 25Gbps: "The Floodgates Are Open – Increased Network Bandwidth for EC2 Instances".

There are three cases. The first is increased bandwidth to S3:

EC2 to S3 – Traffic to and from Amazon Simple Storage Service (S3) can now take advantage of up to 25 Gbps of bandwidth. Previously, traffic of this type had access to 5 Gbps of bandwidth. This will be of benefit to applications that access large amounts of data in S3 or that make use of S3 for backup and restore.

The second is EC2 to EC2 (internal network):

EC2 to EC2 – Traffic to and from EC2 instances in the same or different Availability Zones within a region can now take advantage of up to 5 Gbps of bandwidth for single-flow traffic, or 25 Gbps of bandwidth for multi-flow traffic (a flow represents a single, point-to-point network connection) by using private IPv4 or IPv6 addresses, as described here.

The third is also EC2 to EC2, but within the same Cluster Placement Group:

EC2 to EC2 (Cluster Placement Group) – Traffic to and from EC2 instances within a cluster placement group can continue to take advantage of up to 10 Gbps of lower-latency bandwidth for single-flow traffic, or 25 Gbps of lower-latency bandwidth for multi-flow traffic.
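To put the EC2 to S3 figures in perspective, here is a back-of-the-envelope sketch of how long 1 TB would take at the old 5 Gbps ceiling versus the new 25 Gbps one. It assumes the link is fully saturated and ignores protocol overhead, so real transfers will be slower; the function name is made up.

```python
def transfer_seconds(size_bytes, gbps):
    """Best-case seconds to move size_bytes at the given line rate in Gbps."""
    return size_bytes * 8 / (gbps * 1e9)

one_tb = 10**12  # decimal terabyte

print(transfer_seconds(one_tb, 5))    # old 5 Gbps ceiling
print(transfer_seconds(one_tb, 25))   # new 25 Gbps ceiling
```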

These are the AMIs that come with ENA. CentOS does not seem to be on the list:

ENA-enabled AMIs are available for Amazon Linux, Ubuntu 14.04 & 16.04, RHEL 7.4, SLES 12, and Windows Server (2008 R2, 2012, 2012 R2, and 2016). The FreeBSD AMI in AWS Marketplace is also ENA-enabled, as is VMware Cloud on AWS.

Amazon S3 Traffic, and Select Features for Both S3 and Glacier

I saw a photo from the venue on Twitter: Amazon S3 handles 37 Tb/sec in a single region.

At that kind of volume a DDoS barely registers XDDD

Separately, Amazon S3 and Amazon Glacier have both launched a Select feature: "S3 Select and Glacier Select – Retrieving Subsets of Objects".

The sample code makes the use case clear. The middle portion in the original post has a syntax error, which I have fixed here:

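import boto3

s3 = boto3.client('s3')
# The released boto3 API takes these as top-level arguments; the original
# sample wrapped them in SelectRequest and used a PrintingResponseHandler.
response = s3.select_object_content(
    Bucket="super-secret-reinvent-stuff",
    Key="stuff.csv",
    ExpressionType='SQL',
    Expression='SELECT s._1 FROM S3Object AS s',
    InputSerialization={
        'CompressionType': 'NONE',
        'CSV': {
            'FileHeaderInfo': 'IGNORE',
            'RecordDelimiter': '\n',
            'FieldDelimiter': ',',
        }
    },
    OutputSerialization={
        'CSV': {
            'RecordDelimiter': '\n',
            'FieldDelimiter': ',',
        }
    },
)

# The result arrives as an event stream; print each chunk of records.
for event in response['Payload']:
    if 'Records' in event:
        print(event['Records']['Payload'].decode('utf-8'), end='')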

This can cut I/O substantially and save cost.

Glacier Select follows a similar idea: instead of pulling out the whole archive and processing it afterwards, you can set the conditions up front.

AWS Media Services Launches a Truckload of Video and Audio Services...

AWS has launched a whole series of video and audio services, all named AWS Elemental Media-something: "AWS Media Services – Process, Store, and Monetize Cloud-Based Video".

However, the services are not all available in the same regions... Each one has its own separate announcement.

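Still, I will quote the descriptions from Jeff Barr's article here. They cover everything from transcoding at the very source, to DRM and live formats, through to file storage and post-processing afterwards (such as inserting ads):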

AWS Elemental MediaConvert – File-based transcoding for OTT, broadcast, or archiving, with support for a long list of formats and codecs. Features include multi-channel audio, graphic overlays, closed captioning, and several DRM options.

AWS Elemental MediaLive – Live encoding to deliver video streams in real time to both televisions and multiscreen devices. Allows you to deploy highly reliable live channels in minutes, with full control over encoding parameters. It supports ad insertion, multi-channel audio, graphic overlays, and closed captioning.

AWS Elemental MediaPackage – Video origination and just-in-time packaging. Starting from a single input, produces output for multiple devices representing a long list of current and legacy formats. Supports multiple monetization models, time-shifted live streaming, ad insertion, DRM, and blackout management.

AWS Elemental MediaStore – Media-optimized storage that enables high performance and low latency applications such as live streaming, while taking advantage of the scale and durability of Amazon Simple Storage Service (S3).

AWS Elemental MediaTailor – Monetization service that supports ad serving and server-side ad insertion, a broad range of devices, transcoding, and accurate reporting of server-side and client-side ad insertion.

As a former colleague's tweet pointed out, and leaving Amazon SWF aside (Amazon SWF can still find other uses, after all), Amazon Elastic Transcoder is clearly on its way out.

A bundle this big is the kind of thing only AWS re:Invent has the energy for. You rarely see it the rest of the year...