Tag Archives: s3

AWS 推出更便宜的儲存方案 Glacier Deep Archive

AWS 推出的這個方案價錢又更低了:「New Amazon S3 Storage Class – Glacier Deep Archive」。

在這之前在 us-east-1S3 最低的方案是 Glacier Storage,單價是 USD$0.004/GB (也就是 $4/TB)。

而這次推出的 Glacier Deep Archive Storage 在同一區則是直接到 USD$0.00099/GB ($0.99/TB),大約是 1/4 的價錢。

Glacier Deep Archive 在取得時 first byte 的保證時間是 12 小時,另外最低消費是 180 天:

Retrieval time within 12 hours

先前就有的 Glacier Storage 則是可以在取用時設定取得的 pattern (會影響 first byte 的時間),而最低消費是 90 天:

Configurable retrieval times, from minutes to hours

Pricing for each of these metrics is determined by the speed at which data is requested based on three options. "Expedited" queries <250 MB are typically returned in 1-5 minutes. "Standard" queries are typically returned in 3-5 hours. "Bulk" queries are typically returned in 5-12 hours.

多一個更便宜的選擇可以用。

AWS 推出了 Live 時全自動上字幕的功能

AWS 推出了在直播時就自動上字幕的功能:「Introducing Live Streaming with Automated Multi-Language Subtitling」,其實就是把現有的服務兜出來:「Live Streaming with Automated Multi-Language Subtitling」。

The solution deploys Live Streaming on AWS which includes AWS Elemental MediaLive, MediaPackage, Amazon CloudFront. The solution also deploys AWS Lambda, Amazon Simple Storage Service, Amazon Transcribe, and Amazon Translate.

對於比較沒那麼要求翻譯品質的情況也許可以玩看看...?

Cloudflare 發表 Logpush,把 log 傳出來

Cloudflare 推出了 Logpush,可以把 log 傳到 Amazon S3 或是 Google Cloud Storage 上 (目前就支援這兩個):「Logpush: the Easy Way to Get Your Logs to Your Cloud Storage」,目前是 enterprise 方案可以用:

Today, we’re excited to announce a new way to get your logs: Logpush, a tool for uploading your logs to your cloud storage provider, such as Amazon S3 or Google Cloud Storage. It’s now available in Early Access for Enterprise domains.

Logpush currently works with Amazon S3 and Google Cloud Storage, two of the most popular cloud storage providers.

以往是自己拉,需要考慮 HA 問題 (兩台小機器就可以了,機器成本是還好,但維護成本頗高),現在則是可以讓 Cloudflare 主動推上來...

我大多數的建議是 log 儘量留 (包括各種系統的 log 與 http access log)。主要是因為儲存成本一直下降 (S3 的 1TB 才 USD$23/month,如果使用 Glacier 會低到 USD$4/month),而且 log 在壓縮後會變小很多 (經常會有重複的字)。

最傳統的 AWStats 就可以看到不少資訊,如果是透過 IP address 與 User-agent,交叉配合其他 event data 就可以讓很多 machine learning 演算法取得的資訊更豐富。

AWS 要再推出更低價的 S3 Glacier Deep Archive

AWS 打算要推出 S3 的新產品,單價更低但反應時間更長的儲存方案:「Coming Soon – S3 Glacier Deep Archive for Long-Term Data Retention」。

設計的客群是對法規有要求的情況:

It is designed for customers — particularly those in highly-regulated industries, such as the Financial Services, Healthcare, and Public Sectors — that retain data sets for 7-10 years or longer to meet regulatory compliance requirements.

看起來是那種 log 丟著就不會去動的情境... (Write once never read?XD)

AWS 推出了 S3 Object Lock,保護資料被刪除的可能性

AWS 推出了 S3 Object Lock,可以設定條件鎖住 S3 上的 object,以保護資料不被刪除:「AWS Announces Amazon S3 Object Lock in all AWS Regions」。

這個功能跟會計做帳的概念很像,也就是寫進去後就不能改,也不能刪除,保留一定時間後才移除掉:

You can migrate workloads from existing write-once-read-many (WORM) systems into Amazon S3, and configure S3 Object Lock at the object- and bucket-levels to prevent object version deletions prior to pre-defined Retain Until Dates or Legal Hold Dates.

AWS 提供有兩種模式,一個是 Governance mode,這個模式下可以設定某些 IAM 權限可以移除 S3 Object Lock。另外一個是 Compliance mode,這個模式下連 root account 都不能刪除:

S3 Object Lock can be configured in one of two modes. When deployed in Governance mode, AWS accounts with specific IAM permissions are able to remove object locks from objects. If you require stronger immutability to comply with regulations, you can use Compliance Mode. In Compliance Mode, the protection cannot be removed by any user, including the root account.

Amazon S3 推出了一個自動分析後分類的 Storage Class

Amazon S3 推出了新的 Storage Class,後面直接用演算法分析 access pattern (所以要跑一陣子才會生效),然後決定要放到 Standard 或是 Standard IA 裡:「Announcing S3 Intelligent-Tiering — a New Amazon S3 Storage Class」。

混了 Standard 與 Standard IA:

S3 Intelligent-Tiering stores objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access.

然後連續 30 天沒有被存取的就會被丟到 Standard IA,如果有被存取的話就會被搬回來,而搬移的部份不用收費:

For a small monthly monitoring and automation fee per object, S3 Intelligent-Tiering monitors access patterns and moves objects that have not been accessed for 30 consecutive days to the infrequent access tier. There are no retrieval fees in S3 Intelligent-Tiering. If an object in the infrequent access tier is accessed later, it is automatically moved back to the frequent access tier. No additional tiering fees apply when objects are moved between access tiers within the S3 Intelligent-Tiering storage class.

從費用上可以看到演算法本身是有費用的,換算一下 1M objects 是 USD$2.5/month,好像還可以...

Monitoring and Automation, All storage / Month$0.0025 per 1,000 objects

不過有蠻多要注意的 pattern。像是這邊有提到 128KB 以下的檔案不會搬到 IA 上,但不知道算不算 Monitoring 的費用?

S3 Intelligent-Tiering has a minimum eligible object size of 128KB for auto-tiering. Smaller objects may be stored but will always be charged at the Frequent Access tier rates.

另外這邊講 S3 Intelligent-Tiering 的三十天也不知道是不是 Standard + Standard IA,或是分開算:

S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA storage are charged for a minimum storage duration of 30 days.

可以先觀望一下...

Amazon S3 推出 SFTP

Amazon 居然推出對 Amazon S3SFTP 服務了:「Introducing AWS Transfer for SFTP, a Fully Managed SFTP Service for Amazon S3」。

這個服務是掛在 AWS Transfer 的名義下 (AWS Transfer for SFTP),這對老系統來說可以省一些事情,不過目前還不支援固定 IP,這樣不太能直接搬上去 (這種老系統常常都是用 IP firewall 擋著):

Q: Can I use fixed IP addresses to access the SFTP server endpoint?
A: No. Fixed IP addresses that are usually used for firewall whitelisting purposes are currently not supported.

另外認證的部份看起來已經包括了常用的認證:

Q: Which compliance programs does AWS SFTP support?
A: AWS SFTP is PCI-DSS and GDPR compliant, and HIPAA eligible.

不過第一次的 SSH key 部份要怎麼取得啊... 有支援 SSHFP 嗎?

支援的區域蠻多的,對台灣使用常見的區域都有在第一波的清單內:

AWS SFTP is available in AWS Regions worldwide including US East (N. Virginia, Ohio), US West (Oregon, N. California), Canada (Central), Europe (Ireland, Paris, Frankfurt, London), and Asia Pacific (Tokyo, Singapore, Sydney, Seoul).

最後是價錢,上傳與下傳都要另外收費 (USD$0.04/GB),另外服務本身的 endpoint 也要收費 (USD$0.3/hour,一個月大約會是 $216),跟自己弄比起來好像不怎麼便宜,目前看起來主要是整合了 IAM 與其他機制... 不過這就是賣服務,看自己取捨就是了 :o

Backblaze 與 Cloudflare 合作,免除傳輸費用

先前知道不少單位會選擇用 CloudFront 的原因就是 S3 到 CloudFront 這段是不需要傳輸費用的。畢竟 CDN 的 hit rate 還是有限,用其他家 CDN 得付這塊費用。

而現在 Backblaze 宣佈跟 Cloudflare 合作,免除掉 Backblaze 到 Cloudflare 的費用:「Backblaze and Cloudflare Partner to Provide Free Data Transfer」。

Today we are announcing that beginning immediately, Backblaze B2 customers will be able to download data stored in B2 to Cloudflare for zero transfer fees.

AWS 這邊會不會有其他動作呢...

AWS 日本區 EC2 與 S3 的傳輸費用降價...

Twitter 看到 Jeff Barr 提到日本區的 EC2S3 傳輸費用降價:

網站的說明文章則是在「AWS Data Transfer Price Reductions – Up to 34% (Japan) and 28% (Australia)」這邊。分成幾個部份降價:

  • 10TB 以下的費用從 USD$0.14/GB 變成 $0.114/GB (約 19%)。
  • 10TB 到 50TB 從 USD$0.135/GB 變成 USD$0.089/GB (約 34%)。
  • 50TB 到 150TB 則是 USD$0.13/GB 變成 USD$0.086/GB (約 34%)。
  • 超過 150TB 的部份從 USD$0.12/GB 變成 USD$0.084/GB (約 30%)。

然後自動回朔到 2018/09/01 開始算:

Effective September 1, 2018 we are reducing prices for data transfer from Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Amazon CloudFront by up to 34% in Japan and 28% in Australia.

另外有提到 CloudFront 也有降價,以及澳洲的部份,應該不太會碰到所以就跳過去...

但就文章這樣寫明 EC2 與 S3 的流量有降價,應該表示從 ELB 出去的流量就不算在這次的降價?除非你是直接 S3 裸奔,不然對大多數的人應該沒差?

S3 Select 宣佈支援 Parquet 與 bzip2

Amazon S3S3 Select 宣佈支援 Parquet 格式:「Amazon S3 Announces New Features for S3 Select」。

本來 S3 Select 就已經支援 CSV 與 JSON 格式,大多數的引擎也都可以直接吃,這次宣佈支援 JSON Arrays,以及 Parquet 格式:

Today, Amazon S3 Select works on objects stored in CSV and JSON format. Based on customer feedback, we’re happy to announce S3 Select support for Apache Parquet format, JSON Arrays, and BZIP2 compression for CSV and JSON objects. We are also adding support for CloudWatch Metrics for S3 Select, which lets you monitor S3 Select usage for your applications.

另外一個上面也有提到的是宣佈支援 bzip2 格式,不知道有沒有打算支援壓縮率更好的其他格式...