Category Archives: AWS

How the Japanese Go world uses AWS to analyze games

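I came across the translated article "圍棋AI與AWS" (Go AI and AWS). The original is "囲碁AIブームに乗って、若手棋士の間で「AWS」が大流行 その理由とは?" (roughly: riding the Go AI boom, "AWS" is all the rage among young professional players, and why?).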

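Not too surprisingly, they use Leela Zero + Lizzie. These are open source projects, so the software and data are easy to get, and on good hardware the combination can already beat top human players.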

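Lizzie's interface shows the win rate and the next moves Leela Zero is considering (usually several candidate points), and hovering the cursor over a candidate shows the likely variation. Once a player is familiar with the interface, they can quickly lay out a variation, let Leela Zero analyze how it develops, and quickly conclude "oh, so that's how it is".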

There are also similar self-commentaries online where you can watch a player operating Lizzie and analyzing with it (the Lizzie part only starts at around 50:50):

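That said, getting up to mischief really is the biggest driver of progress... It got a group of Go players with little AWS experience using AWS, and even saving money with AMIs and spot instances... XD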

A comparison of Open Distro for Elasticsearch

Following up on the earlier post "AWS builds a free open source take on Elastic Stack: Open Distro for Elasticsearch", "Open Distro for Elasticsearch Review" has put together a summary of the key points:

As you can see, the main points are almost all about security...

Packagist.org is moving to AWS...

Packagist plans to move its service onto AWS: "An Update on Packagist.org Hosting".

We decided to migrate the packagist.org website to AWS as well, including the database, metadata update workers, etc. This makes it much easier to build a highly-available setup, where machines can be rebooted or even rebuilt safely without bringing down the whole site.

No idea how much of it will move... At the moment packagist.com is already on AWS, but packagist.org and the main traffic source, repo.packagist.org, both look like they haven't moved yet.

Amazon Transcribe adds a number of customization features

Amazon Transcribe is the speech recognition service offered by AWS, and it recently announced quite a few customization features: "Amazon Transcribe enhances custom vocabulary with custom pronunciations and display forms".

One is that you can add vocabulary, including its pronunciation (though it has to be written in IPA, so there is something to learn here):

You can give Amazon Transcribe more information about how to process speech in your input audio or video file by creating a custom vocabulary. A custom vocabulary is a list of specific words that you want Amazon Transcribe to recognize in your audio input. These are generally domain-specific words and phrases, words that Amazon Transcribe isn't recognizing, or proper nouns.

Now, with the use of characters from the International Phonetic Alphabet (IPA), you can enhance each custom terminology with corresponding custom pronunciations. Alternatively, you can also use the standard orthography of the language to mimic the way that the word or phrase sounds.

The other is defining how a term is displayed:

Additionally, you can now designate exactly how a customer terminology should be displayed when it is transcribed (e.g. “Street” as “St.” versus “ST”).

This should be quite handy for proper nouns? Things like people's names...
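To make the custom vocabulary idea a bit more concrete, here is a minimal sketch that builds a tab-separated vocabulary table in Python. The column names `Phrase`, `IPA`, `SoundsLike`, and `DisplayAs` reflect my understanding of the table format Transcribe documented around the time of this announcement, so treat them as an assumption and check the current docs. The helper name `build_vocabulary_table` is made up for illustration, and the "Street" entry is taken from the quoted example.

```python
import csv
import io

# Assumed column layout of a Transcribe custom vocabulary table
# (check the current Amazon Transcribe documentation before relying on it).
COLUMNS = ["Phrase", "IPA", "SoundsLike", "DisplayAs"]


def build_vocabulary_table(entries):
    """Return a tab-separated table with one header row and one row per entry.

    Each entry is a dict keyed by column name; missing columns are left empty.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for entry in entries:
        writer.writerow([entry.get(column, "") for column in COLUMNS])
    return buf.getvalue()


if __name__ == "__main__":
    # "Street" displayed as "St." is the example from the announcement.
    print(build_vocabulary_table([{"Phrase": "Street", "DisplayAs": "St."}]))
```

The IPA column is left empty here on purpose; filling it in requires an actual IPA transcription of the term.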

Amazon Aurora with PostgreSQL supports Logical Replication

AWS previously announced GTID replication support for Amazon Aurora (MySQL) (see "Amazon Aurora with MySQL 5.7 supports GTID"), and now it has announced that Amazon Aurora with PostgreSQL supports Logical Replication: "Amazon Aurora with PostgreSQL Compatibility Supports Logical Replication".

As expected, only the newer versions support it:

Logical replication is supported with Aurora PostgreSQL versions 2.2.0 and 2.2.1, compatible with PostgreSQL 10.6.

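With Logical Replication you can do a lot more, such as hooking the cloud up to external PostgreSQL services (e.g. pulling a live copy into your IDC). Some ETL tools can also use it to find out what has changed in the database.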
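As a rough illustration of the "pull a live copy into the IDC" case, the sketch below just builds the two standard PostgreSQL 10 statements involved: `CREATE PUBLICATION` on the source and `CREATE SUBSCRIPTION` on the external replica. Every name in it (publication, subscription, tables, host, user) is hypothetical, the helpers only produce SQL strings and do not connect to anything, and on Aurora the `rds.logical_replication` cluster parameter also has to be enabled first.

```python
def publication_sql(pub_name, tables):
    """SQL to run on the source (publisher) side.

    Identifiers are inserted as-is, so only pass trusted, simple names.
    """
    table_list = ", ".join(tables)
    return f"CREATE PUBLICATION {pub_name} FOR TABLE {table_list};"


def subscription_sql(sub_name, pub_name, conninfo):
    """SQL to run on the external (subscriber) PostgreSQL server."""
    # Single quotes inside the connection string are doubled for SQL.
    escaped = conninfo.replace("'", "''")
    return (
        f"CREATE SUBSCRIPTION {sub_name} "
        f"CONNECTION '{escaped}' PUBLICATION {pub_name};"
    )


if __name__ == "__main__":
    # All names below are made up for illustration.
    print(publication_sql("idc_pub", ["orders", "customers"]))
    print(
        subscription_sql(
            "idc_sub",
            "idc_pub",
            "host=aurora.example.internal dbname=app user=repl",
        )
    )
```

The subscriber needs matching table definitions in place before the subscription is created, since logical replication does not copy the schema.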

AWS launches an even cheaper storage option, Glacier Deep Archive

This new option from AWS pushes the price even lower: "New Amazon S3 Storage Class – Glacier Deep Archive".

Before this, the cheapest S3 option in us-east-1 was Glacier Storage, at USD$0.004/GB (that is, $4/TB).

The newly launched Glacier Deep Archive Storage goes straight down to USD$0.00099/GB ($0.99/TB) in the same region, roughly 1/4 of the price.
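The "roughly 1/4" figure is easy to check with a few lines of Python. This is only a sketch of the arithmetic, using the two us-east-1 prices quoted above, reading them as per-GB monthly storage prices, and using 1 TB = 1000 GB as in the post. It ignores retrieval fees and minimum storage duration charges.

```python
# Prices quoted in the post for us-east-1, read as USD per GB per month.
GLACIER_PER_GB = 0.004
DEEP_ARCHIVE_PER_GB = 0.00099


def monthly_cost(size_gb, price_per_gb):
    """Storage cost for one month, ignoring retrieval and minimum-duration fees."""
    return size_gb * price_per_gb


def price_ratio():
    """Deep Archive price as a fraction of the Glacier price."""
    return DEEP_ARCHIVE_PER_GB / GLACIER_PER_GB


if __name__ == "__main__":
    size_gb = 1000  # 1 TB, using the 1 TB = 1000 GB convention from the post
    print(f"Glacier:      ${monthly_cost(size_gb, GLACIER_PER_GB):.2f}/TB")
    print(f"Deep Archive: ${monthly_cost(size_gb, DEEP_ARCHIVE_PER_GB):.2f}/TB")
    print(f"Ratio:        {price_ratio():.4f}")
```

The ratio comes out just under 0.25, which matches the "about 1/4" estimate.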

On retrieval, Glacier Deep Archive guarantees the first byte within 12 hours, and the minimum charge is 180 days of storage:

Retrieval time within 12 hours

The existing Glacier Storage lets you choose the retrieval pattern at retrieval time (which affects the time to first byte), and its minimum charge is 90 days:

Configurable retrieval times, from minutes to hours

Pricing for each of these metrics is determined by the speed at which data is requested based on three options. "Expedited" queries <250 MB are typically returned in 1-5 minutes. "Standard" queries are typically returned in 3-5 hours. "Bulk" queries are typically returned in 5-12 hours.

One more cheaper option to choose from.

Amazon Aurora with MySQL 5.7 supports GTID

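Although you mostly don't have to manage HA yourself for services on AWS, you still have to plan the backup mechanism (and even off-site disaster recovery) yourself. The GTID feature in Amazon Aurora with MySQL adds one more option here: "Amazon Aurora with MySQL 5.7 Compatibility Supports GTID-Based Replication".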

The announcement mentions that replication inside Aurora still uses Aurora's own mechanism rather than GTID:

This provides complete consistency when using binlog replication between an Aurora database and an external MySQL database. Your replication won’t miss transactions or generate conflicts, even after failover or downtime. (Note that replication within an Aurora cluster doesn't use binlog files, so the GTID feature doesn't apply.)

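Given Aurora's architecture, though, the whole cluster is better seen as a single unit, so binlog + position should be enough? And there shouldn't be a conflict problem on failover either? I'm not sure where the benefit of GTID would be; I still need to think about it...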

AWS launches fully automatic subtitling for live streams

AWS has launched automatic subtitling during live streams: "Introducing Live Streaming with Automated Multi-Language Subtitling". It is really just existing services wired together: "Live Streaming with Automated Multi-Language Subtitling".

The solution deploys Live Streaming on AWS which includes AWS Elemental MediaLive, MediaPackage, Amazon CloudFront. The solution also deploys AWS Lambda, Amazon Simple Storage Service, Amazon Transcribe, and Amazon Translate.

For cases where translation quality isn't that critical, it might be worth playing with...?

AWS's OpenJDK 11 (Amazon Corretto 11) reaches General Availability

After "AWS decides to put effort into supporting OpenJDK (the Corretto project)" and "Amazon's OpenJDK 8 reaches GA", the next step is a corresponding Amazon Corretto 11 for OpenJDK 11: "Amazon Corretto 11 is Now Generally Available".

This version will be supported until at least August 2024, which works out to a five-year support period:

Long-term support (LTS) for Corretto includes performance enhancements and security updates for Corretto 8 until at least June 2023 at no cost. Updates are planned to be released quarterly. Amazon will provide LTS for Corretto 11 with quarterly updates until at least August 2024.

However, earlier testing found that some software doesn't run on OpenJDK 11, so for now that software still has to be kept alive on OpenJDK 8...

AWS builds a free open source take on Elastic Stack: Open Distro for Elasticsearch

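Elasticsearch itself is under the Apache License 2.0, but Elastic Stack (formerly called X-Pack) consists of paid features, including quite a few security-related ones. As a result, plenty of people have complained about product considerations being put ahead of security, as in the official reply in "ES 6.3: X-Pack Licence is "Expired" on New Install":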

A basic license is not entitled to security features. To try out security you need to use a trial license or obtain a subscription.

This time AWS has stepped in and built its own version, called Open Distro for Elasticsearch: "New – Open Distro for Elasticsearch".

If you read the announcement, every feature it lists is an item listed on the Elastic Stack page, so who this is aimed at is pretty clear:

In addition to Elasticsearch and Kibana, the first release includes a set of advanced security, event monitoring & alerting, performance analysis, and SQL query features (more on those in a bit).

The security features mentioned above are included as well:

Security – This plugin that supports node-to-node encryption, five types of authentication (basic, Active Directory, LDAP, Kerberos, and SAML), role-based access controls at multiple levels (clusters, indices, documents, and fields), audit logging, and cross-cluster search so that any node in a cluster can run search requests across other nodes in the cluster.

It currently ships as a Docker image and RPM. We'll see whether a deb version comes later:

In addition to the source code repo, Open Distro for Elasticsearch and Kibana are available as RPM and Docker containers, with separate downloads for the SQL JDBC and the PerfTop CLI.

This should hit Elasticsearch's service business model pretty hard. Let's see how far Elastic N.V. Ordinary Shares Real Time Stock Quotes drops...