日本圍棋界使用 AWS 分析棋局的情況

看到「圍棋AI與AWS」這篇譯文,原文是「囲碁AIブームに乗って、若手棋士の間で「AWS」が大流行 その理由とは?」。

沒有太意外是使用 Leela Zero + Lizzle,畢竟這是 open source project,在軟體與資料的取得上相當方便,而且在好的硬體上已經可以超越人類頂尖棋手。

由於在 Lizzle 的介面上可以看到勝率,以及 Leela Zero 考慮的下一手 (通常會有多個選點),而且當游標移到這些選點上以後,還會有可能的變化圖可以看,所以對於棋手在熟悉操作介面後,可以很快的擺個變化圖,然後讓 Leela Zero 分析後續的發展,而棋手就可以快速判斷出「喔喔原來是這樣啊」。

網路上也有類似的自戰解說,可以看到棋手對 Lizzle 的操作與分析 (大約從 50:50 開始才是 Lizzle 的操作):

不過話說回來,幹壞事果然是進步最大的原動力... 讓一群對 AWS 沒什麼經驗的圍棋棋手用起 AWS,而且還透過 AMI 與 spot instance 省錢... XD

EC2 開始陸續推出支援 100Gbps 網路的機器

AWS 開始陸陸續續在推出有 100Gbps 能力的 EC2 instance 了:「New – EC2 P3dn GPU Instances with 100 Gbps Networking & Local NVMe Storage for Faster Machine Learning + P3 Price Reduction」。

從「Amazon EC2 Instance Types」這邊可以看到先前只有 c5n.18xlarge 有支援 100Gbps 網路,現在推出的 p3dn.24xlarge 是第二個支援的...

另外是 P3 系列的降價消息,比較奇怪的是從 2018/12/06 開始生效,而不是從月初開始。另外區域與條件也有一些複雜,有常在用的人可以翻一下說明...

AWS 推出 NVIDIA Tesla V100 的 GPU 機種 (P3)

這次從 NVIDIA 推出到 AWS 採用比以前快好多,依照維基百科上查到的資料,Tesla V1000 是六月 21 日推出的,這樣算起來四個月後就上線了:「New – Amazon EC2 Instances with Up to 8 NVIDIA Tesla V100 GPUs (P3)」。

Today we are making the next generation of GPU-powered EC2 instances available in four AWS regions. Powered by up to eight NVIDIA Tesla V100 GPUs, the P3 instances are designed to handle compute-intensive machine learning, deep learning, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, and genomics workloads.

P2 是 K80,NVIDIA 推出的時間是 2014 年的十一月了... 這時候拿出新的貨色出來算合理 XD

然後是這幾區有提供 P3:

P3 instances are available in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions in On-Demand, Spot, Reserved Instance, and Dedicated Host form.

從頭學一次 Isolation level 的 REPEATABLE-READ 與 SERIALIZABLE

在「測試 MariaDB 上 Galera Cluster 的 Isolation」這篇提到了 StripeKyle Kingsbury 寫了「Call me Maybe: MariaDB Galera Cluster」這篇文章,在討論 Galera Cluster 的 Isolation 問題。

Percona 的 CTO Vadim Tkachenko 寫了回應的文章:「Clarification on “Call me Maybe: MariaDB Galera Cluster”」。

看完後再跑去翻了一些資料,首先是 SQL 92 對 Isolation 的原始定義,尤其是對 REPEATABLE-READ 的定義。

REPEATABLE-READ 是透過 P2 ("Non-repeatable read") 來定義的,只要保證不會發生 P2 就是 REPEATABLE-READ 了:

P2 ("Non-repeatable read"): SQL-transaction T1 reads a row. SQL-transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.

也就是說,T1 讀過的 row 只要在 T1 transaction 的期間內都保持一樣就可以。由於沒有多餘其他保證,所以有機會產生 P3 ("Phantom") 問題:

P3 ("Phantom"): SQL-transaction T1 reads the set of rows N that satisfy some . SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same <search condition>, it obtains a different collection of rows.

SERIALIZABLEREPEATABLE-READ 的定義方式不同,並不是避免 P3 就可以叫做 SERIALIZABLE。SQL 92 定義說明必須與序列執行的結果一樣才能叫做 SERIALIZABLE

The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.

所以依照定義,SERIALIZABLE 一定可以避免 P3 發生,但避免了 P3 不代表做到 SERIALIZABLE

另外一份資料是「MySQL :: MySQL 5.6 Reference Manual :: 13.3.6 SET TRANSACTION Syntax」對於 MySQL 5.6 中 InnoDB 實作 REPEATABLE-READ 的細節:

This is the default isolation level for InnoDB. For consistent reads, there is an important difference from the READ COMMITTED isolation level: All consistent reads within the same transaction read the snapshot established by the first read. This convention means that if you issue several plain (nonlocking) SELECT statements within the same transaction, these SELECT statements are consistent also with respect to each other. See Section 14.2.2.2, “Consistent Nonlocking Reads”.

InnoDB 的 REPEATABLE-READ 會在第一次讀取時建立一份 snapshot,所以 MySQL 的 REPEATABLE-READ 有達到不產生 P2 的要求,但也因為 snapshot 的關係而不會產生 P3,不過也就僅此而已,寫入的部份還是沒有達到 SERIALIZABLE 的要求。

回到原來的文章所整理出來的測試資料 (我稍微排版過),在這兩個不同的 transaction 同時發生時:(不同人都轉帳給 id = 8)

BEGIN; -- T1
SELECT balance FROM accounts WHERE account_id = 5;
SELECT balance FROM accounts WHERE account_id = 8;
UPDATE account SET balance = 75 WHERE account_id = 5;
UPDATE account SET balance = 125 WHERE account_id = 8;
COMMIT;
BEGIN; -- T2
SELECT balance FROM accounts WHERE account_id = 6;
SELECT balance FROM accounts WHERE account_id = 8;
UPDATE account SET balance = 80 WHERE account_id = 6;
UPDATE account SET balance = 120 WHERE account_id = 8;
COMMIT;

SERIALIZABLE 保證執行結果會與某種循序的結果相符 (可能是 T1 -> T2,也可能是 T2 -> T1)。但 REPEATABLE-READ 只保證在 transaction 裡面看到的 row 不會變化。

在 MySQL 裡,如果你想要跟 MySQL 告知「變更的結果會收到某些欄位的值的影響」,你至少要用 LOCK IN SHARE MODE 來做 (或是用 FOR UPDATE 會更清楚表示出意思),也就是:

BEGIN; -- T1
SELECT balance FROM accounts WHERE account_id = 5 LOCK IN SHARE MODE;
SELECT balance FROM accounts WHERE account_id = 8 LOCK IN SHARE MODE;
UPDATE account SET balance = 75 WHERE account_id = 5;
UPDATE account SET balance = 125 WHERE account_id = 8;
COMMIT;
BEGIN; -- T2
SELECT balance FROM accounts WHERE account_id = 6 LOCK IN SHARE MODE;
SELECT balance FROM accounts WHERE account_id = 8 LOCK IN SHARE MODE;
UPDATE account SET balance = 80 WHERE account_id = 6;
UPDATE account SET balance = 120 WHERE account_id = 8;
COMMIT;

這樣至少會有一個 transaction 會出現 deadlock 訊息,於是就會確保是正確的結果:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

值得一提的是,Percona 文章提到的第二種解法是有問題的:

BEGIN;
SELECT balance FROM accounts WHERE account_id = 5;
SELECT balance FROM accounts WHERE account_id = 8;
UPDATE account SET balance = balance - 25 WHERE account_id = 5;
UPDATE account SET balance = balance + 25 WHERE account_id = 8;
COMMIT;

這樣「總和」會因為保證原子性 (atomicity) 所以不會產生問題,但 balance 可能會產生出負數。

所以結論是 Kyle Kingsbury 那篇文章測試的方法本來就有問題,在 SELECT 時少了 LOCK IN SHARE MODE 或是 FOR UPDATE 的要求。