Stripe 原來支援 JCB 了啊...

剛剛在買東西的時候故意丟 JCB 的卡號進去,發現 Stripe 認得,找了一下公告資料,發現是去年 2020 年五月支援的:「Expanding support for JCB payments」。

先前在日本買 Live 物販的時候 (2019 年年底,應該是 H-el-icalSee-Saw 這兩場),看到現場是使用 iPad + Stripe 的組合,一開始還驚訝了一下,但被告知不支援 JCB 的時候心裡「...」了一陣子,只能刷 Mastercard 或是 Visa

看起來在去年推出的時候,日本地區是自動開放:

Businesses using Stripe in Japan can now automatically accept payments with JCB, in most cases without any additional work.

其他地區則是逐步開放:

We are rolling out JCB acceptance to businesses in more countries, starting with Canada, Australia, and New Zealand, with more to come. This lets global businesses, from e-commerce sites in Canada to subscription services in Australia, easily transact with JCB cardholders.

如果 2022 年有機會去日本的話,應該會看到更多使用 Stripe 的方案了...

Stripe 遇到 AWS 上 DNS Resolver 的限制

當量夠大就會遇到各種限制...

這次 Stripe 在描述 trouble shooting 的過程:「The secret life of DNS packets: investigating complex networks」。

其中一個頗有趣的架構是他們在每台主機上都有跑 Unbound,然後導去中央的 DNS Resolver,再決定導去 Consul 或是 AWS 的 DNS Resolver:

Unbound runs locally on every host as well as on the DNS servers.

然後他們發現偶而會有大量的 SERVFAIL

接下來就是各種找問題的過程 (像是用 tcpdump 看情況,然後用 iptables 統計一些數字),最後發現是卡在 AWS 的 DNS Resolver 在 60 秒內只回應了 61,385 packets,換算差不多是 1,023 packets/sec,這數字看起來就很雷:

During one of the 60-second collection periods the DNS server sent 257,430 packets to the VPC resolver. The VPC resolver replied back with only 61,385 packets, which averages to 1,023 packets per second. We realized we may be hitting the AWS limit for how much traffic can be sent to a VPC resolver, which is 1,024 packets per second per interface. Our next step was to establish better visibility in our cluster to validate our hypothesis.

在官方文件「Using DNS with Your VPC」這邊看到對應的說明:

Each Amazon EC2 instance limits the number of packets that can be sent to the Amazon-provided DNS server to a maximum of 1024 packets per second per network interface. This limit cannot be increased. The number of DNS queries per second supported by the Amazon-provided DNS server varies by the type of query, the size of response, and the protocol in use. For more information and recommendations for a scalable DNS architecture, see the Hybrid Cloud DNS Solutions for Amazon VPC whitepaper.

iptables 看到的量則是:

找到問題後,後面就是要找方法解決了... 他們給了一個只能算是不會有什麼副作用的 workaround,不過也的確想不到太好的解法。

因為是查詢 10.0.0.0/8 網段反解產生大量的查詢,所以就在各 server 上的 Unbound 上指定這個網段直接問 AWS 的 DNS Resolver,不需要往中央的 DNS Resolver 問,這樣在這個場景就不會遇到 1024 packets/sec 問題了 XDDD

Stripe 將 Redis 單機版轉到 Cluster 版本上降低了錯誤率

在「Scaling a High-traffic Rate Limiting Stack With Redis Cluster」這邊提到了 StripeRedis 單機版轉移到 10 個節點的 cluster 版本,然後錯誤率大幅下降:

Stripe’s rate limiters are built on top of Redis, and until recently, they ran on a single very hot instance of Redis. The server had followers in place for failover, but at any given time, one node was handling every operation.

We eventually solved it by migrating to a 10-node Redis Cluster.

另外也可以看出來,在轉移到 cluster 版本後有不少要注意的,像是因為 sharding 而需要調整平衡性。另外是 cluster 模式下寫入的 confirmation 跟一般預期的不太一樣,不過這對於 rate limit 的應用還好,可以接受某種程度的掉資料...

Stripe 宣佈 TLS 1.0/1.1 的退場時間表

Stripe 宣佈了今年的 2/19 會停用測試環境的 TLS 1.0/1.1,並且在 6/13 全面停用:「Completing an upgrade to TLS 1.2」。

  • Monday, February 19: All servers using older versions of TLS will be blocked from the Stripe API in test mode.
  • Wednesday, June 13: All servers using older versions of TLS will be blocked from the Stripe API in live mode.

這喊好久了,總算是開槍了...

隔壁棚 PayPal 反而早就把 Sandbox 環境上 TLS 1.2-only 了,而六月底也會強制所有連線都必須是 TLS 1.2:「TLS 1.2 and HTTP/1.1 Upgrade - PayPal」。

Stripe 也宣佈要停止支援 Bitcoin 了

Stripe 發了公告出來,要停止支援 Bitcoin:「Ending Bitcoin Support」。預定保留三個月的緩衝期,在 2018 年 4 月 23 日停掉:

Over the next three months we will work with affected Stripe users to ensure a smooth transition before we stop processing Bitcoin transactions on April 23, 2018.

跟其他單位停用的原因都差不多,愈來愈慢的交易速度與愈來愈高的手續費:

Transaction confirmation times have risen substantially; this, in turn, has led to an increase in the failure rate of transactions denominated in fiat currencies. (By the time the transaction is confirmed, fluctuations in Bitcoin price mean that it’s for the “wrong” amount.) Furthermore, fees have risen a great deal. For a regular Bitcoin transaction, a fee of tens of U.S. dollars is common, making Bitcoin transactions about as expensive as bank wires.

Steam 當時的理由很類似... (參考「Steam 停止使用 Bitcoin 購買遊戲」)

拿到標著 Stripe, Inc. 的 EV Certificate

在「Nope, this isn’t the HTTPS-validated Stripe website you think it is」這邊看到利用一家同名的公司取得相同的 EV Certificate 的方法:

這個不是你想像中的 Stripe 網站在 https://stripe.ian.sh/ 這邊。這在 Safari 上攻擊的效果特別好,因為不會顯示出 hostname:

不過 EV Certificate 當初設計的目的本來就是墊高 phishing 的成本,並不是完全阻止... 這應該暫時不會解吧 :o

Stripe 香港開台,以及 Alipay 與 WeChat Pay 的支援

看到 Stripe 的幾個大動作:「Stripe in Hong Kong + Alipay and WeChat Pay globally」。

一個是進入香港的消息:

Today, we’re excited to officially launch Stripe in Hong Kong.

另外一個是 Alipay (支付寶) 以及 WeChat Pay (微信支付) 可以透過 Stripe 在全球使用:

So, today we’re introducing global support for Alipay and WeChat Pay, connecting Stripe businesses in 25+ countries to the hundreds of millions of Chinese consumers that actively use these payment methods.

尤其是後面的消息,對於中國的使用者方便不少...

Stripe 的 Increment 雜誌

Stripe 推出了 Increment 雜誌,講團隊合作時的各種議題:「Introducing Increment」。

And so we've decided to start Increment, a software engineering magazine dedicated to providing practical and useful insight into what effective teams are doing so that the rest of us can learn from them more quickly.

雜誌網站上也有類似的描述:

A digital magazine about how teams build and operate software systems at scale.

Increment is dedicated to covering how teams build and operate software systems at scale, one issue at a time.

可以看一看 Stripe 對團隊合作的想法...

Stripe 對於控制 API 使用量的作法

在「Scaling your API with rate limiters」這篇 StripePaul Tarjan 提到了四種如何保護 API 的作法。

前兩種都是 rate limit。第一種是最標準的「你一分鐘可以用幾次」的方式,這是最容易理解的方式。第二種是「你同時間可以用幾個 API request」,這通常會用在大量消耗資源的 API 上,避免短時間內被打爆。

第三種是拉到整體來看,把 API 分成重要與不重要的,然後直接保留確保重要的 API 有一定的 capacity 可以用:

We always reserve a fraction of our infrastructure for critical requests. If our reservation number is 20%, then any non-critical request over their 80% allocation would be rejected with status code 503.

第四種方式是當過載時的自動化處理,平常就把各種工作排優先順序,當超量的時候自動先將低優先權的拿掉:

Only 100 requests were rejected this month from this rate limiter, but in the past it’s done a lot to help us recover more quickly when we have had load problems. This load shedder limits the impact of incidents that are already happening and provides damage control, while the first three are more preventative.

不過還是有點怪,Stripe 應該是全部都建在 AWS 上面 (AWS Case Study: Stripe),跟 auto scaling 的配合好像都沒提到?

歸類的方式還蠻有條理的,可以學這個方法來規劃...

Stripe 所提到的 TLS 1.1 不安全

Stripe 在宣佈要淘汰 TLS 1.0 與 TLS 1.1 的計畫公告中 (「Upgrading to SHA-2 and TLS 1.2」) 提到了:

Why SHA-1, TLS 1.0 and 1.1 are insecure

但在文章裡面還是沒有提到為什麼 TLS 1.1 不安全。

在維基百科的「Transport Layer Security」條目中試著找內容,發現應該是 Data integrity 這段,TLS 1.1 不支援 HMAC-SHA256/384 與 AEAD,只支援比較弱的 HMAC-MD5 或是 HMAC-SHA1。