Standardizing robots.txt

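Odd as it may sound, robots.txt has only ever been a de facto industry convention rather than a formal standard, so every search engine has added or dropped a few directives of its own.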

After all this time, Google has decided to push for standardizing robots.txt: "Formalizing the Robots Exclusion Protocol Specification". At the same time, Google released the parser it uses to interpret robots.txt: "Google's robots.txt Parser is Now Open Source". It is available on GitHub at google/robotstxt.

The current draft is version 00 and can be read at draft-rep-wg-topic-00. It remains to be seen what kind of feedback the other search engines will give...
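To give a feel for what a robots.txt parser does, here is a minimal sketch using Python's standard-library urllib.robotparser. This is not Google's parser and does not necessarily follow the draft's matching rules; the rules, the ExampleBot user agent and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a (hypothetical) crawler may fetch each URL.
for url in ("https://example.com/private/page", "https://example.com/index.html"):
    print(url, parser.can_fetch("ExampleBot", url))
```

Different parsers may resolve overlapping Allow/Disallow rules differently, which is exactly the kind of ambiguity a formal specification is meant to remove.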

A Python code formatter: Black

Black is a code formatter for Python. It rewrites your code to conform to a coding style and coding standard, which goes a step further than tools that only tell you where the problems are...

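I seem to remember that it did not use to live under the official account. After some digging, the Hacker News thread at "https://news.ycombinator.com/item?id=17151813" shows that last year it was in ambv's repository; it now redirects to the python organization :o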

It is still labeled beta, and quite a few of its practices are uncomfortable, such as the one raised on Hacker News at "https://news.ycombinator.com/item?id=19939806":

Against my better judgment I'll bite.
I super dislike black's formatting, and I think it's really rare to actually see it in codebases. It wraps weirdly (sometimes not at all). I'd prefer to use yapf, but last I checked it still crashes on "f-strings".

Here's a small example:

    basket.add({
        apple.stem
        for satchel in satchels
        for apple in satchel
    })
Black formats this as:
    basket.add(
        {
            apple.stem
            for satchel in satchels
            for apple in satchel
        }
    )
        
I've never seen Python code like that.
I totally believe using a formatter is good practice. Black is in a challenging position of coming into a community with a lot of existing code and customs, and I get that. But I also think that's an opportunity, rather than having to guess at what is good, there's a wealth of prior art to look at. I wish it had done this, rather than essentially codify the author's style.

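It looks like there is still plenty that could be tuned, but it is worth trying out... While it was 3rd-party I could just ignore it; now that it carries an official flavor, I have to take a look :o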
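The two layouts in the quoted example differ only in formatting, not in meaning. A minimal sketch that checks this with the standard-library ast module (this does not run Black itself; the helper name is hypothetical):

```python
import ast

# The hand-written layout from the quoted comment.
ORIGINAL = """\
basket.add({
    apple.stem
    for satchel in satchels
    for apple in satchel
})
"""

# The layout the comment says Black produces.
REFORMATTED = """\
basket.add(
    {
        apple.stem
        for satchel in satchels
        for apple in satchel
    }
)
"""


def same_syntax_tree(a, b):
    """Return True if both sources parse to the same AST (layout ignored)."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


print(same_syntax_tree(ORIGINAL, REFORMATTED))
```

The code is only parsed, never executed, so the undefined names basket and satchels are not a problem.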

ACME,RFC 8555

This is about the ACME protocol, which was invented for Let's Encrypt and helps automate certificate issuance.

I just came across the "Automatic Certificate Management Environment (ACME)" page. It is labeled PROPOSED STANDARD, yet the txt file it links to says Standards Track in its header:

Internet Engineering Task Force (IETF)                         R. Barnes
Request for Comments: 8555                                         Cisco
Category: Standards Track                             J. Hoffman-Andrews
ISSN: 2070-1721                                                      EFF
                                                             D. McCarney
                                                           Let's Encrypt
                                                               J. Kasten
                                                  University of Michigan
                                                              March 2019

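I am not sure whether the two are out of sync (or whether I misunderstand the process?), but either way there is now a standards document to refer to...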

Amazon S3 launches a Storage Class that analyzes and classifies objects automatically

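Amazon S3 has launched a new Storage Class that analyzes access patterns with an algorithm behind the scenes (so it has to run for a while before it takes effect) and then decides whether to place each object in Standard or Standard IA: "Announcing S3 Intelligent-Tiering — a New Amazon S3 Storage Class".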

It is a mix of Standard and Standard IA:

S3 Intelligent-Tiering stores objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access.

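Objects that have not been accessed for 30 consecutive days are moved to Standard IA, and are moved back as soon as they are accessed. The moves themselves are free of charge: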

For a small monthly monitoring and automation fee per object, S3 Intelligent-Tiering monitors access patterns and moves objects that have not been accessed for 30 consecutive days to the infrequent access tier. There are no retrieval fees in S3 Intelligent-Tiering. If an object in the infrequent access tier is accessed later, it is automatically moved back to the frequent access tier. No additional tiering fees apply when objects are moved between access tiers within the S3 Intelligent-Tiering storage class.

The pricing shows that the algorithm itself has a fee. Working it out, 1M objects comes to USD$2.5/month, which seems acceptable...

Monitoring and Automation, All storage / Month: $0.0025 per 1,000 objects
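The USD$2.5/month figure follows directly from the quoted rate. A small sketch of the arithmetic, where the function name is hypothetical and only the monitoring fee is modeled (storage and request charges are left out):

```python
# Rate taken from the pricing line quoted above (USD per month).
MONITORING_FEE_PER_1000_OBJECTS = 0.0025


def monthly_monitoring_fee(object_count):
    """Monitoring and automation fee in USD per month for object_count objects."""
    return object_count / 1000 * MONITORING_FEE_PER_1000_OBJECTS


print(f"USD ${monthly_monitoring_fee(1_000_000):.2f}/month for 1M objects")
```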

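There are quite a few patterns to watch out for, though. For example, it says here that files under 128KB will not be moved to IA, but it is unclear whether they still incur the Monitoring fee?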

S3 Intelligent-Tiering has a minimum eligible object size of 128KB for auto-tiering. Smaller objects may be stored but will always be charged at the Frequent Access tier rates.
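The rules quoted so far can be sketched as a small decision function. This is only a reading of the announcement text, not AWS's actual algorithm; the function and constant names are hypothetical, and 128KB is assumed to mean 128 * 1024 bytes.

```python
# Assumption: "128KB" in the announcement means 128 * 1024 bytes.
MIN_ELIGIBLE_BYTES = 128 * 1024
# From the announcement: 30 consecutive days without access.
IDLE_DAYS_BEFORE_DEMOTION = 30


def expected_tier(size_bytes, days_since_last_access):
    """Guess which access tier an object would sit in under the quoted rules."""
    if size_bytes < MIN_ELIGIBLE_BYTES:
        # Small objects are never auto-tiered; charged at Frequent Access rates.
        return "frequent"
    if days_since_last_access >= IDLE_DAYS_BEFORE_DEMOTION:
        return "infrequent"
    # Any access resets the idle counter and moves the object back.
    return "frequent"


print(expected_tier(1024 * 1024, 45))
print(expected_tier(64 * 1024, 45))
```

The sketch says nothing about whether small objects are billed the monitoring fee; the quoted text does not settle that.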

It is also unclear whether the 30 days mentioned here for S3 Intelligent-Tiering are counted across Standard + Standard IA combined, or separately for each:

S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA storage are charged for a minimum storage duration of 30 days.

Worth waiting and watching for now...

RFC 8446: TLS 1.3

RFC 8446 (The Transport Layer Security (TLS) Protocol Version 1.3) has been officially published, which means TLS 1.3 is now officially an IETF standard (Standards Track).

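Cloudflare wrote an article, "A Detailed Look at RFC 8446 (a.k.a. TLS 1.3)", describing the features of TLS 1.3. It is worth a read if you are interested, especially the 1-RTT part, which helps performance a lot (I will probably not consider 0-RTT for now because of the replay attack problem, at least until a reasonable defense model appears).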

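Another point is OpenSSL: the latest version is currently 1.1.0h, and it was decided early on that 1.1.1 would only be released once TLS 1.3 officially became a standard (see "OpenSSL 1.1.1 Will Support TLS 1.3"). That wait has dragged on for a year... Once support lands, a lot of software will be able to pick it up directly, so it is something to look forward to.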
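Once the underlying OpenSSL supports TLS 1.3, applications can opt in with very little code. A minimal sketch using Python's standard ssl module, assuming Python 3.7+ linked against an OpenSSL build with TLS 1.3 support; the function name is hypothetical and no connection is made:

```python
import ssl


def tls13_only_context():
    """Build a client-side context that refuses anything older than TLS 1.3."""
    if not ssl.HAS_TLSv1_3:
        # The linked OpenSSL is too old (TLS 1.3 needs OpenSSL 1.1.1 or newer).
        raise RuntimeError("this Python build has no TLS 1.3 support")
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = tls13_only_context()
print(ssl.OPENSSL_VERSION, ctx.minimum_version)
```

The context can then be passed to ctx.wrap_socket() or to an HTTP client that accepts an SSLContext.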

Notes on using GCP's f1-micro...

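I have been setting up a backup jump host these past few days. I did not want to put it in Japan (latency), which reminded me that Google Cloud Platform (GCP) has a data center in Taiwan and that Compute Engine offers the f1-micro, a machine similar to the t2.nano on AWS EC2. After playing with it for two days, a few things seem worth writing down.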

CPU-related notes:

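  • For EC2's t2 family, the remaining CPU credits are visible through the API or in the web console. On GCE I could not find where to see this.
  • When a t2 instance runs out of CPU credits it is slowed down, unless you turn on T2 Unlimited and agree to let AWS charge more. On GCE there is no choice; it is as if T2 Unlimited were always on.
  • GCE's f1-micro is 0.2 vCPU, but with Ubuntu 18.04 running on it, CPU usage is already around 15% when idle. That is much higher than expected, and I am still looking for the cause...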

Network-related notes:

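  • Because I want the Taiwan data center, only the Premium network tier is available (the Standard tier is currently offered only in the US). Traffic therefore travels over Google's own network before exiting, so egress charges vary by destination region (i.e. packets sent to the US and packets sent to China are priced differently).
  • That said, the measured quality of the Premium tier really is different: latency to Hong Kong is under 15ms, a number I never saw from inside a fixed-line carrier's data center...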

Other notes:

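  • For disk space, Standard provisioned space is considerably cheaper than EBS gp2, and the price includes I/O (AWS charges for that separately).
  • Sustained use earns a discount automatically, and you can also commit to one or three years for a deeper discount. On AWS you have to buy a Reserved Instance to get a discount.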

Let's see how big the bill turns out to be after a month...

AWS launches EC2 Fleet: mixing standard EC2, Spot and RI in one calculation

AWS has extended the existing EC2 Spot Fleet with EC2 Fleet. The calculation, which used to cover only Spot Instances, now takes standard EC2 Instances and RIs into account as well: "EC2 Fleet – Manage Thousands of On-Demand and Spot Instances with One Request".

Today we are extending and generalizing the set-it-and-forget-it model that we pioneered in Spot Fleet with EC2 Fleet, a new building block that gives you the ability to create fleets that are composed of a combination of EC2 On-Demand, Reserved, and Spot Instances with a single API call.

Some services are not integrated yet, though, mainly the parts related to auto scaling. These will probably all land together in one big batch:

We plan to connect EC2 Fleet and EC2 Auto Scaling groups. This will let you create a single fleet that mixed instance types and Spot, Reserved and On-Demand, while also taking advantage of EC2 Auto Scaling features such as health checks and lifecycle hooks. This integration will also bring EC2 Fleet functionality to services such as Amazon ECS, Amazon EKS, and AWS Batch that build on and make use of EC2 Auto Scaling for fleet management.

Once the integration is done, cutting costs will be even simpler...

CPU credits exceeding the quota on t2 instances

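While looking at CPU credits I noticed a t2.micro on EC2 whose CPU credits kept dropping even though nothing much was running on it, so I opened a ticket on our internal Trac to track it... and, as always with this kind of thing, the answer came to me the moment the ticket was opened @_@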

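The first thing I noticed was that these CPU credits exceeded the max quota of 144 (see the AWS document "CPU Credits and Baseline Performance"). That immediately reminded me that t2 instances are granted CPU credits at launch so that booting is not too slow (see the document "T2 Standard"). These launch credits are spent first, but do not count toward the max quota: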

Launch credits are spent first, before earned credits. Unspent launch credits are accrued in the CPU credit balance, but do not count towards the CPU credit balance limit.

There is also a per-account limit: each account can receive launch credits at most 100 times per region in any 24-hour period:

There is a limit to the number of times T2 Standard instances can receive launch credits. The default limit is 100 launches or starts of all T2 Standard instances combined per account, per region, per rolling 24-hour period.

New accounts may get a lower limit, which is raised over time based on usage:

New accounts may have a lower limit, which increases over time based on your usage.

So that explains the slow decline: until the balance reaches 144, it should keep trending downward...
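The decline can be reproduced with a toy simulation. This is a rough sketch of the rules quoted above, not AWS's exact accounting: the 144 cap comes from the text, while the earn rate of 6 credits/hour, the 30 launch credits and the hourly granularity are assumptions for illustration, and the function name is hypothetical.

```python
def simulate_balance(hours, usage_per_hour, earned=144.0, launch=30.0,
                     earn_rate=6.0, earned_cap=144.0):
    """Return the total CPU credit balance at the end of each hour.

    Assumed defaults: 6 credits earned per hour and 30 launch credits.
    Only the 144 cap is taken from the text above.
    """
    history = []
    for _ in range(hours):
        # Launch credits are spent first, before earned credits.
        from_launch = min(launch, usage_per_hour)
        launch -= from_launch
        earned = max(earned - (usage_per_hour - from_launch), 0.0)
        # Newly earned credits accrue, but only earned credits are capped;
        # unspent launch credits sit on top of the cap.
        earned = min(earned + earn_rate, earned_cap)
        history.append(earned + launch)
    return history


balances = simulate_balance(hours=40, usage_per_hour=1.0)
print(balances[0], balances[-1])
```

With a light, steady load the earned credits stay pinned at the cap while the launch credits drain, so the total drifts down to 144 and then stays flat.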

Amazon S3 launches a new class: One Zone-IA

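Amazon S3 has RRS, intended for data that can be regenerated if lost (such as thumbnails), and also IA, intended for infrequently accessed data. The newly launched class combines the two, which makes the price even lower: "Amazon S3 Update: New Storage Class and General Availability of S3 Select".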

New S3 One Zone-IA Storage Class – This new storage class is 20% less expensive than the existing Standard-IA storage class. It is designed to be used to store data that does not need the extra level of protection provided by geographic redundancy.