英國新的紙鈔將會使用 Alan Turing 頭像

新版 50 英鎊的紙鈔將使用 Alan Turing 的頭像設計:「New face of the Bank of England's £50 note is revealed as Alan Turing」。

不知道要怎麼介紹 Alan Turing... 對於現代計算理論的貢獻、對於二戰盟軍的貢獻,以及對於人工智慧的貢獻都無與倫比,另外一方面,在 1952 年時因同性戀身份被定罪,1954 年時食用氰化物自殺過世,然後到了 2013 年議會爭論赦免的過程中,英國女皇決定直接行使赦免權。現在則是決定以他的頭像作為五十英鎊的人物。

既然靠這個圈子吃飯的,應該會蒐藏一張起來吧,紀念這位英雄...

Vim 的 Performance Profiling

在「Profiling Vim」這邊看到 Vim 上常見的效能問題,很多時候會覺得「慢」但不知道慢在哪邊的問題...

文章作者已經知道是開 Markdown 檔案時的問題,所以可以在開啟 Vim 後用下指令啟動 profiling,但如果是想要追蹤一開起來很慢的問題,可以看「對 Vim 啟動過程做效能分析」這篇的方式,直接在啟動時加上 --startuptime vim.log 這樣的語法,把 log 寫到 vim.log 裡面。

紐約時報的 The Privacy Project 分析了這二十年來 Google 的隱私條款

紐約時報The Privacy Project 分析了 Google 在這二十年來的 Privacy Policy (英文版),可以看出網路廣告產業的變化,以及為什麼變得極力蒐集個資與使用者行為:「Google’s 4,000-Word Privacy Policy Is a Secret History of the Internet」。整篇看起來有點長,可以先看裡面的小標題,然後看一下列出來的條文差異,把不同時間的重點都列出來了。

最早期的轉變是「針對性」:

1999-2004
No longer talks about users ‘in aggregate’

1999 年的版本強調了整體性,後來因為針對性廣告而被拿掉:

1999
Google may share information about users with advertisers, business partners, sponsors, and other third parties. However, we only talk about our users in aggregate, not as individuals. For example, we may disclose how frequently the average Google user visits Google, or which other query words are most often used with the query word "Microsoft."

接下來的是蒐集的項目大幅增加,讓分析更準確:

2005-2011
Google shares more data for better targeting

然後是更多產品線互相使用使用者行為資訊:

2012-2017
Its complicated business requires a more complicated policy

接下來是因為法規而配合修改條文 (最有名的就是 GDPR):

2018-PRESENT
Policy adjusts to meet stricter regulation

Cloudflare 因為 Regular Expression 炸掉的問題

先前 Cloudflare 就有先說明七月二日的 outage 是因為 regular expression 造成的 (ReDoS),不過昨天發的文章更完整了,導致爆炸的 regular expression 都給出來了:「Details of the Cloudflare outage on July 2, 2019」。

ReDoS 不算是新的問題,但卻是不太好避免的問題,因為需要有經驗的工程師 (中過獎的工程師) 才比較容易知道哪些 regular expression 是有問題的... 另外就是有花時間研究 regular expression 演算法的工程師也比較容易避開。

也因次,ReDoS 算是這十年來大家在還的債,各家 framework 都因為這個問題改寫了不少 regular expression。

這次的重點在這串式子導致了 ReDoS:

(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))

通常容易中獎的地方就是無限制字元與 * & + 連發的地方,後面這塊 )*.*(?:.*=.*))) 看起來就不太妙,果然在後面的分析也有提到:

The critical part is .*(?:.*=.*).

以前應該是在 Formal language 裡學到的,在課堂裡面其實會學到不少業界常用工具的基礎理論...

AI 版的星海爭霸二將直接透過歐洲區的 Battle.net 匿名與人類對戰

前幾天 Blizzard 公佈的消息,DeepMind 的星海爭霸二 AI (AlphaStar) 將會透過 Blizzard 的 Battle.net 歐洲區伺服器跟人類對戰:「DeepMind Research on Ladder」。

Experimental versions of DeepMind’s StarCraft II agent, AlphaStar, will soon play a small number of games on the competitive ladder in Europe as part of ongoing research into AI.

預設是不會對到的,需要選擇參與:

If you would like the chance to help DeepMind with its research by matching against AlphaStar, you can opt in by clicking the “opt-in” button on the in-game popup window. You can alter your opt-in selection at any time by using the “DeepMind opt-in” button on the 1v1 Versus menu.

但你仍然不會知道對手是人還是 AI,而且如同一般對戰情況,這會影響到你的戰績:

For scientific test purposes, DeepMind will be benchmarking AlphaStar’s performance by playing anonymously during a series of blind trial matches. This means the StarCraft community will not know which matches AlphaStar is playing, to help ensure that all games are played under the same conditions. AlphaStar plays with built-in restrictions that the DeepMind team has defined in consultation with pro players. A win or a loss against AlphaStar will affect your MMR as normal.

okay,這樣大概知道為什麼只開放歐洲區了...

加州從今年七月開始,禁止 AI 偽裝成人類 (前幾天也有一些新聞在報導):「A California law now means chatbots have to disclose they’re not human」,對應的法條在「Bill Text - SB-1001 Bots: disclosure」這邊可以看到:

17941. (a) It shall be unlawful for any person to use a bot to communicate or interact with another person in California online, with the intent to mislead the other person about its artificial identity for the purpose of knowingly deceiving the person about the content of the communication in order to incentivize a purchase or sale of goods or services in a commercial transaction or to influence a vote in an election. A person using a bot shall not be liable under this section if the person discloses that it is a bot.

(b) The disclosure required by this section shall be clear, conspicuous, and reasonably designed to inform persons with whom the bot communicates or interacts that it is a bot.

而加州是 Blizzard Entertainment 的總部...

法條上面對「online platform」有設計排除條款,不過如果只算星海二的人數,有可能不到這個豁免限制... 所以得避開而改用歐洲區來測試?

(c) “Online platform” means any public-facing Internet Web site, Web application, or digital application, including a social network or publication, that has 10,000,000 or more unique monthly United States visitors or users for a majority of months during the preceding 12 months.

(c) This chapter does not impose a duty on service providers of online platforms, including, but not limited to, Web hosting and Internet service providers.

美國軍方應該是超級關注這個議題,相較於 AlphaGo 或是 AlphaZero 是資訊完全透明的遊戲,這次要踏入非對稱資訊的遊戲。

如果在這個領域上有成果的話,可以預期未來的戰爭 (yeah 實體戰爭) 會開始大量採用 AI 了...

Fabrice Bellard 的 QuickJS

Fabrice Bellard 跑去寫了一套 JavaScript engine 出來:「QuickJS」。

以 ES2019 當底實做的 JS engine:

Almost complete ES2019 support including modules, asynchronous generators and full Annex B support (legacy web compatibility).

測試的部份也過了:

Passes 100% of the ECMAScript Test Suite.

在大小的部份,比起其他的 engine (與 package) 來說的確是小很多,不過 190KB 這個大小對於 embedded system 來說還是有點微妙 (但對於想要包 JS engine 進去用的人應該是頗開心的):

Small and easily embeddable: just a few C files, no external dependency, 190 KiB of x86 code for a simple hello world program.

不愧是 Fabrice Bellard,搞出了 LZEXEFFmpegQEMU 後跑來搞 JS...

把只有標題的 feed 轉成有全文的 feed

中央社華視新聞網都有 feed 可以訂閱 (參考「中央社 RSS資訊服務 訂閱」與「華視新聞網 - RSS訂閱」),但都不是全文 (公視新聞網的反而就直接提供全文了,不需要另外處理),在 Bazqux (一個 feed reader) 裡面讀起來頗麻煩的,想要把這些 feed 轉成有全文的 feed...

這東西不算難寫 (最重要的部份用 Readability 當關鍵字,很多程式語言裡都有 library 可以用),但總是想看看有沒有現成的服務可以直接用,就找到「Free Full RSS」這個了,把本來的 feed 網址丟進去,出來的就是轉成全文的網址了...

目前看起來還 ok,不知道穩定性如何,用一陣子再說...

Amazon 的 Elasticsearch 服務提供十四天免費 hourly snapshot

Amazon Elasticsearch Service 提供 14 天免費的 hourly snapshot:「Amazon Elasticsearch Service increases data protection with automated hourly snapshots at no extra charge」。

Amazon Elasticsearch Service has increased its snapshot frequency from daily to hourly, providing more granular recovery points. If you need to restore your cluster, you now have numerous, recent snapshots to choose from. These automated snapshots are retained for 14 days at no extra charge.

不過這是 5.3+ 版本才有,舊版只有 daily:

  • For domains running Elasticsearch 5.3 and later, Amazon ES takes hourly automated snapshots and retains up to 336 of them for 14 days.
  • For domains running Elasticsearch 5.1 and earlier, Amazon ES takes daily automated snapshots (during the hour you specify) and retains up to 14 of them for 30 days.

In both cases, the service stores the snapshots in a preconfigured Amazon S3 bucket at no additional charge. You can use these automated snapshots to restore domains.

算是方便管理...

Idempotent Bash Script

看到「How to write idempotent Bash scripts」這篇,重點在講 Idempotence,這個詞是從數學上借來的,講重複操作的不動性:

An element x of a magma (M, •) is said to be idempotent if:

x • x = x.

If all elements are idempotent with respect to •, then • is called idempotent. The formula ∀x, x • x = x is called the idempotency law for •.

在 CS 領域也是一樣的概念:

Idempotence (UK: /ˌɪdɛmˈpoʊtəns/, US: /ˌaɪdəm-/) is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application.

而這篇講的是 Bash 上有些常見的行為要怎麼改成 Idempotence:

It happens a lot, you write a bash script and half way it exits due an error. You fix the error in your system and run the script again. But half of the steps in your scripts fail immediately because they were already applied to your system. To build resilient systems you need to write software that is idempotent.

一個常見的例子是 cron job 是否可以重複執行的問題。

如果 cron job 裡的程式都是 idempotent,那麼就不需要擔心重跑會因為前一隻 script 產生的環境而失敗,導致無限循環而需要人介入...

另外一個更進階的是同時有兩個 process 在執行同一個 script (可能在不同機器上),這也是要考慮的問題,不過這個問題在大多數情況下有各種 lock 系統可以協助避免,應該不是太大的問題...

市場上有很多 VPN 都是由中國公司在後面營運

在「Hidden VPN owners unveiled: 97 VPN products run by just 23 companies」這篇分析了 VPN 產業裡面背後的公司。

其中有兩個比較重要的事情,第一個是很多公司 (或是集團) 都擁有多個 VPN 品牌 (甚至有到十個品牌的),所以如果想要透過多家 VPN 分散風險時,在挑的時候要看一下:

另外一個是後面有多中國人或是中國公司在營運:

We discovered that a good amount of the free mobile-only VPNs are owned by Chinese companies, or companies run by Chinese nationals.

  • Innovative Connecting (10 VPN apps): Director Danian “Danny” Chen is a Chinese national (Chen’s LinkSure is the sole shareholder and shares the same address as Innovative Connecting)
  • Hotspot VPN (5 VPN apps): Director Zhu Jianpeng has a residential address in Heibei Province in China
  • Hi Security (3 apps): the VPN apps are part of Shenzhen HAWK Internet, a subsidiary of the Chinese major company TCL Corporation
  • SuperSoftTech (2 apps): while officially owned by Singapore-based SuperSoftTech, it actually belongs to independent app publisher Jinrong Zheng, a Chinese national based in Beijing.
  • LEILEI (2 apps): by the titles of the VPNs (all written in Chinese characters), it’s likely that this developer is Chinese or based in China
  • Newbreed Network Pte.Ltd (6 apps): again, while it has a Singapore address, the websites for its VPN apps SGreen VPN and NodeVPN are completely in Chinese, while NodeVPN’s site lists the People’s Republic of China as its location.

這些公司與產品都應該要直接避開... 在有能力的情況下,在 public cloud 上自己架設還是會比較保險。