RTorrent 0.9.8

RTorrent 算是我在 command line 下很喜歡用的 bittorrent client,前幾天釋出 0.9.8 版,距離上一個版本 0.9.7 一年多了:「RTorrent release version 0.9.8」。

從 changelog 可以看到目前主要都還是一些維護性質的修改,像是 bugfix 以及對新的 library 的更新,功能增加的不多...

另外一個是最近提供了 donate 的管道:「Donate to rTorrent development」。在 2017 年的時候有寫信問他有沒有 Patreon,他當時說他人在日本沒辦法處理,看起來後來解決了...

紐約時報的 The Privacy Project 分析了這二十年來 Google 的隱私條款

紐約時報The Privacy Project 分析了 Google 在這二十年來的 Privacy Policy (英文版),可以看出網路廣告產業的變化,以及為什麼變得極力蒐集個資與使用者行為:「Google’s 4,000-Word Privacy Policy Is a Secret History of the Internet」。整篇看起來有點長,可以先看裡面的小標題,然後看一下列出來的條文差異,把不同時間的重點都列出來了。

最早期的轉變是「針對性」:

1999-2004
No longer talks about users ‘in aggregate’

1999 年的版本強調了整體性,後來因為針對性廣告而被拿掉:

1999
Google may share information about users with advertisers, business partners, sponsors, and other third parties. However, we only talk about our users in aggregate, not as individuals. For example, we may disclose how frequently the average Google user visits Google, or which other query words are most often used with the query word "Microsoft."

接下來的是蒐集的項目大幅增加,讓分析更準確:

2005-2011
Google shares more data for better targeting

然後是更多產品線互相使用使用者行為資訊:

2012-2017
Its complicated business requires a more complicated policy

接下來是因為法規而配合修改條文 (最有名的就是 GDPR):

2018-PRESENT
Policy adjusts to meet stricter regulation

Cloudflare 因為 Regular Expression 炸掉的問題

先前 Cloudflare 就有先說明七月二日的 outage 是因為 regular expression 造成的 (ReDoS),不過昨天發的文章更完整了,導致爆炸的 regular expression 都給出來了:「Details of the Cloudflare outage on July 2, 2019」。

ReDoS 不算是新的問題,但卻是不太好避免的問題,因為需要有經驗的工程師 (中過獎的工程師) 才比較容易知道哪些 regular expression 是有問題的... 另外就是有花時間研究 regular expression 演算法的工程師也比較容易避開。

也因次,ReDoS 算是這十年來大家在還的債,各家 framework 都因為這個問題改寫了不少 regular expression。

這次的重點在這串式子導致了 ReDoS:

(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))

通常容易中獎的地方就是無限制字元與 * & + 連發的地方,後面這塊 )*.*(?:.*=.*))) 看起來就不太妙,果然在後面的分析也有提到:

The critical part is .*(?:.*=.*).

以前應該是在 Formal language 裡學到的,在課堂裡面其實會學到不少業界常用工具的基礎理論...

AI 版的星海爭霸二將直接透過歐洲區的 Battle.net 匿名與人類對戰

前幾天 Blizzard 公佈的消息,DeepMind 的星海爭霸二 AI (AlphaStar) 將會透過 Blizzard 的 Battle.net 歐洲區伺服器跟人類對戰:「DeepMind Research on Ladder」。

Experimental versions of DeepMind’s StarCraft II agent, AlphaStar, will soon play a small number of games on the competitive ladder in Europe as part of ongoing research into AI.

預設是不會對到的,需要選擇參與:

If you would like the chance to help DeepMind with its research by matching against AlphaStar, you can opt in by clicking the “opt-in” button on the in-game popup window. You can alter your opt-in selection at any time by using the “DeepMind opt-in” button on the 1v1 Versus menu.

但你仍然不會知道對手是人還是 AI,而且如同一般對戰情況,這會影響到你的戰績:

For scientific test purposes, DeepMind will be benchmarking AlphaStar’s performance by playing anonymously during a series of blind trial matches. This means the StarCraft community will not know which matches AlphaStar is playing, to help ensure that all games are played under the same conditions. AlphaStar plays with built-in restrictions that the DeepMind team has defined in consultation with pro players. A win or a loss against AlphaStar will affect your MMR as normal.

okay,這樣大概知道為什麼只開放歐洲區了...

加州從今年七月開始,禁止 AI 偽裝成人類 (前幾天也有一些新聞在報導):「A California law now means chatbots have to disclose they’re not human」,對應的法條在「Bill Text - SB-1001 Bots: disclosure」這邊可以看到:

17941. (a) It shall be unlawful for any person to use a bot to communicate or interact with another person in California online, with the intent to mislead the other person about its artificial identity for the purpose of knowingly deceiving the person about the content of the communication in order to incentivize a purchase or sale of goods or services in a commercial transaction or to influence a vote in an election. A person using a bot shall not be liable under this section if the person discloses that it is a bot.

(b) The disclosure required by this section shall be clear, conspicuous, and reasonably designed to inform persons with whom the bot communicates or interacts that it is a bot.

而加州是 Blizzard Entertainment 的總部...

法條上面對「online platform」有設計排除條款,不過如果只算星海二的人數,有可能不到這個豁免限制... 所以得避開而改用歐洲區來測試?

(c) “Online platform” means any public-facing Internet Web site, Web application, or digital application, including a social network or publication, that has 10,000,000 or more unique monthly United States visitors or users for a majority of months during the preceding 12 months.

(c) This chapter does not impose a duty on service providers of online platforms, including, but not limited to, Web hosting and Internet service providers.

美國軍方應該是超級關注這個議題,相較於 AlphaGo 或是 AlphaZero 是資訊完全透明的遊戲,這次要踏入非對稱資訊的遊戲。

如果在這個領域上有成果的話,可以預期未來的戰爭 (yeah 實體戰爭) 會開始大量採用 AI 了...

把只有標題的 feed 轉成有全文的 feed

中央社華視新聞網都有 feed 可以訂閱 (參考「中央社 RSS資訊服務 訂閱」與「華視新聞網 - RSS訂閱」),但都不是全文 (公視新聞網的反而就直接提供全文了,不需要另外處理),在 Bazqux (一個 feed reader) 裡面讀起來頗麻煩的,想要把這些 feed 轉成有全文的 feed...

這東西不算難寫 (最重要的部份用 Readability 當關鍵字,很多程式語言裡都有 library 可以用),但總是想看看有沒有現成的服務可以直接用,就找到「Free Full RSS」這個了,把本來的 feed 網址丟進去,出來的就是轉成全文的網址了...

目前看起來還 ok,不知道穩定性如何,用一陣子再說...

Amazon 的 Elasticsearch 服務提供十四天免費 hourly snapshot

Amazon Elasticsearch Service 提供 14 天免費的 hourly snapshot:「Amazon Elasticsearch Service increases data protection with automated hourly snapshots at no extra charge」。

Amazon Elasticsearch Service has increased its snapshot frequency from daily to hourly, providing more granular recovery points. If you need to restore your cluster, you now have numerous, recent snapshots to choose from. These automated snapshots are retained for 14 days at no extra charge.

不過這是 5.3+ 版本才有,舊版只有 daily:

  • For domains running Elasticsearch 5.3 and later, Amazon ES takes hourly automated snapshots and retains up to 336 of them for 14 days.
  • For domains running Elasticsearch 5.1 and earlier, Amazon ES takes daily automated snapshots (during the hour you specify) and retains up to 14 of them for 30 days.

In both cases, the service stores the snapshots in a preconfigured Amazon S3 bucket at no additional charge. You can use these automated snapshots to restore domains.

算是方便管理...

市場上有很多 VPN 都是由中國公司在後面營運

在「Hidden VPN owners unveiled: 97 VPN products run by just 23 companies」這篇分析了 VPN 產業裡面背後的公司。

其中有兩個比較重要的事情,第一個是很多公司 (或是集團) 都擁有多個 VPN 品牌 (甚至有到十個品牌的),所以如果想要透過多家 VPN 分散風險時,在挑的時候要看一下:

另外一個是後面有多中國人或是中國公司在營運:

We discovered that a good amount of the free mobile-only VPNs are owned by Chinese companies, or companies run by Chinese nationals.

  • Innovative Connecting (10 VPN apps): Director Danian “Danny” Chen is a Chinese national (Chen’s LinkSure is the sole shareholder and shares the same address as Innovative Connecting)
  • Hotspot VPN (5 VPN apps): Director Zhu Jianpeng has a residential address in Heibei Province in China
  • Hi Security (3 apps): the VPN apps are part of Shenzhen HAWK Internet, a subsidiary of the Chinese major company TCL Corporation
  • SuperSoftTech (2 apps): while officially owned by Singapore-based SuperSoftTech, it actually belongs to independent app publisher Jinrong Zheng, a Chinese national based in Beijing.
  • LEILEI (2 apps): by the titles of the VPNs (all written in Chinese characters), it’s likely that this developer is Chinese or based in China
  • Newbreed Network Pte.Ltd (6 apps): again, while it has a Singapore address, the websites for its VPN apps SGreen VPN and NodeVPN are completely in Chinese, while NodeVPN’s site lists the People’s Republic of China as its location.

這些公司與產品都應該要直接避開... 在有能力的情況下,在 public cloud 上自己架設還是會比較保險。

Amazon 需要對賣出去的產品造成的傷害負責

前幾天還蠻引人注目的案件,Amazon 被判決要對平台商家透過 Amazon 平台賣出去的產品負責:「Federal appeals court says Amazon is liable for third-party sellers' products」。

這個案例裡面是消費者透過 Amazon 的平台,向上架的商家購買 hoverboard (懸浮滑板?),結果把消費者家給搞爆了:

Last year, a judge in Tennessee ruled the company was not liable for damages caused by a defective hoverboard that exploded, burning down a family's house.

目前最新的判決中指出,Amazon 在合約裡面簽訂消費者必須透過 Amazon 的平台跟賣家溝通,使得賣家與消費者之間沒有直接的管道可以處理爭議,所以 Amazon 不能免責:

"Amazon fails to account for the fact that under the Agreement, third-party vendors can communicate with the customers only through Amazon," the ruling states. "This enables third-party vendors to conceal themselves from the customer, leaving customers injured by defective products with no direct recourse to the third-party vendor."

這個判決看起來會影響蠻大的,因為這些條款就是希望維持平台業者可以從中獲利,現在反過來殺傷自身... 看起來上訴是跑不掉的?等幾個月後再回來看...

移除 Blog 上的 Google Analytics,改用 Matomo

跑了快一個月了,大概整理一下...

一直都有在規劃降低對 Google 服務的依賴性,最主要的是使用 DuckDuckGo 替代 Google Search (但搜尋的品質還是差一截,所以寫了一些工具幫助我在不滿意的時候可以快速切到 Google 搜尋:「在 DuckDuckGo 搜尋頁快速切換到 Google 的套件」)。

而最近在研究的另外一個服務是 Google Analytics,我只用很基本的功能 (像是熱門文章,作業系統與瀏覽器的比率這些很基本的資料),不需要對於觀看客群有了解 (這個需要像 Google Analytics 跨站蒐集資料),所以替代方案應該不難找...

憑著印象與一些關鍵字,找到了 Matomo,這是一套 open source 的 web analytics 服務,以前叫做 Piwik (參考「Piwik is now Matomo - Announcement」),整個系統用 PHP + MySQL 就可以打發 (反正量不大的東西不需要拿什麼神兵利器出來,MySQL 硬塞硬算就可以了),接著把本來 Google Analytics 的 js 換掉就行了...

跑了快一個月後感覺還 ok,基本的資訊都有...