AWS 昨天公告了 84 個 /16 的 IPv4 位置

Hacker News 首頁上看到 AWS 昨天公告了 84 個 /16 (IPv4):「AWS adds an extra 5.5M IPv4 addresses (github.com/seligman)」。

這使得 AWS 在整個可用的 IPv4 network 佔的空間從 1.61045% 上升到了 1.75915%,不確定這 84 個 /16 花了多少錢買...

另外看到國外 ISP 的一些作法,發 CGNAT 的 IPv4 位置,以及實際的 IPv6 位置,這樣對於有支援 IPv6 的應用就可以反連回去:

When I lived in Ireland I only got a public IPv6, my IPv4 was behind CG-NAT. The nerd in me wasn't a fan of that on paper, but in reality I didn't have any issues with it.

家裡的第四台還是 IPv4 only 啊 (至少不是 CGNAT,之前被換到 CGNAT 時有去幹繳過),要連 IPv6 資源目前還是只能透過 6to4 去摸,看起來是連到香港的 HE,速度普普通通...

Cloudflare 遞出 S-1

Cloudflare 遞出 Form S-1:「S-1」。TechCrunch 有些整理:「Cloudflare files for initial public offering」。

一樣是燒錢狀態上市,不過相較於其他家燒的速度算是慢的,但成長速度很驚人:

As far as money goes, Cloudflare is — like other early-stage technology companies — losing money. But it’s not losing that much money, and its growth is impressive.

預定在 NYSE 上使用代碼「NET」:

The company will trade on the New York Stock Exchange under the ticker symbol “NET.”

不知道為什麼有種泡沫感...

AI 版的星海爭霸二將直接透過歐洲區的 Battle.net 匿名與人類對戰

前幾天 Blizzard 公佈的消息,DeepMind 的星海爭霸二 AI (AlphaStar) 將會透過 Blizzard 的 Battle.net 歐洲區伺服器跟人類對戰:「DeepMind Research on Ladder」。

Experimental versions of DeepMind’s StarCraft II agent, AlphaStar, will soon play a small number of games on the competitive ladder in Europe as part of ongoing research into AI.

預設是不會對到的,需要選擇參與:

If you would like the chance to help DeepMind with its research by matching against AlphaStar, you can opt in by clicking the “opt-in” button on the in-game popup window. You can alter your opt-in selection at any time by using the “DeepMind opt-in” button on the 1v1 Versus menu.

但你仍然不會知道對手是人還是 AI,而且如同一般對戰情況,這會影響到你的戰績:

For scientific test purposes, DeepMind will be benchmarking AlphaStar’s performance by playing anonymously during a series of blind trial matches. This means the StarCraft community will not know which matches AlphaStar is playing, to help ensure that all games are played under the same conditions. AlphaStar plays with built-in restrictions that the DeepMind team has defined in consultation with pro players. A win or a loss against AlphaStar will affect your MMR as normal.

okay,這樣大概知道為什麼只開放歐洲區了...

加州從今年七月開始,禁止 AI 偽裝成人類 (前幾天也有一些新聞在報導):「A California law now means chatbots have to disclose they’re not human」,對應的法條在「Bill Text - SB-1001 Bots: disclosure」這邊可以看到:

17941. (a) It shall be unlawful for any person to use a bot to communicate or interact with another person in California online, with the intent to mislead the other person about its artificial identity for the purpose of knowingly deceiving the person about the content of the communication in order to incentivize a purchase or sale of goods or services in a commercial transaction or to influence a vote in an election. A person using a bot shall not be liable under this section if the person discloses that it is a bot.

(b) The disclosure required by this section shall be clear, conspicuous, and reasonably designed to inform persons with whom the bot communicates or interacts that it is a bot.

而加州是 Blizzard Entertainment 的總部...

法條上面對「online platform」有設計排除條款,不過如果只算星海二的人數,有可能不到這個豁免限制... 所以得避開而改用歐洲區來測試?

(c) “Online platform” means any public-facing Internet Web site, Web application, or digital application, including a social network or publication, that has 10,000,000 or more unique monthly United States visitors or users for a majority of months during the preceding 12 months.

(c) This chapter does not impose a duty on service providers of online platforms, including, but not limited to, Web hosting and Internet service providers.

美國軍方應該是超級關注這個議題,相較於 AlphaGo 或是 AlphaZero 是資訊完全透明的遊戲,這次要踏入非對稱資訊的遊戲。

如果在這個領域上有成果的話,可以預期未來的戰爭 (yeah 實體戰爭) 會開始大量採用 AI 了...

GitHub 推出 Package Registry

GitHub 推出了「GitHub Package Registry」,可以代管自己開發的軟體。

目前支援 npm (Node.js)、DockerMaven (Java)、NuGet (.NET)、RubyGems (Ruby) 五個平台,

隔壁 GitLab 說我們早就有了而氣噗噗中:「Packaging now standard, dependency proxy next?」。

Anyway,省下一些事情,以前會透過 CI/CD 丟到像是 packagecloud.io 這樣的服務上讓內部使用,現在看起來 GitHub 上面可以解決一些簡單的情境...

分析 FCC 對網路中立性的留言,將鄉民與機器人分開來分析

Boing Boing 的立場其實還蠻鮮明的,所以有時候他們的新聞看看就好...

但這篇真的很有趣,把 FCC (美國的聯邦通信委員會) 上兩千兩百萬則對網路中立性的留言拿出來分析,結果發現真人與機器人的差異超明顯 XDDD:「Analysis of 22 million FCC comments show that humans love Net Neutrality and bots really, really hate it」,引用的文章在「Discovering truth through lies on the internet - FCC comments analyzed」這邊。

分析可以發現,真人偏好網路中立,而機器人反對網路中立 XDDD:(其實大家心裡都有底是怎麼玩出來的... 只是這次有機會分析,讓事情更明顯)

Data analysis company Gravwell ingested 22,000,000 comments sent to the FCC's docket on Net Neutrality and posted their preliminary findings, which are that the majority of comments came from bots, and these bots oppose Net Neutrality; of the comments that appear to originate with humans, the vast majority favor Net Neutrality.

文章中列出幾個有趣的現象,像是機器人的 comment 大量重複:

A very small minority of comments are unique -- only 17.4% of the 22,152,276 total. The highest occurrence of a single comment was over 1 million.

甚至是 pornhub.com 的郵件位置 XDDD:

Most comments were submitted in bulk and many come in batches with obviously incorrect information -- over 1,000,000 comments in July claimed to have a pornhub.com email address

然後機器人的 pattern 也很容易辨別:

Bot herders can be observed launching the bots -- there are submissions from people living in the state of "{STATE}" that happen minutes before a large number of comment submissions

這個有點類似「50 Cent Party (五毛黨)」,但是是自動化機器人,而且產出的「品質」不太好 XDDD

電信商對 Zero Rating 與網路中立性的問題

在「AT&T users will be able to stream DirecTV Now without using their data」這邊才看到 FCC 在這個月月初針對電信商對特定服務的 zero rating 發出警告:「The FCC tells AT&T it may be violating net neutrality with its DirecTV plans」:

AT&T is far from the only US carrier to zero rate data. T-Mobile has been ostentatiously offering free data for music and movies for a year now, and Verizon also zero rates video from its Go90 app. But in zero rating DirecTV, the FCC thinks AT&T may have gone too far.

AT&T 說任何人只要付錢都可以參加這個 plan:

AT&T’s argument is that any company that participates in its Sponsored Data program has to pay AT&T for it, and that includes DirecTV.

但問題還是在 AT&T 擁有 DirecTV,所以是左手付到右手:

Except, again, AT&T owns DirecTV, so even if one division is paying another, the overall company still ends up not paying any money.

而且這筆金額其實不小:

The situation for other companies is very different — and the FCC believes that the price they’d have to pay is “significant[.]”

不過總統快換人了,很有可能會往更糟的方向前進...

Perl 上的 Monkey Patch

這整個週末都在跟 Net::DNS 奮戰 edns-client-subnet,遇到模組內的一小段程式有 bug,先用 monkey patch 硬上,之後再看看要怎麼丟 patch 回 upstream。

monkey patch 的方法主要是參考「How can I monkey-patch an instance method in Perl?」這邊提供的方法而來的。

由於實際的行為是 subroutine redefined (會產生警告訊息),所以要局部關掉 warnings,然後再把整個 subroutine 換掉:

use BugPackage;

{
    no warnings;
    local *BugPackage::bug_function = sub {
        # new code
    };
}

這樣可以在不修改原始模組程式碼的情況下抽換。

Stack Overflow 公開 2016 的架構

Stack Overflow 公開了 2016 年現在的系統架構:「Stack Overflow: The Architecture - 2016 Edition」。

Stack Overflow 的重要性可以從前陣子 Twitter 上流傳的一張讓大家笑的很開心的圖看出來:

身為目前「程序猿」(!) 最重要的 debug (!!) 資料來源,而且是目前少數用 ASP.NETMicrosoft SQL Server 作為網站與資料庫的架構,並且是放在傳統 IDC 機房而非 Cloud Service 的知名網站,大家也很好奇他們是怎麼堆出來的。

上次公開 Stack Overflow 的系統架構是 2013 年年底了 (參考當時寫的「Stack Overflow 的現況...」這篇),這份更新距離上次兩年多了,也有很多可以交叉比較的事情。

比較有趣的是效能的提昇的說明,本來以為會是說因為我們改善程式碼的效率或是其他類似的理由,結果居然直接說是因為買新機器了 XDDD:

You may be wondering about the drastic ASP.Net reduction in processing time compared to 2013 (which was 757 hours) despite 61 million more requests a day. That’s due to both a hardware upgrade in early 2015 as well as a lot of performance tuning inside the applications themselves.

另外覺得比較有趣的是 CiscoASR-1001ASR-1001-x,不知道是什麼理由選擇這個系列,改天找 Cisco 的朋友問問看好了...

另外他們的 Websockets 也拿來做有趣的事情:

We use websockets to push real-time updates to users such as notifications in the top bar, vote counts, new nav counts, new answers and comments, and a few other bits.

另外他們也發現有些瀏覽器連線已經連 18 個月了 (喂喂),也許應該去看一下人是不是還活著:

Fun fact: some of those browsers have been open for over 18 months. We’re not sure why. Someone should go check if those developers are still alive.

我猜是 production server 上開瀏覽器查資料後沒關掉,就一直連著...

取得某些 tld 所有的 Domain Name

在「How to Download a List of All Registered Domain Names」這邊介紹了 .com、.net 以及 .name 的 zone file 怎麼申請與下載 (由 Verisign 維護的三個 tld)。

另外也介紹了 Centralized Zone Data Service,包括所有的 tld,以及 ICANN 提供的 API。

這樣應該有很多資料可以分析?

網路黑市的歷史資料

在「Black-market archives」這篇給出了一份很寶貴的資料,是來自於 Tor hidden service 上的 Dark Net Markets (DNM)。

這份資料涵蓋了 2013 到 2015 年的各種紀錄:

From 2013-2015, I scraped/mirrored on a weekly or daily basis all existing English-language DNMs as part of my research into their usage, lifetimes/characteristics, & legal riskiness; these scrapes covered vendor pages, feedback, images, etc.

大約壓縮後 50GB 的資料:

This uniquely comprehensive collection is now publicly released as a 50GB (~1.6TB) collection covering 89 DNMs & 37+ related forums, representing <4,438 mirrors, and is available for any research.

Tor 的 hidden service 應該只會愈來愈流行,初期的這些資料會讓後人有很多題材可以分析...