AWS now offers VPC Traffic Mirroring

In a data center you could use port mirroring on a switch to inspect traffic while debugging. AWS now offers a similar feature, VPC Traffic Mirroring: "New – VPC Traffic Mirroring – Capture & Inspect Network Traffic".

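This means the techniques that relied on switch port mirroring in traditional data centers can be rebuilt on AWS. Not too surprisingly, a batch of partners offered services in the first wave, and some companies shared their experience.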

VPC Traffic Mirroring is also more flexible than port mirroring on a switch: instead of being tied to a physical port, you choose the specific ENIs to use as sources:

Mirror Source – An AWS network resource that exists within a particular VPC, and that can be used as the source of traffic. VPC Traffic Mirroring supports the use of Elastic Network Interfaces (ENIs) as mirror sources.

Besides an ENI, the mirrored traffic can also be sent to an NLB:

Mirror Target – An ENI or Network Load Balancer that serves as a destination for the mirrored traffic. The target can be in the same AWS account as the Mirror Source, or in a different account for implementation of the central-VPC model that I mentioned above.

As you would expect, packets can be filtered:

Mirror Filter – A specification of the inbound or outbound (with respect to the source) traffic that is to be captured (accepted) or skipped (rejected). The filter can specify a protocol, ranges for the source and destination ports, and CIDR blocks for the source and destination. Rules are numbered, and processed in order within the scope of a particular Mirror Session.
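The "numbered, and processed in order" behavior can be sketched in a few lines of Python. This is only an illustration of first-match evaluation, not the AWS API: the `FilterRule` class, its fields, and the `decide` function are hypothetical names, and the fallback of rejecting a packet that matches no rule is an assumption of this sketch.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class FilterRule:
    """Hypothetical stand-in for one mirror filter rule (not the AWS API shape)."""
    number: int                                   # lower numbers are evaluated first
    action: str                                   # "accept" or "reject"
    protocol: Optional[str] = None                # None matches any protocol
    src_cidr: str = "0.0.0.0/0"
    dst_cidr: str = "0.0.0.0/0"
    dst_ports: Optional[Tuple[int, int]] = None   # inclusive range; None matches any port

    def matches(self, protocol: str, src_ip: str, dst_ip: str, dst_port: int) -> bool:
        if self.protocol is not None and self.protocol != protocol:
            return False
        if ip_address(src_ip) not in ip_network(self.src_cidr):
            return False
        if ip_address(dst_ip) not in ip_network(self.dst_cidr):
            return False
        if self.dst_ports is not None:
            low, high = self.dst_ports
            if not low <= dst_port <= high:
                return False
        return True


def decide(rules: Iterable[FilterRule], protocol: str, src_ip: str,
           dst_ip: str, dst_port: int) -> str:
    """Return the action of the lowest-numbered rule that matches the packet."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.matches(protocol, src_ip, dst_ip, dst_port):
            return rule.action
    # Assumption of this sketch: a packet that matches no rule is not mirrored.
    return "reject"
```

With a rule 100 that rejects TCP port 22 and a rule 200 that accepts all TCP, SSH packets are skipped while other TCP traffic is mirrored, regardless of the order in which the rules are passed in.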

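Then there is the notion of a session (judging from the description, this is the binding of a source, a target, and a filter, rather than a stateful connection):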

Traffic Mirror Session – A connection between a mirror source and target that makes use of a filter. Sessions are numbered, evaluated in order, and the first match (accept or reject) is used to determine the fate of the packet. A given packet is sent to at most one target.

This launch also covers almost all regions right away, and the price does not look too high:

VPC Traffic Mirroring is available now and you can start using it today in all commercial AWS Regions except Asia Pacific (Sydney), China (Beijing), and China (Ningxia). Support for those regions will be added soon. You pay an hourly fee (starting at $0.015 per hour) for each mirror source; see the VPC Pricing page for more info.

How Instagram sped up video publishing

No magic here; it is really a change to the product-level spec (although it was published on Instagram Engineering): "Video Upload Latency Improvements at Instagram".

In the original version, a video could be published only after every format had finished transcoding.

The spec was then changed so that a video can be published as soon as the highest-quality version has finished transcoding:

The idea is, instead of blocking until all video versions are available, we can publish the video once the highest-quality video version is available.

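Next comes uploading the video in segments, so that the part already uploaded can be processed while the rest is still in transit, which turns the flow into a pipeline. The cost is more complex code, plus a forced change in how video quality metrics are computed: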

Segmented uploads reduce upload latency in many cases but come with a few tradeoffs. For instance, segmented uploads increase the complexity of the pipeline. There are some quality metrics that are only available per segment at transcode time, such as SSIM. These metrics are not helpful to us on a per segment basis. Therefore, we need to do a duration weighted average of the SSIM of all segments to come up with the SSIM of the whole video. Similarly, handling exceptions is more complex since there are more cases to handle.
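The duration-weighted average mentioned in the quote can be written out as a short Python sketch. The function name and the `(duration, ssim)` input shape are assumptions made for illustration; the article does not show Instagram's actual code.

```python
from typing import Iterable, Tuple


def duration_weighted_ssim(segments: Iterable[Tuple[float, float]]) -> float:
    """Combine per-segment SSIM values into one value for the whole video.

    `segments` is an iterable of (duration_seconds, ssim) pairs; this input
    shape is a hypothetical choice for the sketch.
    """
    total_duration = 0.0
    weighted_sum = 0.0
    for duration, ssim in segments:
        if duration <= 0:
            raise ValueError("segment duration must be positive")
        total_duration += duration
        weighted_sum += duration * ssim
    if total_duration == 0:
        raise ValueError("at least one segment is required")
    # Longer segments contribute proportionally more to the final score.
    return weighted_sum / total_duration
```

For example, a 10-second segment at 0.90 and a 30-second segment at 0.98 combine to (10 × 0.90 + 30 × 0.98) / 40 = 0.96, rather than the unweighted mean of 0.94.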

There is also a special case: if the uploaded video already meets the server's spec, it can be passed straight through (but wouldn't that raise security concerns...):

Another performance optimization we use to improve the upload latency and save CPU utilization is something we call a “passthrough” upload. In some cases, the media that is uploaded is already ready for playback on most devices.

These are all approaches one could come up with, and each carries a tradeoff rather than being a purely positive improvement :o

A fake news generator and detector

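Seen on Hacker News: this is about "generating news with neural networks" (that is, mass-producing fake news by program), and this time the results cover both "generation" and "detection": "Grover – A State-of-the-Art Defense Against Neural Fake News (allenai.org)".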

The demo site is at "Grover - A State-of-the-Art Defense against Neural Fake News", and there is also a paper to read, "Defending Against Neural Fake News".

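A few months ago, OpenAI used neural networks to build a program that "writes news automatically". At the time they said the results were good enough that they decided not to release everything: "Better Language Models and Their Implications". For Chinese coverage, see this iThome article: "AI文字產生技術引發假新聞爭議,OpenAI決定只公開部份技術成果".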

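Now The Allen Institute for Artificial Intelligence has reproduced OpenAI's result with a model named Grover. They found that the trained model can be used not only to write news but also to detect whether an article was machine-generated, and in their own tests the detection accuracy is quite high: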

To study and detect neural fake news, we built a model named Grover. Our study presents a surprising result: the best way to detect neural fake news is to use a model that is also a generator. The generator is most familiar with its own habits, quirks, and traits, as well as those from similar AI models, especially those trained on similar data, i.e. publicly available news. Our model, Grover, is a generator that can easily spot its own generated fake news articles, as well as those generated by other AIs. In a challenging setting with limited access to neural fake news articles, Grover obtains over 92% accuracy at telling apart human-written from machine-written news. Please read our publication for more information.

The source code and model do not seem to have been released yet, but an open source clone looks bound to appear sooner or later...

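This reminds me of the information-control warfare in the Ghost in the Shell TV anime. The anime does not go into much detail, but tools like this feel like one piece of it...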

Finding matching avatars for your iPhone contacts

Saw this a few days ago: "Announcing Vignette". It uses data from social networks to update the pictures in your contact list.

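It is hard to find through App Store search. I first searched for "Vignette" and could not find it, but "Vignette Update" works. Alternatively, you can open the App Store directly through the link the author provides: "Vignette – Update Contact Pics".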

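This is a paid service using an in-app purchase (IAP). Scanning is free, but writing the updates back to your contacts costs USD$4.99 (one-time). On a Taiwan account the price is TWD$170, probably because of the recent tax adjustment: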

Vignette allows you to scan your contacts and see what it can find for free. If you wish to actually save these updates to your contact list, you must pay for a one-time in-app purchase. That purchase costs $4.99, is not a subscription, and is the only in-app purchase.

The search covers Gravatar, Twitter, Facebook, and Instagram:

Email is used for Gravatar
Twitter
Facebook
A custom network called Instagram

The author also mentions that the app does not send data to a server; all lookups are made from your own device to the social networks listed above:

Privacy is paramount
All the processing is done on-device; this isn’t the sort of app where your contacts are uploaded en masse to some server, and out of your control.

So it will not be very fast, but it is better for privacy...

SpaceX gets FCC approval to build its consumer satellite internet network

Seen in "FCC approves SpaceX’s plans to fly internet-beaming satellites in a lower orbit":

The Federal Communications Commission has approved SpaceX’s request to fly a large swath of its future internet-beaming satellites at a lower orbit than originally planned.

The plan is to launch more than four thousand low-orbit satellites (4,425 under the original agreement):

Under SpaceX’s original agreement with the commission, the company had permission to launch 4,425 Starlink satellites into orbits that ranged between 1,110 to 1,325 kilometers up. But then SpaceX decided it wanted to fly 1,584 of those satellites in different orbits, thanks to what it had learned from its first two test satellites, TinTin A and B. Instead of flying them at 1,150 kilometers, the company now wants to fly them much lower at 550 kilometers.

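No idea what price range it will land in... If it is cheap enough, perhaps it could be considered as a connectivity option for remote areas? At least as a backup...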

DynamoDB now has its own published IP address ranges too

AWS announced that DynamoDB now has its own published IP address ranges: "AWS specifies the IP address ranges for Amazon DynamoDB endpoints". This gives people who use IP-based firewalls a bit more control.

The data can be fetched from https://ip-ranges.amazonaws.com/ip-ranges.json; the entries whose service is DYNAMODB are the ones you want.
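Picking those entries out of the file can be sketched as follows. The sketch assumes the JSON layout of a top-level `prefixes` list whose items carry `ip_prefix`, `region`, and `service` keys; the function names `fetch_ip_ranges` and `service_prefixes` are hypothetical, and the optional `region` filter is an addition for illustration.

```python
import json
from typing import List, Optional
from urllib.request import urlopen

IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def fetch_ip_ranges(url: str = IP_RANGES_URL) -> dict:
    """Download and parse the published ranges file (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


def service_prefixes(ip_ranges: dict, service: str = "DYNAMODB",
                     region: Optional[str] = None) -> List[str]:
    """Return the IPv4 CIDR blocks listed for one service, optionally one region."""
    return [
        entry["ip_prefix"]
        for entry in ip_ranges.get("prefixes", [])
        if entry.get("service") == service
        and (region is None or entry.get("region") == region)
    ]
```

Calling `service_prefixes(fetch_ip_ranges(), region="us-east-1")` would then give a list of CIDR strings suitable for a firewall allowlist; the region name here is only an example.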

Since I did not see any IPv6 addresses there, I realized that DynamoDB currently does not offer an IPv6 endpoint...

Termshark, used from the command line

Came across the Termshark project; the code is at gcla/termshark.

Like tshark, it runs in the terminal, but its interface is much friendlier than tshark's. The documentation shows that it relies on tshark for the analysis:

Note that tshark is a run-time dependency, and must be in your PATH for termshark to function. Version 1.10.2 or higher is required (approx 2013).

Pi-hole for blocking ads

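Pi-hole is a project that has been getting more and more popular lately. Technically, it blocks unwanted domain names through DNS, which usually means blocking all kinds of tracking and ad systems.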

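Because it blocks at the DNS level, it is naturally not as effective as uBlock Origin, which parses the page content directly. On convenience, though, it wins by a wide margin: set it up once on the network equipment and every device benefits.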
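The general idea of DNS-level blocking can be sketched in Python. This is not Pi-hole's implementation: `is_blocked` and `resolve` are hypothetical names, treating subdomains of a listed domain as blocked is a simplification chosen for this sketch, and answering with `0.0.0.0` is just one common way to sink a query.

```python
from typing import Callable, Set

SINKHOLE_ADDRESS = "0.0.0.0"  # placeholder answer returned for blocked names


def normalize(name: str) -> str:
    """Lowercase a DNS name and drop the trailing root dot, if any."""
    return name.rstrip(".").lower()


def is_blocked(query_name: str, blocklist: Set[str]) -> bool:
    """Check the queried name and each of its parent domains against the list."""
    labels = normalize(query_name).split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def resolve(query_name: str, blocklist: Set[str],
            upstream: Callable[[str], str]) -> str:
    """Answer blocked names locally; forward everything else upstream."""
    if is_blocked(query_name, blocklist):
        return SINKHOLE_ADDRESS
    return upstream(query_name)
```

Because the decision depends only on the hostname, a resolver like this cannot tell an ad from regular content served by the same domain, which is the limitation compared with a browser-side blocker.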

I just read "How a Single Raspberry Pi made my Home Network Faster", which shows that Pi-hole has a decent dashboard to look at (to make you feel good about yourself? XD).

After running it for a month, the article's author also admits that some things still break and need a few whitelist entries to work:

Review after 1 month in operation
The Pi-Hole has been running for 1 month now on my home network. I have had to whitelist 1 or 2 URLs which was blocking a reset of an Alexa which had an issue, and a video conferencing system had all sorts of tracking and metrics built in which were causing some havoc until I whitelisted them. Otherwise, the Pi has been chugging along at 8% memory utilization, and the network is considerably faster when surfing the web.

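It should be fine for tinkering on your own, but in an office quite a few things would probably break... (although the article's author seems to want to do just that)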

Showing the AS number or region of each hop in MTR

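This is somewhat related to the post about running SSH through Socks5 on a Mac. While testing in Thailand, I found that besides the standard traceroute output, MTR can also pull the AS number or region information for each hop. It is not necessarily accurate (the lookup is based on the IP address), but it is a handy way to get this data as a rough reference.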

-z pulls the AS number (although I have no idea what is going on in the manpage XDDD):

       -z, --aslookup
              MISSING

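The other one is -y, which also lacks any usage description, but since the argument is written as n you can guess it is a number. Actual testing suggests it is somewhat related to the GeoIP package...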

-y 1 is the IP network block (such as 168.95.0.0/16), -y 2 is the region (such as TW or US), -y 3 is the NIC (registry) that manages it (such as apnic), and -y 4 is the date it was last updated:

       -y n, --ipinfo n
              MISSING

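Combined with -b, you can see the hostname and the IP address at the same time, which makes the information fairly complete. Also, the MTR built by Homebrew on a Mac did not show these features in my tests. I have not spent time tracking that down yet; everything here was tested with the version on Ubuntu...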

Making SSH go through Socks5 on a Mac

The hotel I stayed at in Thailand provided fairly fast internet.

But traffic to HiNet looks like it detours through somewhere like the US?

gslin@Gea-Suans-MacBook-Pro [~] [08:16/W4] mtr --report 168.95.1.1
Start: 2019-04-07T08:16:33+0700
HOST: Gea-Suans-MacBook-Pro.local Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 10.10.20.1                 0.0%    10    1.8   2.0   1.3   3.1   0.6
  2.|-- node-iyp.pool-101-108.dyn  0.0%    10    3.9   3.6   2.7   4.5   0.6
  3.|-- 172.17.36.105              0.0%    10    3.2   4.1   3.2   8.3   1.5
  4.|-- 203.113.44.205             0.0%    10    6.5   5.3   3.9   6.7   1.0
  5.|-- 203.113.44.177             0.0%    10    5.4   4.8   4.0   7.2   1.0
  6.|-- 203.113.37.194             0.0%    10    4.6   6.5   3.0  11.1   2.5
  7.|-- in-addr.net                0.0%    10    3.9   4.4   3.1   5.6   0.8
  8.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
  9.|-- pcpd-4001.hinet.net        0.0%    10  355.4 356.9 355.1 365.4   3.0
 10.|-- pcpd-3212.hinet.net        0.0%    10  215.6 216.6 214.2 225.4   3.4
 11.|-- tpdt-3022.hinet.net        0.0%    10  219.4 215.9 214.0 221.5   2.5
 12.|-- tpdt-3012.hinet.net        0.0%    10  218.9 217.2 215.0 218.9   1.4
 13.|-- tpdb-3311.hinet.net        0.0%    10  212.5 212.9 211.9 214.1   0.6
 14.|-- 210-59-204-229.hinet-ip.h  0.0%    10  213.5 212.7 212.0 213.7   0.6
 15.|-- dns.hinet.net              0.0%    10  214.4 214.5 213.7 216.0   0.7

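With that routing, some streaming services that only accept Taiwan IP addresses cannot be used, so I had to find a workaround... The idea is to go through a machine I run on GCP to get back to HiNet, so I needed to figure out how to set up SSH with Socks5 on a Mac.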

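I originally thought I would need a tool like tsocks (i.e. using LD_PRELOAD to intercept connect()), but I unexpectedly found in "SSH through a SOCKS Proxy? (client = OpenSSH OS X)" that the built-in nc can handle it, because nc supports Socks5.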

So it comes down to two ssh commands:

ssh -D 1081 gcp.server
ssh -D 1080 -o "ProxyCommand nc -X 5 -x 127.0.0.1:1081 %h %p" hinet.server

After that, 127.0.0.1:1080 is the Socks5 proxy that exits through HiNet, and the browser can be pointed at it directly.

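As for later remembering that Socks5 was not needed for the first hop and ssh -L would have done the job, and laughing at myself, that is another story :o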