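Suppose a news report leaks some background details about a victim, so you now have two pieces of data:
- Photos of the forty people who match those background details.
- The news photo, in which the face has been pixelated.
The question is how to work out which person is in the news photo: "Defeating Image Obfuscation with Deep Learning".
In an experiment along these lines, picking the right person out of 40 candidates reached 50% accuracy.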

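50% may not be good enough for practical use, but random guessing among 40 candidates would be right only 2.5% of the time (1/40), so this shows that Big Brother's technology is already being developed...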
Doing bad things is the greatest driver of progress
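I saw this approach on Hacker News Daily. The author used machine learning to try to find out which factors were making him gain weight, and then planned his weight loss around the results: "Discovering ketosis: how to effectively lose weight". The article is a bit long, so here are the key points.
First, the author logged his weight and behavior every day, like this: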
    #
    # -- Comment lines (ignored)
    #
    Date,MorningWeight,YesterdayFactors
    2012-06-10,185.0,
    2012-06-11,182.6,salad sleep bacon cheese tea halfnhalf icecream
    2012-06-12,181.0,sleep egg
    2012-06-13,183.6,mottsfruitsnack:2 pizza:0.5 bread:0.5 date:3 dietsnapple splenda milk nosleep
    2012-06-14,183.6,coffeecandy:2 egg mayo cheese:2 rice meat bread:0.5 peanut:0.4
    2012-06-15,183.4,meat sugarlesscandy salad cherry:4 bread:0 dietsnapple:0.5 egg mayo oliveoil
    2012-06-16,183.6,caprise bread grape:0.2 pasadena sugaryogurt dietsnapple:0.5 peanut:0.4 hotdog
    2012-06-17,182.6,grape meat pistachio:5 peanut:5 cheese sorbet:5 orangejuice:2
    # and so on ...
At that point he was only keeping records, not deliberately dieting:
I was not dieting at that time. Just collecting data.
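As a rough illustration of how a log in this format could be read, here is a minimal Python sketch. It is not the author's code. The function names (`parse_factors`, `parse_log`, `daily_changes`) are made up for this example, and it assumes that a factor without a `:amount` suffix counts as 1.0, which the excerpt does not state.

```python
import csv

# A few rows copied from the log excerpt above.
SAMPLE = """\
#
# -- Comment lines (ignored)
#
Date,MorningWeight,YesterdayFactors
2012-06-10,185.0,
2012-06-11,182.6,salad sleep bacon cheese tea halfnhalf icecream
2012-06-12,181.0,sleep egg
2012-06-13,183.6,mottsfruitsnack:2 pizza:0.5 bread:0.5 date:3 dietsnapple splenda milk nosleep
"""


def parse_factors(field):
    """Turn 'pizza:0.5 milk' into {'pizza': 0.5, 'milk': 1.0}.

    Assumption: a factor with no ':amount' suffix has amount 1.0.
    """
    factors = {}
    for token in field.split():
        name, _, amount = token.partition(":")
        factors[name] = float(amount) if amount else 1.0
    return factors


def parse_log(text):
    """Return a list of (date, morning_weight, factors) tuples."""
    # Drop blank lines and '#' comment lines before handing rows to csv.
    rows = [line for line in text.splitlines()
            if line.strip() and not line.startswith("#")]
    reader = csv.DictReader(rows)
    return [
        (row["Date"], float(row["MorningWeight"]),
         parse_factors(row["YesterdayFactors"] or ""))
        for row in reader
    ]


def daily_changes(entries):
    """Pair each day's weight change with that row's 'yesterday' factors."""
    return [
        (cur[0], round(cur[1] - prev[1], 1), cur[2])
        for prev, cur in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    for date, delta, factors in daily_changes(parse_log(SAMPLE)):
        print(date, delta, sorted(factors))
```

The (weight change, factors) pairs are the kind of input a learner could then use to rank factors, though which tool the article used for that step is not covered here.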
The rest is running the analysis to pull out which behaviors helped the most, which is where the article's chart comes from.
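Humble Bundle explains how they fight credit card fraud. The main idea is to keep lowering the risk while reducing how often a human has to step in (because staff time is expensive): "How Humble Bundle stops online fraud".
The first point is the one I especially want to mention: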
Our first line of defense is a machine-learning-based anti-abuse startup called Sift Science, which we’ve been training for years across 55,000,000 transactions. Given how many orders we process, Sift Science has a really good idea when someone is up to no good. The model adapts daily as we get more data.
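I wrote about Sift Science back in 2014: "A service for detecting whether a credit card transaction is fraudulent". What it does is simple. You send Sift Science a large amount of data, including various pieces of user identity information and credit card data, and Sift Science uses machine learning to tell you how risky a transaction is, so you can decide what to do next.
Quite a few companies offer similar services, such as minFraud from MaxMind (another product from the company best known for its GeoIP database). At high transaction volumes this is an interesting application, because it lowers the cost of cleaning up after fraudulent charges.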
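To make the "let a human step in less often" idea concrete, here is a minimal Python sketch of routing orders by a risk score. It is purely hypothetical: it does not use the Sift Science or minFraud APIs, the 0.0 to 1.0 score range is an assumption, and the thresholds `accept_below` and `review_below` are invented values for illustration.

```python
def route_order(risk_score, accept_below=0.2, review_below=0.8):
    """Decide what to do with an order given a risk score in [0.0, 1.0].

    Hypothetical thresholds: low scores pass automatically, high scores are
    rejected automatically, and only the middle band costs human time.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0.0 and 1.0")
    if risk_score < accept_below:
        return "accept"
    if risk_score < review_below:
        return "manual_review"
    return "reject"


def manual_review_share(scores, **thresholds):
    """Fraction of orders that still need a person to look at them."""
    decisions = [route_order(score, **thresholds) for score in scores]
    return decisions.count("manual_review") / len(decisions)


if __name__ == "__main__":
    scores = [0.05, 0.5, 0.95, 0.1]
    print([route_order(s) for s in scores])
    print(manual_review_share(scores))
```

Narrowing the middle band sends fewer orders to a person, at the price of more automatic mistakes in either direction.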
On Bruce Schneier's blog I saw the post "Facebook Using Physical Location to Suggest Friends", which cites the report "Facebook is using your phone’s location to suggest new friends—which could be a privacy disaster". The report opens with an update:
Update (June 28): After twice confirming it used location to suggest new friends, Facebook now says it doesn’t currently use “location data, such as device location and location information you add to your profile, to suggest people you may know.” The company says it ran a brief test using location last year. New story here.
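After the second confirmation with Facebook, what followed was the classic "Oh crap, this is a PR disaster" way of handling things. When the reporter first checked with Facebook, the spokesperson's official reply said that phone location is one of the factors in the calculation: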
“People You May Know are people on Facebook that you might know,” a Facebook spokesperson said. “We show you people based on mutual friends, work and education information, networks you’re part of, contacts you’ve imported and many other factors.”
One of those factors is smartphone location. A Facebook spokesperson said though that shared location alone would not result in a friend suggestion, saying that the two parents must have had something else in common, such as overlapping networks.
“Location information by itself doesn’t indicate that two people might be friends,” said the Facebook spokesperson. “That’s why location is only one of the factors we use to suggest people you may know.”
Damn...
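Google Compute Engine has launched machine types where you set the CPU and RAM yourself: "Custom Machine Types - Compute Engine — Google Cloud Platform".
You can go from 1 vCPU up to 32 vCPUs, and memory tops out at 6.5 GB times the number of vCPUs, so the theoretical maximum would be 208 GB (6.5 GB × 32)?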
Create a machine type with as little as 1 vCPU and up to 32 vCPUs, or any even number of vCPUs in between. Memory can be configured up to 6.5 GB of RAM per vCPU.
Pricing is one charge for vCPUs plus one charge for memory. I remember some smaller cloud services used to offer similar pricing, but they have all since shut down...
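The limits quoted above are easy to check with a small Python sketch. The function names are made up for this example, and the rates passed to `hourly_cost` are placeholders rather than Google's actual prices; only the 1 to 32 vCPU rule and the 6.5 GB per vCPU cap come from the quote.

```python
MAX_VCPUS = 32         # from the quoted announcement
MAX_GB_PER_VCPU = 6.5  # from the quoted announcement


def is_valid_vcpu_count(vcpus):
    """1 vCPU, or any even number of vCPUs up to 32."""
    return vcpus == 1 or (2 <= vcpus <= MAX_VCPUS and vcpus % 2 == 0)


def max_memory_gb(vcpus):
    """Largest memory size allowed for a given vCPU count."""
    if not is_valid_vcpu_count(vcpus):
        raise ValueError(f"unsupported vCPU count: {vcpus}")
    return vcpus * MAX_GB_PER_VCPU


def hourly_cost(vcpus, memory_gb, price_per_vcpu_hour, price_per_gb_hour):
    """One charge for vCPUs plus one charge for memory.

    The two rates are caller-supplied placeholders, not real prices.
    """
    if memory_gb > max_memory_gb(vcpus):
        raise ValueError("memory exceeds 6.5 GB per vCPU")
    return vcpus * price_per_vcpu_hour + memory_gb * price_per_gb_hour


if __name__ == "__main__":
    print(max_memory_gb(32))             # 6.5 GB * 32 vCPUs
    print(hourly_cost(2, 4, 1.0, 0.5))   # placeholder rates
```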
I saw this in "The International Space Station (Finally) Gets an Espresso Machine"; the original report is "The International Space Station (finally!) gets an espresso machine".
A few highlights XDDD
The ISSpresso requires 120V DC power which is obtained at the Utility Outlet Panel (UOP) on the ISS.
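That is an unusual power supply: 120V DC rather than 120V AC...
Someone benchmarked VMware and VirtualBox: "Comparing Filesystem Performance in Virtual Machines", with four charts.
The numbers look odd, though. What is going on with NFS? XDDD
Take it with a grain of salt, and run your own tests if you actually end up using these.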
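In "Recommender system courses..." I recommended a course from fall 2013, but while tidying up old posts I found another course that has already finished, this one on machine learning in general (not only recommendation systems).
It is a CMU course, "Introduction to Machine Learning", and every lecture has PDF slides and a recording available.
Compared with the recommender system course, the machine learning course leans more toward theory and covers more ground, while the recommender system course is more applied.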