Problems with Machine Learning in the Adult Industry

Bruce Schneier recently brought up several related stories about the privacy issues that come up when machine learning is used in the adult industry: "Technology to Out Sex Workers".

The first is PornHub using machine learning to identify performers and various "other information". The report cited here is TechCrunch's "PornHub uses computer vision to ID actors, acts in its videos":

PornHub is using machine learning algorithms to identify actors in different videos, so as to better index them.

The computer vision system can identify specific actors in scenes and even identifies various positions and… attributes.

The second is the problem of stage names being linked to real identities:

People are worried that it can really identify them, by linking their stage names to their real names.

The last is that Facebook is already capable of doing this, and it has already happened:

Facebook somehow managed to link a sex worker's clients under her fake name to her real profile.

Her sex-work identity is not on the social network at all; for it, she uses a different email address, a different phone number, and a different name. Yet earlier this year, looking at Facebook’s “People You May Know” recommendations, Leila (a name I’m using in place of either of the names she uses) was shocked to see some of her regular sex-work clients.

This issue is somewhat similar to mass surveillance...

Microsoft Also Launches an Image Recognition API

Microsoft has also launched a service similar to the Google Cloud Vision API: "Microsoft Cognitive Services - Computer Vision API".

Microsoft launched three features this time: Analyze an image (similar to Label Detection on Google Cloud), Generate a thumbnail (Google Cloud has no equivalent), and OCR (corresponding to Google Cloud's OCR).

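Microsoft charges USD$1.5 per thousand calls for all of them, while Google's Label Detection costs quite a bit more (starting at USD$5 and dropping to USD$2 at the highest volume). No idea how the recognition quality compares between the two...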

As for OCR, Google starts at $2.5 and drops to $0.6 at the highest volume. The two companies' pricing strategies are interesting to compare.
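To make the comparison concrete, here is a rough sketch of the arithmetic. It assumes Google's prices are also per thousand calls, like Microsoft's. The post only gives Google's starting and highest-volume rates, not the tier boundaries, so this can only bracket Google's bill between "everything at the cheapest rate" and "everything at the starting rate". The function and constant names are made up for illustration.

```python
# Per-thousand-call prices in USD, taken from the figures quoted above.
MICROSOFT_PER_1K = 1.5            # flat rate for all three features
GOOGLE_LABEL_PER_1K = (5.0, 2.0)  # (starting rate, highest-volume rate)
GOOGLE_OCR_PER_1K = (2.5, 0.6)    # (starting rate, highest-volume rate)


def cost_usd(calls, price_per_thousand):
    """Cost of `calls` API calls at a single per-thousand rate."""
    return calls / 1000 * price_per_thousand


def google_cost_bounds(calls, rates):
    """Lower and upper bound on the Google bill.

    The tier boundaries are not given in the post, so the real bill lies
    somewhere between all calls at the cheapest rate and all calls at the
    starting rate.
    """
    starting_rate, cheapest_rate = rates
    return cost_usd(calls, cheapest_rate), cost_usd(calls, starting_rate)


if __name__ == "__main__":
    calls = 100_000
    print("Microsoft:", cost_usd(calls, MICROSOFT_PER_1K))
    print("Google Label Detection (low, high):", google_cost_bounds(calls, GOOGLE_LABEL_PER_1K))
    print("Google OCR (low, high):", google_cost_bounds(calls, GOOGLE_OCR_PER_1K))
```

With these figures, Microsoft's flat rate stays below even Google's cheapest Label Detection rate, while for OCR it sits between Google's starting rate and its highest-volume rate, so which one is cheaper there depends on volume.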

Google Cloud Vision API

Google has launched an image analysis service, the Google Cloud Vision API: "Google Cloud Vision API changes the way applications understand images".

(Image: Cloud Vision API 3)

It analyzes an image and returns tags with corresponding scores. Plenty of applications to play with come to mind right away...
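As a rough sketch of what working with those tags and scores could look like: the snippet below builds a label detection request body and filters the returned labels by score. The field names (`requests`, `features`, `LABEL_DETECTION`, `labelAnnotations`, `description`, `score`) follow the public v1 REST API (`images:annotate`), which may not match the Limited Preview described here. The helper names and the sample response are made up for illustration, and nothing is actually sent over the network.

```python
import base64


def build_label_request(image_bytes, max_results=10):
    """Build a JSON-serializable body for one LABEL_DETECTION request."""
    return {
        "requests": [
            {
                # Image bytes are sent base64-encoded inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def labels_above(response, min_score=0.8):
    """Return (description, score) pairs scoring at least min_score, best first."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    kept = [(a["description"], a["score"]) for a in annotations if a["score"] >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hand-written sample in the v1 response shape; NOT real API output.
SAMPLE_RESPONSE = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "pet", "score": 0.91},
                {"description": "cat", "score": 0.98},
                {"description": "sofa", "score": 0.62},
            ]
        }
    ]
}

if __name__ == "__main__":
    for description, score in labels_above(SAMPLE_RESPONSE):
        print(description, score)
```

Filtering on a score threshold is just one simple way to use the result; the 0.8 cutoff is arbitrary and would need tuning against real images.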

It is free during the current testing period; pricing will be announced later:

There is no cost for usage of the service during the Limited Preview phase. We will introduce pricing in future phases.