Home » Posts tagged "information"

Trac 關票時的資訊...

Trac 關票時一般都是由開票的人關 (i.e. Reporter),今天遇到的情況是這樣,在關票時的畫面上你會不知道是誰開的:

只要往上一點拉就會看到 Reporter 資訊 XD 這通常會發生在螢幕沒打直的人,直接拉到最下方的情況 XD

試著用 CSS 的 position: sticky 但抓不太到要的效果,決定還是開 jQuery 下去解決 XD

        // Copy reporter information to action section
        if ($('#action_resolve_resolve_resolution').length > 0) {
            var reporter = $('#h_reporter').parent().find('.trac-author, .trac-author-user')[0].outerHTML;

            $('#action_resolve_resolve_resolution').each(function(){
                $(this).parent().append('<span class="hint">(This ticket is created by ' + reporter + ')');
            });
        }

當下寫得很隨性,之後也許會再改... (會更新到「Trac - Gea-Suan Lin's Wiki」這邊)

Seam Carving (接縫裁剪)

看到有人實做 Seam Carving (接縫裁剪) 了,用 Golang 寫的,放在 GitHubesimov/caire 這邊,副標題「Content aware image resize library」。實做了「Seam Carving for Content-Aware Image Resizing」這篇論文。

Seam Carving 指的是知道內容的 resize,像是把上面這張變成下面這張:

或是變大:

馬上可以想到的應用是需要保留資訊內容,但又想要大量提供資訊的地方,像是 Nuzzle 的縮圖 (或是以前的 Zite),或是網路新聞媒體的首頁所用的縮圖。不知道還有沒有其他地方可以用...

Tinder 的漏洞

在「Tinder's Lack of Encryption Lets Strangers Spy on Your Swipes」這篇看到 Tinder 的實作問題,主要是兩個問題組合起來。第一個是圖片沒有用 HTTPS 傳輸:

On Tuesday, researchers at Tel Aviv-based app security firm Checkmarx demonstrated that Tinder still lacks basic HTTPS encryption for photos.

第二個是透過 side channel information leaking:即使是 HTTPS 傳輸,還是可以透過 size 知道是哪種類型的操作:

Tinder represents a swipe left to reject a potential date, for instance, in 278 bytes. A swipe right is represented as 374 bytes, and a match rings up at 581.

這兩個資訊就可以把行為組出來了。

接下來應該會先把圖片上 HTTPS 吧?然後再來是把各種類型的 packet 都塞大包一點?

各家 Session Replay 服務對個資的處理

Session Replay 指的是重播將使用者的行為錄下來重播,市面上有很多這樣的服務,像是 User Replay 或是 SessionCam

這篇文章就是在討論這些服務在處理個資時的方式,像是信用卡卡號的內容,或是密碼的內容,這些不應該被記錄下來的資料是怎麼被處理的:「No boundaries: Exfiltration of personal data by session-replay scripts」,主要的重點在這張圖:

後面有提到目前防禦的情況,看起來目前用 adblock 類的軟體可以擋掉一些服務,但不是全部的都在列表裡。而 DNT 則是裝飾品沒人鳥過:

Two commonly used ad-blocking lists EasyList and EasyPrivacy do not block FullStory, Smartlook, or UserReplay scripts. EasyPrivacy has filter rules that block Yandex, Hotjar, ClickTale and SessionCam.

At least one of the five companies we studied (UserReplay) allows publishers to disable data collection from users who have Do Not Track (DNT) set in their browsers. We scanned the configuration settings of the Alexa top 1 million publishers using UserReplay on their homepages, and found that none of them chose to honor the DNT signal.

Improving user experience is a critical task for publishers. However it shouldn’t come at the expense of user privacy.

美國政府暗中介入好萊塢的劇本,影響大眾對戰爭的看法

透過 Freedom of Information Act (FOIA) 取得的資料顯示美國政府 (包括了五角大廈、CIA、NSA) 如何介入好萊塢,影響大眾對於戰爭的看法:「EXCLUSIVE: Documents expose how Hollywood promotes war on behalf of the Pentagon, CIA and NSA」。

灰標「US military intelligence agencies have influenced over 1,800 movies and TV shows」可以看出影響的層面。

The documents reveal for the first time the vast scale of US government control in Hollywood, including the ability to manipulate scripts or even prevent films too critical of the Pentagon from being made — not to mention influencing some of the most popular film franchises in recent years.

從很意想不到的地方介入... 引用其中一個說明:


Jon Voight in Transformers — in this scene, just after American troops have been attacked by a Decepticon robot, Pentagon Hollywood liaison Phil Strub inserted the line ‘Bring em home’, granting the military a protective, paternalistic quality, when in reality the DOD does quite the opposite.

又是 ImageMagick 出包...

ImageMagick 的 information leaking,然後 Yahoo! 很無奈的中獎,所以被稱為 Yahoobleed:「Yahoo! retires! bleeding! ImageMagick! to! kill! 0-day! vulnerability!」。發現問題的作者把問題寫在「*bleed continues: 18 byte file, $14k bounty, for leaking private Yahoo! Mail images」這邊。

作者利用 ImageMagick 的不當處理,取得 uninitialized memory 的資訊,藉以取得可能是上次轉檔的記憶體內容。而這個 jpeg 只有 18bytes (所以作者戲稱每個 byte 價值 USD$778):

A robust bounty of $14,000 was issued (for this combined with a similar issue, to be documented separately). $778 per byte -- lol!

目前的 workaround 也很簡單 (官方採用了),呼叫 ResetMagickMemory 避免 leaking (咦,這方法好像哪邊怪怪的):「Reset memory for RLE decoder (patch provided by scarybeasts)」。

利用 Side-channel 資訊判斷被 HTTPS 保護的 Netflix 影片資訊

在「Netflix found to leak information on HTTPS-protected videos」這篇看到了研究員透過 VBR 所透露出的 side channel 資訊,成功的取得了被 HTTPS 保護的 Netflix 影片資訊。這對於美國的 ISP 是個大利多 (加上之前通過的法案),但對於個人隱私則是嚴重的打擊。

這項研究的準確率非常高:

To support our analysis, we created a fingerprint database comprised of 42,027 Netflix videos. Given this collection of fingerprints, we show that our system can differentiate between videos with greater than 99.99% accuracy. Moreover, when tested against 200 random 20-minute video streams, our system identified 99.5% of the videos with the majority of the identifications occurring less than two and a half minutes into the video stream.

而且他們居然是用這樣的單機分析:

null

苦啊...

Facebook 開源的 fastText

準確度維持在同一個水準上,但是速度卻快了 n 個數量級的 text classification 工具:「FAIR open-sources fastText」。

可以看到 fastText 的執行速度跟其他方法的差距:

Our experiments show that fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation.

除了 open source 外,也發表了論文:「Enriching Word Vectors with Subword Information」,看 abstract 的時候發現提到了 Skip-gram:

In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams.

結果找資料發現自己以前寫過「Skip-gram」這篇 XDDD

Archives