
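Microsoft's Time Service is returning the wrong time...

This looks like it will cause a fair amount of damage (for example, SQL Server queries that rely on the server-side time): "Windows Time Service is sending out wrong times and that’s a big problem". The report cites the Reddit thread "PSA: time.windows.com NTP server seems to be sending out wrong time".

To guard against this kind of thing, different organizations take different approaches. Google, with its deep pockets, built its own atomic clocks and also made Google Public NTP available to everyone, so it can keep its own time correct without depending on external equipment.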

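Others use a Raspberry Pi to receive GPS signals and turn them into an NTP service (as described in "The Raspberry Pi as a Stratum-1 NTP Server"). That said, GPS itself once broadcast a time that was off by 13 microseconds, so it isn't completely reliable either (though by comparison it's probably still acceptable): "GPS error caused '12 hours of problems' for companies". Another possible option is GLONASS (the Russian system).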
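One cheap sanity check against a single bad upstream is to ask several NTP servers and compare the answers. The following is only a minimal sketch: it builds an SNTP client request by hand and decodes the server's transmit timestamp. The helper names are my own, and it ignores network round-trip delay, so it is only good for catching gross errors like the one above, not for real time synchronization.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800


def build_request():
    """48-byte SNTP request: LI=0, version=3, mode=3 (client)."""
    return bytes([0x1B]) + bytes(47)


def parse_transmit_time(packet):
    """Return the server's transmit timestamp as Unix seconds (float)."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    # The transmit timestamp is the last 8 bytes of the 48-byte header:
    # 32-bit seconds followed by a 32-bit binary fraction, big-endian.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def query(server, timeout=2.0):
    """Ask one server for the time; returns Unix seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)


def report(servers):
    """Print each server's offset from the local clock (needs network)."""
    for server in servers:
        try:
            offset = query(server) - time.time()
            print(f"{server}: {offset:+.3f} s")
        except (OSError, ValueError) as exc:
            print(f"{server}: failed ({exc})")
```

Calling something like `report(["time.windows.com", "time.google.com", "pool.ntp.org"])` prints one offset per server; if one of them disagrees with the rest by seconds, that is the one to distrust.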

Maybe someday I'll need to run my own...

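Getting a centrifuge's effect with nothing but human power...

The design in "This Human-Powered Paper Centrifuge Is Pure Genius" is really clever... The full paper is published in Nature Biomedical Engineering: "Hand-powered ultralow-cost paper centrifuge".

The idea comes from a childhood toy (I remember it too, but I've forgotten what it's called in Chinese...):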

Here, we report an ultralow-cost (20 cents), lightweight (2 g), human-powered paper centrifuge (which we name ‘paperfuge’) designed on the basis of a theoretical model inspired by the fundamental mechanics of an ancient whirligig (or buzzer toy; 3,300 BC).

Their research found that the spin speed can reach 125,000 rpm:

The paperfuge achieves speeds of 125,000 r.p.m. (and equivalent centrifugal forces of 30,000 g), with theoretical limits predicting 1,000,000 r.p.m.
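To get a feel for those numbers: relative centrifugal force is just the centripetal acceleration ω²r expressed in multiples of g. Here is a quick sketch of the arithmetic. The function names are mine, and the radius is derived from the two quoted figures on the assumption that they describe the same spin, which I have not checked against the paper.

```python
import math

G0 = 9.80665  # standard gravity, m/s^2


def rcf(rpm, radius_m):
    """Relative centrifugal force (in multiples of g) at a speed and radius."""
    omega = 2 * math.pi * rpm / 60  # angular speed in rad/s
    return omega ** 2 * radius_m / G0


def radius_for(rpm, g_force):
    """Radius (m) at which `rpm` produces `g_force` times gravity."""
    omega = 2 * math.pi * rpm / 60
    return g_force * G0 / omega ** 2


r = radius_for(125_000, 30_000)
print(f"implied radius: {r * 1000:.2f} mm")
```

Under that assumption the sample would sit roughly 1.7 mm from the axis. Since the force grows with the square of the speed, doubling the rpm at the same radius quadruples the g-force.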

For regions that can't afford expensive medical equipment, this provides a simple yet quite effective centrifuge for running tests...

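The Galileo system goes live

The EU-led Galileo system has been declared live, offering early services (Early Operational Capability): "Galileo navigation satellite system goes live". Of the planned 30 satellites, 18 have already been launched: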

At this point, 18 of the planned 30 satellites are already in orbit.

In ordinary use its precision can reach 4 meters, considerably better than the 15 meters of GPS:

Using GPS, private users can navigate with a precision of up to 15 meters (m). Galileo offers a precision of up to 4m for its fully open service.

Commercial users and official government services can get precision down to a few centimeters:

Commercial users and official government services can even receive a precision of a few centimeters. This is important, for example, for fully or partially automated planes, cars or ships.

Devices that support both systems at once should show up later... phones probably will too?

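Playing with ggplot in Python

In "A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair)" I came across ggplot for Python again. I assumed it would be easy to install, but in practice it turned out to be a bit of a pain XD

My usual environment is Python 3.5.2 running under pyenv. ggplot needs the _tkinter module, which should be built into Python 3... as long as you installed tk-dev before building Python @_@

So after fiddling around for ages before discovering this, I installed tk-dev and then reinstalled Python 3.5.2: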

$ sudo apt-get install tk-dev
$ pyenv install -f 3.5.2
$ pip install -U ggplot
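A quick way to confirm that the rebuilt interpreter actually picked up Tk, before blaming ggplot, is to check whether the `_tkinter` extension module can be found. A minimal sketch (the helper name is mine):

```python
import importlib.util


def has_tkinter():
    """True if this Python build includes the _tkinter extension module."""
    return importlib.util.find_spec("_tkinter") is not None


if __name__ == "__main__":
    print("_tkinter available:", has_tkinter())
```

If this prints False right after `pyenv install`, the build most likely ran without the Tk headers and needs to be redone once tk-dev is in place.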

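Once it was installed, I found that the usage commonly suggested online didn't seem to work. After poking around some more, I found that it has become object-oriented, so you now save the plot to a file like this: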

p = ggplot(...) + ...
p.save('a.png')

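Also, the data object has to be created as a DataFrame. In short, there are quite a few fiddly details you need to understand before you know how to use it @_@

Anyway, the code can be found in population-taiwan.py, and the population data was pulled from the Chinese Wikipedia article "臺灣人口普查" (Taiwan census). The final image looks like this:

Just a small first try... By the way, theme_xkcd() looks pretty good XDDD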

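Scanning a book's contents without opening it

MIT Media Lab has made something fun that can scan the contents of a book without opening it: "Can computers read through a book page by page without opening it?". The paper's title is "Terahertz time-gated spectral imaging for content extraction through layered structures".

It scans with electromagnetic waves from 100 GHz to 3 THz: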

In our new study we explore a range of frequencies from 100 Gigahertz to 3 Terahertz (THz) which can penetrate through paper and many other materials.

There were similar earlier approaches using X-rays or ultrasound, but neither works well:

Can’t X-ray or ultrasound do this? It may seem that X-ray or ultrasound can also image through a book; however, such techniques lack the contrast of our THz approach for submicron pen or pencil layers compared next to blank paper. These methods have additional drawbacks like cost and ionizing radiation. So while you might be able to hardly detect pages of a closed book if you use a CT scan, you will not be able to see the text. Ultrasound does not have the resolution to detect 20 micron gaps in between the pages of a closed book -distinguishing the ink layers from the blank paper is out of the question for ultrasound. Based on the paper absorption spectrum, we believe that far infrared time resolved systems and THz time domain systems might be the only suitable candidates for investigating paper stacks page by page.
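For a sense of scale, here is the back-of-the-envelope arithmetic behind those numbers: the free-space wavelength at both ends of the band, and how far apart in time the reflections from two neighbouring pages would arrive. This is a sketch under my own simplifying assumptions: the 20 micron gap is treated as air (refractive index 1) and the thickness of the paper itself is ignored.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz):
    """Free-space wavelength for a given frequency."""
    return C / freq_hz


def round_trip_delay_s(gap_m, refractive_index=1.0):
    """Extra time for a pulse to cross a gap and come back."""
    return 2 * gap_m * refractive_index / C


for label, freq in (("100 GHz", 100e9), ("3 THz", 3e12)):
    print(f"{label}: {wavelength_m(freq) * 1e6:.0f} um")

delay = round_trip_delay_s(20e-6)
print(f"20 um gap: {delay * 1e12:.3f} ps round trip")
```

The wavelengths run from about 3 mm down to about 0.1 mm, and the round trip across a 20 micron air gap takes roughly 0.13 picoseconds, which gives an idea of why the "time-gated" part of the title matters.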

I don't know how far this can go. For now it is only at the "legible" level, and the quality still doesn't look good enough:

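fastText, open-sourced by Facebook

A text classification tool that keeps accuracy at the same level but is several orders of magnitude faster: "FAIR open-sources fastText".

You can see the gap in speed between fastText and other methods: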

Our experiments show that fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation.

Besides the open-source release, they also published a paper, "Enriching Word Vectors with Subword Information". While reading the abstract I noticed that it mentions skip-gram:

In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams.

Then, while looking things up, I found that I had written a post called "Skip-gram" myself a while back XDDD
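The "bag of character n-grams" idea from the abstract can be sketched in a few lines. This is my own illustration, not fastText's code: the word is wrapped in `<` and `>` boundary markers, every substring of length n_min to n_max is collected, and the whole padded word is kept as one extra feature. The default range of 3 to 6 is what I recall from the paper; the real implementation also hashes the n-grams into buckets, which is left out here.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of `word`, padded with '<' and '>' boundary markers.

    The whole padded word is also included as its own feature.
    """
    padded = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    if padded not in grams:
        grams.append(padded)
    return grams


print(char_ngrams("where", 3, 3))
# ['<wh', 'whe', 'her', 'ere', 're>', '<where>']
```

In the model, a word's vector is then built from the vectors of these pieces, which is how it can say something about words it never saw during training.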

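Another post about document scanning...

In "Page dewarping" I came across a write-up of document-scanning techniques along with open-source code. Set against the timing of the previously mentioned "Dropbox's document scanning feature" and "Dropbox's Document Detecting" posts, there's a faint sense of mischief to it XD

The author ended up writing this for their fiancée. The original script was something the author ran by hand whenever students sent in homework as photos; later the fiancée started asking for it too, and once she came back with photos warped by curled pages, the author decided to handle that automatically: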

A while back, I wrote a script to create PDFs from photos of hand-written text. It was nothing special – just adaptive thresholding and combining multiple images into a PDF – but it came in handy whenever a student emailed me their homework as a pile of JPEGs. After I demoed the program to my fiancée, she ended up asking me to run it from time to time on photos of archival documents for her linguistics research. This summer, she came back from the library with a number of images where the text was significantly warped due to curled pages.

So I decided to write a program that automatically turns pictures like the one on the left below to the one on the right:

All the code can be found on GitHub: "Text page dewarping using a "cubic sheet" model". It feels like it's going head-to-head with Dropbox XDDD
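The "adaptive thresholding" that the author mentions for the earlier script is easy to illustrate. To be clear, this is not the author's code (that lives in the GitHub repo); it is a naive pure-Python sketch of mean-based adaptive thresholding, with parameter names and defaults that I made up. Each pixel is compared against the average of its own neighbourhood instead of one global cutoff.

```python
def adaptive_threshold(gray, radius=1, offset=10):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    A pixel becomes 0 (ink) if it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood minus `offset`,
    otherwise 255 (paper).
    """
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clamp the window to the image borders.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            row.append(0 if gray[y][x] < local_mean - offset else 255)
        out.append(row)
    return out


# A dimly lit "page" (all 100) with one dark ink pixel in the middle.
dim_page = [[100, 100, 100], [100, 20, 100], [100, 100, 100]]
print(adaptive_threshold(dim_page))
```

A global cutoff of 128 would turn this whole dim page black, while the local comparison still picks out only the ink pixel. That is the point of doing it adaptively for photos taken under uneven lighting.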

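Dropbox's document scanning feature

This is a sequel of sorts to the Dropbox post I covered in "Dropbox's Document Detecting". Having located the document, this one goes on to color correction: "Fast Document Rectification and Enhancement".

The question is how to turn the original image on the left into the image on the right, which involves both coordinate transformation and color correction:

The color correction part brings up this very famous picture. In the image, tiles A and B are the same color, but the corrected output has to match what the human brain perceives: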

Here’s a great illustration of this “illusion,” in which the two tiles marked A and B have the same pixel values, but appear to be very different:

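Here is the final result: the original on the left, a version in the middle with the background turned white and the other colors preserved, and a version on the right that tries to correct the colors: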

Left: the original image. Middle: an enhanced image, with the background becoming white and the foreground preserved in exact R, G, B values. Note that the colors appear faded. Right: an enhanced image that tries to correct for the perceptual discrepancy.

This is presumably an internal Dropbox project, and one with plenty of math to play with...
