另外一篇講文件掃描的...

在「Page dewarping」這篇看到講文件掃描的技術,以及 open source 的程式,對比之前提到的「Dropbox 的文件掃描功能」與「Dropbox 的 Document Detecting」的時間點,有種淡淡的惡意 XD

這篇作者是為了未婚妻的需求而寫出來的,本來是作者收到學生的作業時手動在跑,後來未婚妻也拿去用,但量愈來愈大,決定自動化處理:

A while back, I wrote a script to create PDFs from photos of hand-written text. It was nothing special – just adaptive thresholding and combining multiple images into a PDF – but it came in handy whenever a student emailed me their homework as a pile of JPEGs. After I demoed the program to my fiancée, she ended up asking me to run it from time to time on photos of archival documents for her linguistics research. This summer, she came back from the library with a number of images where the text was significantly warped due to curled pages.

So I decided to write a program that automatically turns pictures like the one on the left below to the one on the right:

程式都可以在 GitHub 上翻到:「Text page dewarping using a "cubic sheet" model」。跟 Dropbox 互別苗頭的感覺 XDDD