New Mexico bans making schoolchildren work off school lunches they can't pay for

The state is outlawing this kind of shaming in disguise: "New Mexico Outlaws School ‘Lunch Shaming’".

In some schools, children are forced to clean cafeteria tables in front of their peers to pay the debt. Other schools require cafeteria workers to take a child’s hot food and throw it in the trash if he doesn’t have the money to pay for it.

The law covers every school that receives subsidies:

On Thursday, Gov. Susana Martinez signed the Hunger-Free Students’ Bill of Rights, which directs schools to work with parents to pay their debts or sign up for federal meal assistance and puts an end to practices meant to embarrass children. It applies to public, private and religious schools that receive federal subsidies for students’ breakfasts and lunches.

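New York City will also ban employers from asking about salary

Last year Massachusetts passed a law banning employers from asking about pay at previous jobs (see "Massachusetts bans asking about salary at a previous job"), and now New York City is joining in: "New York City bans employers from asking potential workers about their past salary".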

New York City joined Massachusetts, Puerto Rico, and Philadelphia in banning employers from asking job applicants about their pay at current or past jobs after the city council passed the measure in a vote on Wednesday.

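CloudFront's third edge location in India

Amazon CloudFront has set up its third edge location in India, in the capital New Delhi. The first two are in Mumbai and Chennai: "New Edge Location in New Delhi, India for Amazon CloudFront and Amazon Route 53".

That said, CloudFront still doesn't work very well in Taiwan: traffic is often routed overseas (regardless of ISP), and compared with other CDNs it still feels lacking... :/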

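A Billion Taxi Rides data analysis series

Mark Litwintschik has recently been publishing a series of data analysis posts on A Billion Taxi Rides.

The posts analyze the same dataset with different tools (and the dataset is large enough for the benchmarks to be meaningful), so they are worth a look for anyone choosing a tool. They also include plenty of command samples, which makes them great material if you want to run the tests yourself...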

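The ideal jobs of young Americans

The New York Times reports that the National Society of High School Scholars asked 18,000 young Americans (ages 15 to 29) about their ideal jobs, and quite a few unexpected results came out: "The New Dream Jobs".

The usual internet companies are on the list, but to the New York Times' surprise, so are the FBI, CIA, and NSA: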

When the National Society of High School Scholars asked 18,000 Americans, ages 15 to 29, to rank their ideal future employers, the results were curious. To nobody’s surprise, Google, Apple and Facebook appeared high on the list, but so did the Central Intelligence Agency, the Federal Bureau of Investigation and the National Security Agency.

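But maybe that's not so surprising? When patriotism is instilled through the education system, isn't this exactly the kind of result you'd expect?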

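Airbnb caught manipulating site data to make its numbers look better

The article "How Airbnb's Data hid the Facts in New York City" presents evidence that Airbnb manipulated the data on its site last November (2015) to make its numbers look better.

In December 2015, Airbnb published "Data on the Airbnb Community in NYC", a PR piece describing Airbnb's various contributions to the New York area.

Airbnb's article says its data was taken on November 17, 2015: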

As of November 17, 2015 there were 35,966 active Airbnb listings in New York.

The report, however, found that the data on Airbnb's site as of November 17, 2015 had been "cleaned up":

A major part of Airbnb's recent data release was a snapshot of New York City listings as of November 17, 2015. This report shows that the snapshot was photoshopped: in the days leading up to November 17, Airbnb ensured a flattering picture by carrying out a one-time targeted purge of more than 1,000 listings. The company then presented November 17 as a typical day in the company’s operations and mis-represented the one-time purge as a historical trend.

And the cleanup targeted only the New York area:

No similar event took place in other cities in North America or elsewhere.

The full analysis is available as a PDF at "how_airbnbs_data_hid_the_facts_in_new_york_city.pdf". It analyzes two separate data sources that confirm each other (data collected by Murray Cox and by Tom Slee).

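New York Public Library releases 180,000 high-resolution digitized items

The New York Public Library has released 180,000 digitized items this time, including historical photographs, maps, and letters: "The New York Public Library Lets You Download 180,000 Images in High Resolution: Historic Photographs, Maps, Letters & More". The library's official announcement is at "Free for All: NYPL Enhances Public Domain Collections For Sharing and Reuse":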

The release of more than 180,000 digitized items represents both a simplification and an enhancement of digital access to a trove of unique and rare materials: a removal of administration fees and processes from public domain content, and also improvements to interfaces — popular and technical — to the digital assets themselves.

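Besides searching and downloading at "NYPL Digital Collections", there is also an API: "The New York Public Library Digital Collections API", and tools are available on GitHub: "Digital Collections Public Domain Item Data and Tools".

These 180,000 items are also completely open, with no need to get permission from the library first: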

No permission required, no hoops to jump through: just go forth and reuse!

Digitizing public domain artifacts makes them easier to share and preserve... (and makes it easier for researchers to get at the material)

A library from the New York Public Library: OCR maps into vector data...

The New York Public Library (NYPL) has put out something interesting: "Map polygon and feature extractor". Its description says:

Like OCR for maps

It takes a scanned map image and turns it into vector data... and it can also output GeoJSON :p
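For anyone who hasn't worked with GeoJSON before, here is a rough sketch of what consuming that kind of output could look like in Python. The sample data below is made up for illustration and is not actual output from the NYPL tool; the helper names (`load_polygons`, `polygon_bounds`) are hypothetical too. It only assumes the output is a standard GeoJSON FeatureCollection of Polygon features.

```python
import json

# Hypothetical sample, NOT real output of the NYPL tool: a GeoJSON
# FeatureCollection holding a single 2x1 rectangle as a Polygon.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"id": 1},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [2, 0], [2, 1], [0, 1], [0, 0]]]
      }
    }
  ]
}
"""


def load_polygons(text):
    """Parse GeoJSON text and keep only the Polygon features."""
    data = json.loads(text)
    return [f for f in data["features"] if f["geometry"]["type"] == "Polygon"]


def polygon_bounds(feature):
    """Return (min_x, min_y, max_x, max_y) of the polygon's outer ring."""
    # In GeoJSON, the first ring of a Polygon is its exterior boundary.
    ring = feature["geometry"]["coordinates"][0]
    xs = [point[0] for point in ring]
    ys = [point[1] for point in ring]
    return (min(xs), min(ys), max(xs), max(ys))


polygons = load_polygons(SAMPLE)
bounds = [polygon_bounds(f) for f in polygons]
```

Since GeoJSON is plain JSON, the standard library is enough to get started; a dedicated GIS library would be the next step for anything beyond simple inspection.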

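This is Open Data work. The New York Public Library is itself the third-largest library in the world and the second-largest in the United States (behind the Library of Congress in first place and the British Library in second). Once this is done, the library's entire map collection can be turned into data for people to reuse (rather than mere "digitization" that only scans paper into image files), and that includes old hand-drawn maps...

The code is mainly written in Python, and RScheme also shows up in the repository... (according to GitHub's statistics)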