A Grammar Correction bug in Google Docs

I just saw a fun bug on Hacker News: typing And. And. And. And. And. into Google Docs triggers an error: "Including “And. And. And. And. And.” in a Google doc causes it to crash". The original bug report is at "Including "And. And. And. And. And." in a Google doc causes it to crash.", which also shows the error message.

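The Hacker News discussion mentions that the grammar check feature has to be turned on, and it looks like the crash happens whenever the same capitalized word is repeated five times, for example Also, Therefore, And, Anyway, But, Who, Why: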

Also, Therefore, And, Anyway, But, Who, Why.

Each in caps 5 times with the same word with a period and space after each word and newline at the end is what I have found so far.

Can anyone find others?

Edit: added words that work found in other comments
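
To make the description in that comment concrete, here is a minimal sketch of a checker for the reported pattern. The regex is my own reading of the comment (one capitalized word, five times, each followed by a period); it is not Google's actual trigger condition, and `looks_like_trigger` is a hypothetical helper name.

```python
import re

# Assumed pattern, taken only from the comment's description: one capitalized
# word, repeated five times, each followed by a period, separated by single
# spaces, with an optional trailing space before the newline.
TRIGGER = re.compile(r"([A-Z][a-z]+)\.(?: \1\.){4} ?")


def looks_like_trigger(line: str) -> bool:
    """Return True if the line matches the pattern described in the comment."""
    return TRIGGER.fullmatch(line.rstrip("\n")) is not None


if __name__ == "__main__":
    samples = [
        "And. And. And. And. And.",
        "Why. Why. Why. Why. Why. \n",
        "and. and. and. and. and.",
        "And. But. And. But. And.",
    ]
    for sample in samples:
        print(repr(sample), looks_like_trigger(sample))
```

This only tells you whether a line fits the description, so it could be used to scan a document for risky lines; whether a given word actually crashes Docs is something only the real grammar checker decides.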

A really fun bug XDDD And it is currently #1 on the Hacker News front page...

Facebook's new algorithm for dealing with misspellings

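Facebook had already published fastText earlier, and at the beginning of this month they published another algorithm, Misspelling Oblivious Embeddings (MOE), which builds on the original fastText and improves on it: "A new model for word embeddings that are resilient to misspellings".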

Facebook's write-up says that on user-generated text, MOE performs better than fastText:

We checked the effectiveness of this approach considering different intrinsic and extrinsic tasks, and found that MOE outperforms fastText for user-generated text.

The paper is on arXiv: "Misspelling Oblivious Word Embeddings".

According to the introduction, fastText focuses on the semantic loss, while MOE adds a spell correction loss on top of it:

The loss function of fastText aims to more closely embed words that occur in the same context. We call this semantic loss. In addition to the semantic loss, MOE also considers an additional supervised loss that we call spell correction loss. The spell correction loss aims to embed misspellings close to their correct versions by minimizing the weighted sum of semantic loss and spell correction loss.

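However, the facebookresearch/moe repository on GitHub currently contains only the dataset. The implementation has not been open-sourced for people to use directly, so you may have to write it yourself...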
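
If you do end up writing it yourself, the "weighted sum of semantic loss and spell correction loss" from the quote could be sketched roughly like this. Everything here is an assumption for illustration: the mixing weight `alpha`, the logistic loss on dot products, and the toy 2-D vectors are mine, and the paper's exact formulation (subword n-grams, negative sampling, normalization) is not reproduced.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def logistic_loss(score):
    """Loss for a pair that should score high: log(1 + exp(-score))."""
    return math.log1p(math.exp(-score))


def moe_style_loss(semantic_loss, spell_loss, alpha):
    """Weighted sum of the two losses; alpha is a hypothetical weight in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * semantic_loss + alpha * spell_loss


# Toy embeddings, made up for this example.
word = [0.9, 0.1]         # e.g. "friend"
context = [0.8, 0.3]      # a word seen near "friend"
misspelling = [0.2, 0.7]  # e.g. "freind", not yet close to "friend"

semantic = logistic_loss(dot(word, context))   # fastText-style term
spell = logistic_loss(dot(misspelling, word))  # pulls the misspelling toward the word
total = moe_style_loss(semantic, spell, alpha=0.5)
print(f"semantic={semantic:.3f} spell={spell:.3f} total={total:.3f}")
```

With `alpha=0` this falls back to the semantic loss alone, which is the fastText-only case; raising `alpha` puts more pressure on pulling misspellings toward their correct spellings.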

NASA beams an image of the Mona Lisa to the Moon with a laser...

NASA used a laser to send a grayscale Mona Lisa to LOLA in lunar orbit (Lunar Orbiter Laser Altimeter, which looks like an instrument for mapping the surface?): "NASA Beams Mona Lisa to Lunar Reconnaissance Orbiter at the Moon".

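The transfer rate was 300 bits/sec (a number that really warms the heart XD), and the image sent was 152x200 pixels with 4096 gray levels. One goal of this test was to understand how Earth's atmosphere affects the optical signal. You can see the raw data on the left and the result after correction with a 2/3 RS code on the right: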

To clean up transmission errors introduced by Earth's atmosphere (left), Goddard scientists applied Reed-Solomon error correction (right), which is commonly used in CDs and DVDs. Typical errors include missing pixels (white) and false signals (black). The white stripe indicates a brief period when transmission was paused.
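
For a rough sense of what 300 bits/sec means here, a back-of-the-envelope estimate follows. It assumes each pixel is sent as a plain 12-bit value (4096 levels) with no framing or error-correction overhead, which the article does not confirm, so treat the result as a lower bound rather than the actual transmission time.

```python
WIDTH, HEIGHT = 152, 200  # image size in pixels
GRAY_LEVELS = 4096        # shades of gray per pixel
BITS_PER_SEC = 300        # reported data rate


def bits_per_pixel(levels: int) -> int:
    """Smallest number of bits that can encode `levels` distinct values."""
    return (levels - 1).bit_length()


def transfer_seconds(width: int, height: int, levels: int, bits_per_sec: int) -> float:
    """Time to send the raw image, assuming no coding or framing overhead."""
    total_bits = width * height * bits_per_pixel(levels)
    return total_bits / bits_per_sec


seconds = transfer_seconds(WIDTH, HEIGHT, GRAY_LEVELS, BITS_PER_SEC)
print(f"{bits_per_pixel(GRAY_LEVELS)} bits/pixel, "
      f"{seconds:.0f} s, about {seconds / 60:.1f} min")
```

Under these assumptions the raw image is 364,800 bits, so one pass takes about 20 minutes. Any redundancy added by the Reed-Solomon code would make it longer.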

It doesn't seem to mention how big a laser they used to hit the Moon, though... (a laser cannon?)