Taking a more aggressive approach to characters users might confuse

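A while back I wrote "The UX of UUIDs", which covered how the human eye mixes up 0Oo and 1IiLl. This article goes further in trying to avoid that kind of problem: "Understanding and avoiding visually ambiguous characters in IDs". The corresponding discussion is at "Understanding and avoiding visually ambiguous characters in IDs (gajus.com)".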

The author notes that this matters whenever people need to write an ID down or communicate it, so that mistakes are avoided:

Any time that the ID might need to be communicated verbally or written down[.]

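So this goes beyond the 0Oo1IiLl problem mentioned earlier: pairs that could plausibly be mistaken for each other, such as 2/Z and 8/B, also have to be avoided.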

If both upper and lower case are allowed, there are 53 usable characters, but if upper and lower case should mean the same thing, only 22 are left:

Assuming that you are going with case sensitivity, you have 53 characters to choose from (adjusted for visually ambiguous characters). On the other hand, if you decide to make your IDs case-insensitive, you have only 22 characters to choose from.

He gives these 22 characters:

[
  "a",
  "b",
  "c",
  "d",
  "e",
  "f",
  "h",
  "i",
  "j",
  "k",
  "m",
  "n",
  "o",
  "p",
  "r",
  "s",
  "t",
  "w",
  "x",
  "y",
  "3",
  "4"
]

He goes on to mention the lookalike problems of rn vs. m and vv vs. w, though a generator that handles those is even more of a pain...
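As a rough sketch of what such a generator might look like (my own illustration, not the author's code), the Java below draws from the 22 characters listed above and skips an n that would land right after an r, so the rn/m lookalike never appears. vv cannot occur here because v is not in this alphabet. The class and method names are made up for this example, and skipping characters slightly biases the distribution.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration; not taken from the article.
class UnambiguousId {
    // The 22 case-insensitive characters listed in the article.
    static final String ALPHABET = "abcdefhijkmnoprstwxy34";

    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds an ID of the requested length that never contains "rn".
    static String generate(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        StringBuilder sb = new StringBuilder(length);
        while (sb.length() < length) {
            char c = ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length()));
            // Redraw when an 'n' would directly follow an 'r' (reads like 'm').
            boolean afterR = sb.length() > 0 && sb.charAt(sb.length() - 1) == 'r';
            if (c == 'n' && afterR) {
                continue;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(10));
    }
}
```

With 22 characters each position carries about 4.46 bits, so a 10-character ID has roughly 44 bits of randomness, slightly less once the rn rule is applied.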

At this point, why not just use digits? Wouldn't that be better UX for users as well?

English names for various special symbols

I saw the list "Pronunciation guide for UNIX" on the Hacker News front page; it gives the English names of various special symbols.

The entries really are on the simple side. For example, his explanation of - is:

But if you look up the definitions of Hyphen and Dash, both point out that these are two different things:

The hyphen is sometimes confused with dashes (figure dash ‒, en dash –, em dash —, horizontal bar ―), which are longer and have different uses, or with the minus sign −, which is also longer and more vertically centred in some typefaces.

The dash is a punctuation mark consisting of a long horizontal line. It is similar in appearance to the hyphen but is longer and sometimes higher from the baseline.
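To make the distinction concrete, here is a small Java sketch (my own, not from the guide) that prints the Unicode code point and official name of the characters mentioned in the quotes above. It relies on the standard Character.getName method; the class name is arbitrary. Note that the - typed on a keyboard is U+002D, which Unicode calls HYPHEN-MINUS.

```java
// Hypothetical helper for illustration; not part of the pronunciation guide.
class DashCodePoints {
    // Hyphen-minus, hyphen, figure dash, en dash, em dash, horizontal bar, minus sign.
    static final int[] MARKS = {0x002D, 0x2010, 0x2012, 0x2013, 0x2014, 0x2015, 0x2212};

    // Formats a code point as "U+XXXX NAME" using the JDK's Unicode name table.
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    public static void main(String[] args) {
        for (int codePoint : MARKS) {
            System.out.println(describe(codePoint));
        }
    }
}
```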

The guide's explanations are probably fine for spoken use, but when writing anything more formal you should still check other sources...

Twitter plans to relax the limit to 280 characters...

Twitter plans to relax the 140-character limit: "Giving you more characters to express yourself".

Japanese, Chinese, and Korean are not included, though XD

We want every person around the world to easily express themselves on Twitter, so we're doing something new: we're going to try out a longer limit, 280 characters, in languages impacted by cramming (which is all except Japanese, Chinese, and Korean).

The announcement also uses Japanese and English as examples, and then compares the two.

Why printing "#" is faster than printing "B"

This is a question from two years ago on StackOverflow: "Why is printing “B” dramatically slower than printing “#”?".

The asker's code below took 8.52 seconds to run:

Random r = new Random();
for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j++) {
        if (r.nextInt(4) == 0) {
            System.out.print("O");
        } else {
            System.out.print("#");
        }
    }

    System.out.println("");
}

Replacing the # above with B made it take 259.152 seconds.

The answer is that it has to do with word-wrapping:

Pure speculation is that you're using a terminal that attempts to do word-wrapping rather than character-wrapping, and treats B as a word character but # as a non-word character. So when it reaches the end of a line and searches for a place to break the line, it sees a # almost immediately and happily breaks there; whereas with the B, it has to keep searching for longer, and may have more text to wrap (which may be expensive on some terminals, e.g., outputting backspaces, then outputting spaces to overwrite the letters being wrapped).

But that's pure speculation.
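If you want to try this on your own terminal, a minimal timing sketch might look like the following. It is my own wrapper around the asker's loop, not code from the question: the class name, the timeGrid method, and the fixed seed (so both runs print the same pattern) are all assumptions. Whether you see any difference depends entirely on how the console you run it in wraps lines.

```java
import java.io.PrintStream;
import java.util.Random;

// Hypothetical timing harness for illustration; not from the StackOverflow question.
class PrintTiming {
    // Prints a size x size grid where about one cell in four is "O" and the
    // rest are `fill`, then returns the elapsed time in milliseconds.
    static long timeGrid(PrintStream out, String fill, int size) {
        Random r = new Random(42); // fixed seed is an assumption, for repeatable runs
        long start = System.nanoTime();
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                out.print(r.nextInt(4) == 0 ? "O" : fill);
            }
            out.println();
        }
        out.flush();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long hashMillis = timeGrid(System.out, "#", 1000);
        long letterMillis = timeGrid(System.out, "B", 1000);
        // Report on stderr so the numbers do not get lost in the grid output.
        System.err.println("# took " + hashMillis + " ms, B took " + letterMillis + " ms");
    }
}
```

Passing the PrintStream in as a parameter also makes it easy to compare against a file or an in-memory buffer, where no terminal wrapping is involved.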

Now that is a real detail XDDD

Twitter announces it will relax the 140-character limit...

It seems a bit different from what was rumored at the time... Anyway, Twitter has announced a relaxation of the 140-character limit: "Coming soon: express even more in 140 characters".

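There had long been rumors that this limit would be lifted, but the final result seems a bit different from what was expected: three kinds of content will no longer count toward the 140 characters. These are the @username in a reply, the URL of an attached photo or video, and the quoted text when quoting a tweet.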

So the 140-character limit is not being lifted entirely; only some of the counting rules are being relaxed...

PuTTY security issue (CVE-2015-5309)

I haven't used PuTTY in a long time (I've been on Ubuntu for years), but it's rare to see PuTTY have a security issue.

PuTTY has officially published a security advisory for CVE-2015-5309: "PuTTY vulnerability vuln-ech-overflow":

Versions of PuTTY and pterm between 0.54 and 0.65 inclusive have a potentially memory-corrupting integer overflow in the handling of the ECH (erase characters) control sequence in the terminal emulator.

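The old problem is still unsolved, though. Downloading over HTTPS (i.e. the certificate authority architecture) has plenty of problems of its own, but it is at least a trust mechanism built on an auditing system, and when you have no trusted environment to start from, it is still the best option: "How to download software securely...".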

OCR for text in images: Project Naptha

This project turns OCR for text in images into a Google Chrome extension. By default the text is recognized up front, so you can select it directly: "Project Naptha".

The official site provides an example in which the text in an image is OCR'd directly and becomes selectable.

The official site says it is client-side JavaScript:

One of the more impressive things about this project is the fact that it's almost entirely written in client side javascript. That means that it's pretty much totally functional without access to a remote server.

By default it does send requests back to a server, though this can be turned off:

By default, when you begin selecting text, it sends a secure HTTPS request which lacks any kind of identifiable information to the Project Naptha cached remote OCR and Translation service. This allows you to recognize text from an image with much more accuracy than otherwise possible. However, this can be disabled simply by checking the "Disable Lookup" item under the Options menu.

That is the "Disable Lookup" option under the Options menu.

This feature is great...