Home » Posts tagged "word"


是一篇老文章了... (2014 年的文章,最近從其他地方提起)

這邊講的是英文,不過同樣方式也可以拿來分析其他語言:「The distribution of letters in English words」,原始文章在「Graphing the distribution of English letters towards the beginning, middle or end of words」。


The data is from the entire Brown corpus in the Natural Language Toolkit. It's a smaller and out-of-date corpus, but it's open source and easy to obtain. I repeated the analysis with COHA, the Corpus of Historical American English, a well-curated, proprietary data set from Brigham Young University for which I have a license, and the only differences were in rare letters like "z" or "x".

印 "#" 比印 "B" 來的快的問題

這篇是兩年前在 StackOverflow 上的問題:「Why is printing “B” dramatically slower than printing “#”?」。

問問題的人這段程式跑了 8.52 秒:

Random r = new Random();
for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j++) {
        if(r.nextInt(4) == 0) {
        } else {


而把上面的 # 換成 B 就變成 259.152 秒。

答案是與 word-wrapping 有關:

Pure speculation is that you're using a terminal that attempts to do word-wrapping rather than character-wrapping, and treats B as a word character but # as a non-word character. So when it reaches the end of a line and searches for a place to break the line, it sees a # almost immediately and happily breaks there; whereas with the B, it has to keep searching for longer, and may have more text to wrap (which may be expensive on some terminals, e.g., outputting backspaces, then outputting spaces to overwrite the letters being wrapped).

But that's pure speculation.

這真是細節 XDDD