Elsevier restricts the University of California's access

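In March, the University of California system (UC) publicly announced that it would not renew its contract because Elsevier would not accept its open access terms (see "University of California announces it will not renew with Elsevier"). Elsevier apparently wanted to see whether there was still a chance of continuing to work together, so it kept providing service to the UC system in the meantime.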

A few days ago I saw "Elsevier cuts off UC’s access to its academic journals (latimes.com)" on Hacker News, so Elsevier has finally decided to act: "In act of brinkmanship, a big publisher cuts off UC’s access to its academic journals".

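It did not pull the plug entirely, though. Instead it restricted access so that new material is no longer visible (with 2019/01/01 as the cutoff):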

As of Wednesday, Elsevier cut off access by UC faculty, staff and students to articles published since Jan. 1 in 2,500 Elsevier journals, including respected medical publications such as Cell and the Lancet and a host of engineering and scientific journals. Access to most material published in 2018 and earlier remains in force.

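The business model UC proposed has authors bear the cost while readers pay nothing, which is exactly the opposite of the current model. UC's model encourages "the dissemination of knowledge", whereas the existing model works the other way around and hopes to make a fortune from disseminating knowledge: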

UC demanded that the new contract reflect the principle of open access — that work produced on its campuses be available to all outside readers, for free.

That was a direct challenge to the business model of Elsevier and other big academic publishers. Traditionally, the publishers accept papers for publication for free but charge steep subscription fees. UC is determined to operate under an alternative model, in which researchers pay to have their papers published but not for subscriptions.

I also saw in the Hacker News comments that some other projects are under way. For example, Europe's "Plan S" is also pushing open access:

The plan requires scientists and researchers who benefit from state-funded research organisations and institutions to publish their work in open repositories or in journals that are available to all by 2021.

Also, "PubPub · Community Publishing" is a rather interesting project in the open source space, and it looks like quite a few academic institutions are backing it.

Cloudflare launches Spectrum: a proxy that can forward all 65535 TCP ports...

Cloudflare has launched Spectrum. The 65,533 in the article title presumably refers to the ports other than 80 and 443: "Introducing Spectrum: Extending Cloudflare To 65,533 More Ports".

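Because a TCP proxy, unlike an HTTP proxy or a WebSocket proxy, cannot rely on the Host header to tell services apart, it needs a dedicated IP address (i.e. one IP address can serve only one customer). And because IPv4 addresses are in short supply, the feature is open only to Enterprise customers: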

Today we are introducing Spectrum, which brings Cloudflare’s security and acceleration to the whole spectrum of TCP ports and protocols for our Enterprise customers.

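Although it is currently limited to Enterprise customers, Cloudflare still wants to hear other ideas. The options floated so far include offering IPv6 addresses to everyone, or turning the IPv4 address into a separately paid item: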

Why just Enterprise? While HTTP can use the Host header to identify services, TCP relies on each service having a unique IP address in order to identify it. Since IPv4 addresses are endangered, it’s quite expensive for us to delegate an IP per application and we needed to limit use. We’re actively thinking about ways to bring Spectrum to everyone. One idea is to offer IPv6-only Spectrum to non-Enterprise customers. Another idea is let anyone use Spectrum but pay for the IPv4 address. We’re not sure yet, but if you prefer one to the other, feel free to comment and let us know.
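To make the Host header point concrete, here is a minimal sketch of how an HTTP proxy could pick a backend from the request itself. The `BACKENDS` table, the hostnames, and the function names are all hypothetical and only for illustration; this is not how Cloudflare implements it. An arbitrary TCP stream carries no such field, which is why the destination IP address has to do the identifying.

```python
from typing import Optional

# Hypothetical routing table: hostname -> backend address (illustration only).
BACKENDS = {"a.example.com": "10.0.0.1", "b.example.com": "10.0.0.2"}


def extract_host(request: bytes) -> Optional[str]:
    """Return the Host header of a raw HTTP/1.x request, or None if absent."""
    head = request.split(b"\r\n\r\n", 1)[0]
    # Skip the request line, then scan the header lines.
    for line in head.split(b"\r\n")[1:]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace")
    return None


def route_http(request: bytes) -> Optional[str]:
    """Pick a backend by hostname; many customers can share one IP this way."""
    host = extract_host(request)
    return BACKENDS.get(host) if host is not None else None
```

A payload from some other TCP protocol simply yields `None` here, so a shared IP address gives the proxy nothing to route on.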

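The comparable products are probably clean pipe services, but a typical clean pipe redirects traffic through routing to scrub it, rather than being designed the way Cloudflare's is... It remains to be seen how this will develop.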

A 2011 study on the association between open-plan offices and sick leave

I forget where this link came from, but it is a 2011 study conducted in Denmark: "Sickness absence associated with shared and open-plan offices--a national cross sectional questionnaire survey.":

METHODS: The analysis was based on a national survey of Danish inhabitants between 18-59 years of age (response rate 62%), and the study population consisted of the 2403 employees that reported working in offices. The different types of offices were characterized according to self-reported number of occupants in the space. The log-linear Poisson model was used to model the number of self-reported sickness absence days depending on the type of office; the analysis was adjusted for age, gender, socioeconomic status, body mass index, alcohol consumption, smoking habits, and physical activity during leisure time.

All comparisons are against cellular offices, and open-plan offices with more than six people show markedly more sick leave:

RESULTS: Sickness absence was significantly related to having a greater number of occupants in the office (P<0.001) when adjusting for confounders. Compared to cellular offices, occupants in 2-person offices had 50% more days of sickness absence [rate ratio (RR) 1.50, 95% confidence interval (95% CI) 1.13-1.98], occupants in 3-6-person offices had 36% more days of sickness absence (RR 1.36, 95% CI 1.08-1.73), and occupants in open-plan offices (>6 persons) had 62% more days of sickness absence (RR 1.62, 95% CI 1.30-2.02).

CONCLUSION: Occupants sharing an office and occupants in open-plan offices (>6 occupants) had significantly more days of sickness absence than occupants in cellular offices.

It looks like they just pulled the numbers and analyzed them... Also, the confidence intervals are really wide XD

VaultPress's new plans

VaultPress is a paid service for WordPress that can back up a self-hosted WordPress site.

I just saw that new plans are out: "Announcing Streamlined Plans — at Lower Prices". Jetpack Personal now includes what used to be VaultPress Lite, at a lower price:

At $3.50 per month, the Jetpack Personal plan includes everything the old VaultPress Lite plan used to — at a price that’s 30% lower.

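If you are using it, remember to go in and change your plan. Also watch when the change takes effect, and switch only when your existing Lite plan is about to expire.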

AWS raises the price of the Developer Support Plan

The Developer Support Plan in AWS Support has gone up in price: "AWS Support announces update to Developer Support plan".

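It used to be a flat $49/month. Now it is $29/month or 3% of monthly AWS spend, whichever is higher. In other words, once spend exceeds about USD$966.67/month the fee is calculated at 3%, and the new fee only exceeds the old $49 once monthly spend passes about USD$1,633: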

A new pricing model for our Developer Support plan has been launched, reducing the entry cost from $49 per month to $29 per month, while providing the same level of customer service and support. As of July 26th, 2016 all new AWS accounts subscribing to Developer Support will receive the new pricing, set at the greater of $29 or 3% of monthly AWS spend.
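The "greater of $29 or 3%" rule is easy to check with a few lines of arithmetic. This is only a sketch of the rule as quoted above; the function names are made up for illustration and the real bill may involve details the announcement does not mention.

```python
def developer_support_fee(monthly_spend: float) -> float:
    """Fee under the new model: the greater of $29 or 3% of monthly AWS spend."""
    return max(29.0, 0.03 * monthly_spend)


def spend_where_percentage_reaches(fee: float, rate: float = 0.03) -> float:
    """Monthly spend at which `rate` of spend equals the given flat fee."""
    return fee / rate


# 3% overtakes the $29 floor at about $966.67 of monthly spend,
# and overtakes the old flat $49 at about $1,633.33.
floor_break_even = round(spend_where_percentage_reaches(29.0), 2)
old_price_break_even = round(spend_where_percentage_reaches(49.0), 2)
```

So accounts spending less than about $1,633 per month pay less than before, and accounts above that pay more.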

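This is probably because many production users subscribe only to this plan (just to be able to open tickets). That said, the announcement only mentions "all new AWS accounts subscribing to Developer Support" and does not say what happens to existing users...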

The problems with open-plan offices

The two types of open plan, from the Wikipedia article:

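However, how well it works once actually implemented in the IT industry has always been in question. In early 2014, someone wrote an article discussing it and came to a negative conclusion: "The Open-Office Trap".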


In 2011, the organizational psychologist Matthew Davis reviewed more than a hundred studies about office environments. He found that, though open offices often fostered a symbolic sense of organizational mission, making employees feel like part of a more laid-back, innovative enterprise, they were damaging to the workers’ attention spans, productivity, creative thinking, and satisfaction. Compared with standard offices, employees experienced more uncontrolled interactions, higher levels of stress, and lower levels of concentration and motivation. When David Craig surveyed some thirty-eight thousand workers, he found that interruptions by colleagues were detrimental to productivity, and that the more senior the employee, the worse she fared.


An open environment may even have a negative impact on our health. In a recent study of more than twenty-four hundred employees in Denmark, Jan Pejtersen and his colleagues found that as the number of people working in a single room went up, the number of employees who took sick leave increased apace. Workers in two-person offices took an average of fifty per cent more sick leave than those in single offices, while those who worked in fully open offices were out an average of sixty-two per cent more.

It was brought up again recently (late 2014), basically citing the negative analysis from that early-2014 article: "Google got it wrong. The open-office trend is destroying the workplace.".

The article suggests several workarounds (because of the cost of rebuilding an office...), including creating private areas:

For one, they should create more private areas — ones without fishbowl windows.


For instance, when a colleague has on headphones, it’s a sign that you should come back another time or just send an e-mail.


On the other hand, companies could simply join another trend — allowing employees to work from home. That model has proven to boost productivity, with employees working more hours and taking fewer breaks. On top of that, there are fewer interruptions when employees work remotely.

These approaches are only workarounds, though. It would be better if the office could avoid the problem at the planning stage, like this:

Puffin Browser - CloudMosa - office seating area @ wens's album :: PIXNET ::

T-Mobile offers flat-rate international roaming...

T-Mobile is about to launch a plan with flat-rate international roaming: "T-Mobile to offer free unlimited international data, texts".


T-Mobile's latest shake-up puts international roaming rates in its sights, with the carrier eliminating fees for data and text messages entirely in more than 100 countries. It will also simplify the calling rates, charging a fixed rate of 20 cents per minute.

For voice, is that about NT$6 per minute?

The unlimited international data does come with a speed limit, though:

Chief Marketing Officer Mike Sievert said the average speed customers would get would be around 128 kilobits a second.


T-Mobile hopes to make some of its money back with "speed packs" that customers can purchase on the fly to boost their connection speed temporarily. For $15, a customer gets a single day's worth of high-speed data up to 100 megabytes, while $20 gets one week at 200 megabytes and $50 gets two weeks and 500 megabytes. The speeds are more accustomed to T-Mobile's HSPA+ networks, since international roaming with LTE isn't broadly available.

The official "Simple Choice International Plans | Unlimited Data & Text | T-Mobile" page spells it out more clearly:

Only T-Mobile’s network can give you unlimited everything for everyone. Other carriers may even make you share minutes, messages, and data between you and your family. With the Simple Choice Plan, each line comes with unlimited talk, text, and data while on our home network—and starting October 20, unlimited data and text in over 100 countries at no extra charge. That means no overages just about anywhere you go. Plus, there's no annual service contract and it's easy to add additional lines.



The very classic UTF-8...

I saw "UTF-8 – “The most elegant hack”" in a Hacker News digest. Besides the material on Wikipedia, the mail that Rob Pike and others wrote in 2003 is also an important source.

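It is no exaggeration to call UTF-8, developed by Ken Thompson and Rob Pike, the most elegant hack. Unicode 1.0 was published in October 1991, and after that various encoding formats started to appear one after another...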

UTF-1, which is compatible with ASCII 0-127, was proposed in 1992, but its parsing performance was poor.

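In July 1992, Dave Prosser proposed FSS-UTF, which was very similar to the later UTF-8 but lacked the self-synchronizing property (meaning that a character boundary can easily be found starting from the middle of a string).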

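In August 1992, Ken Thompson improved FSS-UTF, making its bit usage slightly less efficient but gaining the self-synchronizing property in exchange. Then in September 1992, Rob Pike and Ken Thompson implemented UTF-8 in Plan 9. UTF-8 was formally presented in public at USENIX in January 1993.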

UTF-8's design looks like a hack, yet it has these elegant properties:


A string containing only ASCII 0-127 is a valid UTF-8 string.

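The key point is that 0 is preserved, which means existing NUL-terminated string handling can all be reused. Everything from C's strcpy() to the many network protocols that have been running for a long time can keep working as before.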
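A quick sketch of why NUL-terminated handling survives: ASCII text encodes to the same bytes, and every byte of a non-ASCII character has its high bit set, so a zero byte can only ever mean U+0000. The helper names here are made up for illustration.

```python
def ascii_is_unchanged(text: str) -> bool:
    """Pure-ASCII text has identical ASCII and UTF-8 encodings."""
    return text.encode("utf-8") == text.encode("ascii")


def all_bytes_high(ch: str) -> bool:
    """Every byte of a non-ASCII character's UTF-8 encoding is >= 0x80."""
    return all(b >= 0x80 for b in ch.encode("utf-8"))


def has_nul_byte(text: str) -> bool:
    """A 0x00 byte appears only if the text itself contains U+0000."""
    return 0 in text.encode("utf-8")
```

Because no multi-byte sequence contains 0x00 (or any other ASCII byte), code that scans for a terminator or an ASCII delimiter byte by byte keeps behaving correctly on UTF-8 input.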


UTF-8 is easy to detect. Quoting a figure from Wikipedia:

The probability of a random string of bytes which is not pure ASCII being valid UTF-8 is 3.9% for a two-byte sequence, and decreases exponentially for longer sequences.

Once a non-ASCII string has a bit of length (four Chinese characters, 12 bytes?), the accuracy of judging whether it is UTF-8 should rival the SLA of various services...
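The two-byte figure can be reproduced by brute force. The sketch below leans on Python's strict built-in UTF-8 decoder as the validity check (an assumption about what "valid" means here; it rejects overlong forms and surrogates) and counts every two-byte sequence that is not pure ASCII.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if `data` decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def valid_fraction_of_two_byte_sequences() -> float:
    """Fraction of non-pure-ASCII two-byte sequences that are valid UTF-8."""
    total = valid = 0
    for first in range(256):
        for second in range(256):
            if first < 0x80 and second < 0x80:
                continue  # pure ASCII, excluded by the quoted statistic
            total += 1
            if is_valid_utf8(bytes([first, second])):
                valid += 1
    return valid / total  # 1920 / 49152, about 3.9%
```

The only valid cases are a lead byte 0xC2-0xDF followed by a continuation byte 0x80-0xBF, which gives 30 × 64 = 1920 out of 49152 candidates.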

Compatible with Unicode ordering

The order of Unicode code point numbers is preserved by UTF-8, which means even the traditional strcmp() can be used directly:

Sorting a set of UTF-8 encoded strings as strings of unsigned bytes yields the same order as sorting the corresponding Unicode strings lexicographically by codepoint.
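This property can be checked on a handful of strings. Python compares `str` values by code point and `bytes` values as unsigned bytes, so the two sorts below should agree; the sample words and the function name are arbitrary choices for illustration.

```python
def sorts_the_same(strings: list) -> bool:
    """Compare code-point order with unsigned-byte order of the UTF-8 form."""
    by_code_point = sorted(strings)
    by_bytes = sorted(s.encode("utf-8") for s in strings)
    return by_code_point == [b.decode("utf-8") for b in by_bytes]


# Mix of 1-, 2-, 3- and 4-byte characters.
SAMPLE = ["z", "é", "中", "a", "😀", "ß", "ab", "a中"]
```

Note that this is code point order, not a locale-aware collation. It is the same order a byte-wise comparison such as strcmp() would produce.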

Avoids the UTF-16 BOM

The BOM bytes 0xFE and 0xFF never appear in a valid UTF-8 document, so if you see a UTF-16 BOM at the start, it is definitely not UTF-8:

The bytes 0xFE and 0xFF do not appear, so a valid UTF-8 stream never matches the UTF-16 byte order mark and thus cannot be confused with it.

The self-synchronizing property

Thanks to the way the encoding works, finding the next boundary in a UTF-8 string is easy:

Any byte that matches one of these six leading bit patterns is a boundary. This is also why the encoding is highly tolerant of errors.
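Resynchronizing can be sketched in a few lines. Continuation bytes always have the form 10xxxxxx, so any other byte starts a character; the function name below is made up for illustration.

```python
def next_boundary(data: bytes, pos: int) -> int:
    """Return the first index >= pos that is not a continuation byte.

    Continuation bytes look like 10xxxxxx, so masking with 0xC0 and
    comparing against 0x80 identifies them.
    """
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


# "a中b" encodes to 61 E4 B8 AD 62: the middle character takes three bytes.
SAMPLE = "a中b".encode("utf-8")
```

Landing in the middle of a character costs at most a few skipped bytes, and a corrupted byte damages only the character it belongs to.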

Can hold every Unicode character

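Also thanks to the encoding, UTF-8 can in theory hold over a million characters (1,112,064 under the extra restrictions in RFC 3629). Until we find a lot of alien civilizations, that should be enough. (Unicode 6.2, released in 2012, has only 110,182 characters.)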
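The 1,112,064 figure follows from the RFC 3629 limits: code points run from U+0000 to U+10FFFF, and the surrogate range U+D800 to U+DFFF is excluded. A small worked check, with constant names chosen here for illustration:

```python
MAX_CODE_POINT = 0x10FFFF           # highest code point allowed by RFC 3629
SURROGATES = range(0xD800, 0xE000)  # reserved for UTF-16, not encodable

# 0x110000 code points in total, minus the 2048 surrogates.
ENCODABLE = (MAX_CODE_POINT + 1) - len(SURROGATES)
```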

Converting between Unicode and UTF-8 is easy

Again thanks to the encoding, the conversion can be done almost entirely with bit operations. (Note that the Last code point values all line up.)
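As a rough illustration of the bit operations involved, here is a hand-written encoder for a single code point. It follows the four-byte limit of RFC 3629 rather than the original six-byte design, and the function name is made up; in practice `str.encode("utf-8")` does this for you.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using shifts and masks."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if cp < 0x80:         # 7 bits:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:        # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:      # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

The range limits 0x7F, 0x7FF and 0xFFFF are exactly the values with 7, 11 and 16 one-bits, which is why the last code point of each length lines up so neatly.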


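This is a standard with almost no discernible flaws, so many early programmers chose it simply because they "liked it on sight", which led to a large number of libraries. After that, a large number of standards (including the XML standard) stated outright that the ability to handle UTF-8 is a requirement.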


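The UTF-8 encoding looks like a hack any way you look at it (it seems to carve different Unicode ranges into a variable-width character set rather arbitrarily, with no obvious special treatment), yet it solves the problem of "staying compatible with existing systems when 8 bits can be handled" very neatly. And because this encoding solves the problem so cleanly (what UTF-8 cannot solve, nobody else solves cleanly either), it became the overwhelmingly mainstream encoding...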

Back in February 2012 Google wrote "Unicode over 60 percent of the web", and that is with ASCII's 20% excluded!

It is now almost the end of 2013, and we can expect UTF-8 to reign from here on...

For more detail, see Wikipedia's "Comparison of Unicode encodings", which compares it with the other Unicode formats.

Hosting Plan

The article I wrote on Ptt in 2009 mentioned quite a few hosting plans, and by now, in 2012, a lot has changed...

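For VPS, Linode has always offered excellent CPU price/performance. Its smallest plan went from 384MB RAM to 512MB RAM (same price, still about USD$20/month), and with swap available it should be usable enough even for running Apache and MySQL.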

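Another change at Linode is the new Tokyo data center. Latency between the Tokyo data center and the various ISPs in Taiwan is quite good. When choosing a VPS for a primarily Taiwanese audience, a Linode host in Tokyo is usually a good choice. If you need High Availability on Linode, NodeBalancer is also available (2011, "Introducing NodeBalancer").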

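Linode is already a decent option, but if latency matters a great deal to you (there is about 30ms of latency between Japan and Taiwan) and you must use a VPS inside Taiwan (for a game server, say), then Chunghwa Telecom's HiCloud is an option worth considering.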

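For PaaS, the best-known option at the moment is Heroku. It supports many languages and has a free quota, so it should be fine for small sites or test applications, though for now it runs only out of the US East us-east-1 region (Heroku is built on AWS). AWS Elastic Beanstalk is also worth a look. It currently supports two languages, Java and PHP, and is available in more regions.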

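For IaaS, AWS offers more features, and the AWS Tokyo region is reasonably fast from Taiwan. AWS S3 is also still the leader in cloud static storage, so even if you use a VPS or Dedicated Hosting, it is still worth considering putting some things on S3.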

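For Dedicated Hosting, the ones I would still pick are Energy Group Networks (EGIHosting) on the US West Coast and Limestone Networks. When you need a lot of bandwidth, browse the Dedicated Servers page or email EGIHosting for a quote. With a 1Gbps commit, the price including the machine is about USD$1/Mbps (NTD$30/Mbps), and the bandwidth to Taiwan probably comes in via HE or nLayer.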

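As for CDN services, my advice is that if you do not know which one is better, just do not use one... Come back and decide after you have started profiling and analyzing.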