維基基金會的 2014 年八月月報

維基基金會釋出八月月報 (好像晚了三個月?):「Wikimedia Foundation Report, August 2014」,在「Wikimedia Highlights, August 2014」有比較精簡的版本。

維基基金會在報告裡有提供一些 PV 相關的數據,包括 comScore 的數字與自己 server log 所統計出來的數據。另外也包含了財務狀況。

其中技術相關的是取自「Wikimedia Engineering/Report/2014/August」這頁。另外因為這是八月的資料,我順便偷看了九月與十月的「Wikimedia Engineering/Report/2014/September」與「Wikimedia Engineering/Report/2014/October」。

可以看到在測試 HHVM 的計畫,而且目前看起來還不錯:「[Wikitech-l] [Engineering] Migrating test.wikipedia.org to HHVM」,拿了 test.wikipedia.org 測試,其中 speed test 的部份有大幅改善:

1) Speed test: measure the time taken to request the page 1000 times over just 10 concurrent connections:

                        HHVM    Zend    diff
Mean time (ms):         233     441     -47%
99th percentile (ms):   370     869     -57%
Request/s:              43      22.6    +90%

而負載測試的成果更好:

2) Load test: measure how much thoughput we obtain when hogging the appserver with 50 concurrent requests for a grand total of 10000 requests. What I wanted to test in this case was the performance degradation and the systems resource consumption

                        HHVM    Zend    diff
Mean time (ms):         355     906     -61%
99th percentile (ms):   791     1453    -45%
Request/s:              141     55.1    +156%
Network (Mbytes/s)      17      7       +142%
RAM used (GBs):         5(1)    11(4)
CPU usage (%):          90(75)  100(90)

維基百科之所以沒有遇到太多問題,主要是因為所使用的軟體是 open source 而且夠大的關係,直接成為 HHVM 測試的一環:「Compatibility Update」。

不過目前看起來應該還是跑 PHP,沒有看到整個都轉換過去的計畫。

另外一方面,搜尋引擎的更換就沒有這麼順利,雖然換到 Elasticsearch 後改善不少,不過可以看到八月的報告這樣寫:

tarted deploying Cirrus as the primary search back-end to more of the remaining wikis and we found what looks like our biggest open performance bottleneck. Next month's goal is to fix it and deploy to more wikis (probably not all). We're also working on getting more hardware.

而九月時就講到沒有銀彈,要加硬體去拼:

In September we worked to mitigate the performance bottleneck that we found in August. We found there to be no silver bullet but used the information we learned to pick and order appropriate hardware to handle the remaining wikis. We also implemented out significantly improved wikitext Regular Expression search. In October we've begun rolling out the wikitext Regular Expression search and received some of the hardware we need to finish cutting over the remaining wikis. We believe we'll get it all installed in October and cut the remaining wikis over in November.

十月的時候弄到機器了:

In October we prepared for November in which we deployed Cirrus to all the remaining wikis by installing new servers installing new versions of Elasticsearch and our plugins. We also fixed up regex search which had caused a search outage.

這些報告的連結裡面其實有些不會在對外新聞稿上面的評語... XD

Leave a Reply

Your email address will not be published. Required fields are marked *