Backblaze 的 2019 年度硬碟報告

Backblaze 丟出去年的報告了:「Backblaze Hard Drive Stats for 2019」。

WD/HGST 的還是最耐用,再來是 Toshiba 的,最後是 Seagate 的。

不過有一些硬碟沒有列到表上,像是「Seagate 16 TB Drives」這組因為 2019Q4 才剛裝上去,所以才 1440 drive days,因此還沒到門檻所以沒放進報告,但就 Backblaze 測試起來看起來是個好的開始:

In Q4 2019 we started qualifying Seagate 16 TB drives, model: ST16000NM001G. As of the end of Q4 we had 40 (forty) drives in operation, with a total of 1,440 drive days—well below our 5,000 drive day threshold for Q4, so they didn’t make the 2019 chart. There have been 0 (zero) failures through Q4, making the AFR 0%, a good start for any drive. Assuming they continue to pass our drive qualification process, they will be used in the 12 TB migration project and to add capacity as needed in 2020.

再來是把 2017/2018/2019 擺在一起看:

馬上可以看到的是 AFR 上升了不少,一個是因為 8TB 系列的硬碟進入中年期,另外是 Seagate 12TB 硬碟的問題:

The total AFR for 2019 rose significantly in 2019. About 75% of the different drive models experienced a rise in AFR from 2018 to 2019. There are two primary drivers behind this rise. First, the 8 TB drives as a group seem to be having a mid-life crisis as they get older, with each model exhibiting their highest failure rates recorded. While none of the rates is cause for worry, they contribute roughly one fourth (1/4) of the drive days to the total, so any rise in their failure rate will affect the total. The second factor is the Seagate 12 TB drives, this issue is being aggressively addressed by the 12 TB migration project reported on previously.

所以大原則還是跟以前差不多,沒有時間特別研究的話就先往 WD/HGST 這邊找...

Backblaze 採購硬碟的策略

在「How Backblaze Buys Hard Drives」這篇裡面提到了 Backblaze 採購硬碟的策略,可以看到完全都是偏成本走向,所以裡面的策略一般個人用不太到,一般企業也不應該照抄,但拿來看看還蠻有趣的...

像是因為硬碟太多,所以硬碟的使用電量是他們在評估成本時蠻重要的一環,這點在一般的情境下不太會考慮到:

Power draw is a very important metric for us and the high speed enterprise drives are expensive in terms of power cost. We now total around 1.5 megawatts in power consumption in our centers, and I can tell you that every watt matters for reducing costs.

另外也提到了 SMR 硬碟的特性,在單位成本雖然有比較高的容量,但導致架構面需要配合 (cache),而也會有工程端的成本提昇,所以不是很愛:

SMR would give us a 10-15% capacity-to-dollar boost, but it also requires host-level management of sequential data writing. Additionally, the new archive type of drives require a flash-based caching layer. Both of these requirements would mean significant increases in engineering resources to support and thereby even more investment. So all-in-all, SMR isn’t cost-effective in our system.

成本面上,他們觀察到的現象是每季會降 5%~10%:

Ideally, I can achieve a 5-10% cost reduction per terabyte per quarter, which is a number based on historical price trends and our performance for the past 10 years.

另外提到了用 SAS controller 可以接多個 SATA 硬碟的事情 (雖然還是成本考量),但這塊也蠻有趣的:

Longer term, one thing we’re looking toward is phasing out SATA controller/port multiplier combo. This might be more technical than some of our readers want to go, but: SAS controllers are a more commonly used method in dense storage servers. Using SATA drives with SAS controllers can provide as much as a 2x improvement in system throughput vs SATA, which is important to me, even though serial ATA (SATA) port multipliers are slightly less expensive. When we started our Storage Pod construction, using SATA controller/port multiplier combo was a great way to keep costs down. But since then, the cost for using SAS controllers and backplanes has come down significantly.

AWS 的 EBS 預設型態改為 GP2 (SSD)

AWS 宣佈 EBS 的預設型態從 Standard 變成 GP2:「EBS default volume type updated to GP2」。

包括 web console 與 API 的預設值都改成 GP2:

The AWS console defaults to GP2 in all regions. On July 29th the default EBS volume type was updated in thirteen regions from Standard to GP2. Now AWS API calls for volume, image, and instance creation also default to GP2 in all regions.

GP2 是 SSD,所以可以提供比較低的 latency,而另外一個用 GP2 的好處是 i/o 的費用已經含在內了 (Standard 會另外收取費用),對於成本估算會比較簡單一些,尤其是 i/o 量比較大的時候。

Backblaze 在 2018 Q2 的硬碟故障率報告

Backblaze 照慣例發表了 2018 Q2 的硬碟狀況:「Hard Drive Stats for Q2 2018」。可以看出來他們基本上只用 SeagateHGST 了,其他的應該是嘗試性質:

第二張因為把信心區間標出來,會更清楚:

WD 差了一大截,所以現在消費級的硬碟會選 Seagate 或是 HGST?

Backblaze 的 2017 年硬碟年度報告

Backblaze 照慣例發表了 2017Q4 與 2017 全年的硬碟報告出來了:「Backblaze Hard Drive Stats for 2017」。

最重要就這三張圖表,第一張是 2017Q4 資料,第二張是從 2013/04 到 2017/12 的資料,第三張是這三年的資料 (2015/2016/2017):

我先說一下結論,因為這幾年幾乎都只採購 SeagateHGST 的硬碟,所以要用他們的資料判斷 WDToshiba 的硬碟已經沒有價值了。

唯一有價值的資料是 HGST 的硬碟比 Seagate 好不少,要做出其他結論的樣本數都不夠。

The DUHK Attack:因為亂數產生器的問題而造成的安全漏洞

Bruce Schneier 那邊看到的:「Attack on Old ANSI Random Number Generator」,攻擊的網站在「The DUHK Attack」,論文在「Practical state recovery attacks against legacy RNG implementations (PDF)」。

攻擊的對象是 ANSI X9.31 Random Number Generator:

DUHK (Don't Use Hard-coded Keys) is a vulnerability that affects devices using the ANSI X9.31 Random Number Generator (RNG) in conjunction with a hard-coded seed key.

然後攻擊的對象是 FortinetFortiOS

Traffic from any VPN using FortiOS 4.3.0 to FortiOS 4.3.18 can be decrypted by a passive network adversary who can observe the encrypted handshake traffic.

如果照說明的只到 4.3.18,那麼去年 11 月更新的 4.3.19 (參考「FortiOS 4.3.19 Release Notes」) 應該是修正了?不過裡面沒翻到類似的資料,是剛好把 RNG 換掉了嗎?

Backblaze 2017Q1 對硬碟的分析

Backblaze 放出 2017Q1 對硬碟的分析資料:「Hard Drive Stats for Q1 2017」。

相較於之前的報告 (Backblaze Hard Drive Stats for 2016),這次則是把硬碟數量考慮進去,做了一份有正負誤差的:

最近的趨勢沒什麼變,整體上來看 HGST 的品質還是最好的。

Backblaze 對 2016 硬碟損耗狀況的分析

Backblaze 針對 2016 年的硬碟損耗狀況進行分析,這次因為剛好是跨年,所以包括了 2016Q4 與 2016 全年度的資料:「Backblaze Hard Drive Stats for 2016」。

大家比較有興趣的重點應該是全年度的廠牌比較:

HGST 的品質還是很厲害,另外也可以看到 Backblaze 愈來愈不使用 WD 的硬碟了...

ING Bank 在羅馬尼亞的機房出事...

ING Bank 在羅馬尼亞的機房發生資料損毀:「A Loud Sound Just Shut Down a Bank's Data Center for 10 Hours」。

不過原因是因為火災測試時噴發的音量太大,導致硬碟故障 XDDD

ING Bank’s main data center in Bucharest, Romania, was severely damaged over the weekend during a fire extinguishing test. In what is a very rare but known phenomenon, it was the loud sound of inert gas being released that destroyed dozens of hard drives. The site is currently offline and the bank relies solely on its backup data center, located within a couple of miles’ proximity.

報導給了個測試影片,示範超大的音量會對硬碟有什麼影響: