t3 也可以上 Dedicated Single-Tenant Hardware 了

AWS 宣佈 t3 系列的機器也可以上 Dedicated Single-Tenant Hardware 了,也就是實體的機器不與其他人共用:「New – T3 Instances on Dedicated Single-Tenant Hardware」。

會需要避免共用實體機器,其中一種常見的是需求是 compliance,主要是在處理資料 (尤其是敏感資料) 時要求實體隔離,以降低 side-channel attack 或是類似攻擊的風險:

Our customers use Dedicated Instances to further their compliance goals (PCI, SOX, FISMA, and so forth), and also use them to run software that is subject to license or tenancy restrictions.

另外一種情境是 AWS 的美國政府區,直接與一般商業區的系統切開,不過這也得有經濟規模才有辦法這樣玩...

AWS 開始推自己的 Machine Learning Chip

除了常見的 GPU 類,以及之前公佈過的 FPGA 外,這次 AWS 推出的是自己做的晶片 AWS Inferentia,以及對應到 EC2 上的機種 inf1:「Amazon EC2 Update – Inf1 Instances with AWS Inferentia Chips for High Performance Cost-Effective Inferencing」。

從介紹可以看到支援的形式:

Each AWS Inferentia chip supports up to 128 TOPS (trillions of operations per second) of performance at low power to enable multiple chips per EC2 instance. AWS Inferentia supports FP16, BF16, and INT8 data types. Furthermore, Inferentia can take a 32-bit trained model and run it at the speed of a 16-bit model using BFloat16.

然後常見的框架都先弄好支援了:

AWS Inferentia comes with the AWS Neuron software development kit (SDK) that enables complex neural net models, created and trained in popular frameworks to be executed using AWS Inferentia based EC2 Inf1 instances. Neuron consists of a compiler, run-time, and profiling tools and is pre-integrated into popular machine learning frameworks including TensorFlow, Pytorch, and MXNet to deliver optimal performance of EC2 Inf1 instances.

現在看起來類似於 Google 弄的 TPU,專為 machine learning 搞出來的 ASIC,等一陣子應該就會有兩者的比較了...

AWS Outposts 總算要開始出貨了

去年 AWSre:Invent 喊的 AWS Outposts 總算是有東西要出貨了:「AWS Outposts Now Available – Order Yours Today!」。

放在自家實體的機櫃,然後掛到 AWS 上變成一個特殊的 region。目前一個特殊的 region 只能放 16 個機櫃,但預期之後可以更多:

Capacity Expansion – Today, you can group up to 16 racks into a single capacity pool. Over time we expect to allow you to group thousands of racks together in this manner.

不過要注意的是,需要有 AWS Enterprise Support 才能下單,而且看起來硬體的維修也包在內了:

Support – You must subscribe to AWS Enterprise Support in order to purchase an Outpost. We will remotely monitor your Outpost, and keep it happy & healthy over time. We’ll look for failing components and arrange to replace them without disturbing your operations.

看了一下價錢的頁面,如果以北美的 upfront 來算,最便宜的是 OR-L8IF4WFOR-I0OGL02 的 USD$225,504.81,最貴的是 OR-HSZHMMF 的 USD$898,129.52,暫時應該用不到 XDDD

Amazon EC2 推出了新一代的 ARM 系統

Amazon EC2 推出了新一代的 ARM 系統:「Coming Soon – Graviton2-Powered General Purpose, Compute-Optimized, & Memory-Optimized EC2 Instances」。

目前的 a1 系列最大到 32GB RAM,這次推出來的算是比較大台的機器,而且與 x86-64 架構相同,分化成 m/c/r 系列了:

  • General Purpose (M6g and M6gd) – 1-64 vCPUs and up to 256 GiB of memory.
  • Compute-Optimized (C6g and C6gd) – 1-64 vCPUs and up to 128 GiB of memory.
  • Memory-Optimized (R6g and R6gd) – 1-64 vCPUs and up to 512 GiB of memory.

預定是 2020 年推出:

I will have more information to share with you in 2020.

不過如果目前想要玩的話,可以找 AWS 申請 m6g 的機器先測試看看:

M6g Preview
We are now running a preview of the M6g instances for testing on non-production workloads; if you are interested, please contact us.

價錢好像也還沒出來,先放著等新消息好了...

家裡電腦裝 Ubuntu 18.04

上個禮拜四家裡的桌機開不了機,找了一天發現是系統的 SSD 掛掉了,就買了張 M.2 SSD,然後計畫順便把本來的 Ubuntu 16.04 升級到 Ubuntu 18.04,但 Ubuntu 18.04 把預設的界面從 Unity 換成 GNOME (然後披上 Unity 的皮),加上前陣子系統從 Intel 平台換到 AMD,整個狀況變得超混亂之後,就變成一連串踩地雷的過程...

最一開始是 UEFI + LUKS 的安裝問題,本來想裝到 M.2 SSD 上面,但 Ubuntu 18.04 的 grub-install 就是硬寫到 /dev/sda 不能改:「“Unable to install GRUB in /dev/sda” when installing GRUB」,照著這篇的 workaround 用還是不行,最後放棄,直接生一顆 SATA SSD 接到 SATA Port 1,把 M.2 當作資料碟。

硬體相關的問題:

軟體相關的問題:

  • 目前不支援從 GUI 設定 PPPoE 的網路 (沃槽),幾種方式裡面我推薦用 pppoeconf 設定會比較好,然後可以改 /etc/ppp/options 加上 IPv6 的設定。
  • 本來想裝 gnome-shell-extension-system-monitor 觀察系統狀態,但會造成系統超級卡,關掉後就變成普通的卡 (後來就找到 Intel I211-AT 的那個問題了)。

現在至少是堪用的程度了,接下來就是不斷的補各種設定...

EC2 的 ARM 也支援 bare metal 版本了...

AWSARM 平台的採用速度比想像中快,昨天發表了 bare metal 版本 a1.metal:「Now Available: Bare Metal Arm-Based EC2 Instances」。

相較於 x86 系列提供的 bare metal,ARM 這邊提供的機器不算大台,只有 16 核心與 32 GB 的記憶體,所以主要的用途應該是在於可以存取系統底層的資訊,或是因為軟體的授權不支援虛擬化而需要租用 bare metal:

  • need access to physical resources and low-level hardware features, such as performance counters, that are not always available or fully supported in virtualized environments,
  • are intended to run directly on the hardware, or licensed and supported for use in non-virtualized environments.

這次的 a1.metal 共八區可以用:

You can start using a1.metal instances today in US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Tokyo), Asia Pacific (Mumbai), and Asia Pacific (Sydney).

EC2 推出 18TB 與 24TB 的機器...

AWS 又把機器給生出來啦:「EC2 High Memory Update – New 18 TB and 24 TB Instances」。

一樣是限制要買三年 RI 才能用,不過價錢頁面上好像還在更新,在「Amazon EC2 Dedicated Hosts Pricing」只看到了之前就公佈的 12TB 價錢,還沒看到 18TB 與 24TB 的部份...

然後以前會跟同事說,資料小於這台機器記憶體大小的不能叫 big data (當時是 12TB),現在升級到 24TB 啦...

Google Chrome 對 CPU bug 的 patch

既然有方向了,後續應該會有人去找底層的問題...

先是在 Hacker News 上看到「Speculative fix to crashes from a CPU bug」這個猜測性的修正,這是因為他們發現在 IntelGemini Lake 低功耗晶片組上會發生很詭異的 crash:

For the last few months Chrome has been seeing many "impossible" crashes on Intel Gemini Lake, family 6 model 122 stepping 1 CPUs. These crashes only happen with 64-bit Chrome and only happen in the prologue of two functions. The crashes come and go across different Chrome versions.

然後依照 crash log 猜測跟 alignment 有關,所以決定用 gcc/clang 都有支援的 __attribute__ 強制設定 alignment 來避開,但看起來手上沒有可以重製的環境,所以只能先把實做丟上來...

IBM 把 OpenPOWER Foundation 交給 The Linux Foundation

標題雖然是「Big Blue Open Sources Power Chip Instruction Set」,但實質上應該就是 IBMOpenPOWER Foundation 交給 The Linux Foundation

找了一下兩邊的新聞稿,其中 The Linux Foundation 的新聞稿在「The Linux Foundation Announces New Open Hardware Technologies and Collaboration」這邊,但 OpenPOWER 的網站好像從 2018 年年底就沒更新了...

開放硬體最近比較紅的應該是 RISC-VOpenRISC 這些專案?IBM 這一招不知道是怎麼樣...

Apple 提供蝴蝶鍵盤免費維修 (全球性)

翻到文章的最後面可以看到「Information as of 2019-05-21」,不過剛剛才在 Hacker News 上看到這則消息:「Apple's service program for butterfly keyboard MacBooks, even out of warranty (support.apple.com)」,官方網站的說明在「Keyboard Service Program for MacBook, MacBook Air, and MacBook Pro」這邊:

Apple has determined that a small percentage of the keyboards in certain MacBook, MacBook Air, and MacBook Pro models may exhibit one or more of the following behaviors:

  • Letters or characters repeat unexpectedly
  • Letters or characters do not appear
  • Key(s) feel "sticky" or do not respond in a consistent manner

Apple or an Apple Authorized Service Provider will service eligible MacBook, MacBook Air, and MacBook Pro keyboards, free of charge. The type of service will be determined after the keyboard is examined and may involve the replacement of one or more keys or the whole keyboard.

機型從 MacBook (Retina, 12-­inch, Early 2015) 到最近的都有,可以從系統選單上面看到。時間上只要是售出四年內都包含在內,而且先前如果有因為鍵盤維修的也可以試著申請退費:

This worldwide Apple program does not extend the standard warranty coverage of your Mac notebook.

If you believe your Mac notebook was affected by this issue, and you paid to have your keyboard repaired, you can contact Apple about a refund.

The program covers eligible MacBook, MacBook Air, and MacBook Pro models for 4 years after the first retail sale of the unit.