Running an ML extension directly on PostgreSQL

I came across "Show HN: PostgresML, now with analytics and project management (postgresml.org)" on the Hacker News front page. The project, "PostgresML - an end-to-end machine learning solution", is a PostgreSQL extension that runs ML algorithms directly inside the database. Judging from the GitHub repository, most of the code is Python.

The GitHub page shows that the project is still at a fairly early stage:

This project is currently a proof of concept. Some important features, which we are currently thinking about or working on, are listed below.

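If you want to use it today, it is probably most useful for exploring data. One setup that comes to mind is to attach a replica and run the queries there, so production database performance is not affected. That should be workable...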

I also looked at the supported algorithms. They are mostly classic ML algorithms, taken straight from existing Python packages: XGBoost and scikit-learn.

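These algorithms are already quite useful, and having them inside PostgreSQL makes them much more convenient to use (no need to export the data first, though you do have to be careful with dirty data). The project also ships a UI for viewing some of the data, but my guess is that dedicated visualization tools will still offer more.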

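Another thought: it might work well for teaching? An instructor could use it to demonstrate algorithms in class without having to write a lot of code from scratch...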

Amazon EC2 launches its first Bare Metal instance

Amazon EC2 now rents out whole machines: "Amazon EC2 Bare Metal Instances with Direct Access to Hardware".

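What would be a good Chinese translation of "Bare Metal"? I know it means a host with the virtualization layer removed... 裸奔機 ("streaking machine")?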

We knew that other customers also had interesting use cases for bare metal hardware and didn’t want to take the performance hit of nested virtualization. They wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.

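Machines like this exist to squeeze every bit of performance out of the whole box, so AWS is not offering a small one. This launch is in the i3 family, and the instance is called i3.metal: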

Today we are launching a public preview the i3.metal instance, the first in a series of EC2 instances that offer the best of both worlds, allowing the operating system to run directly on the underlying hardware while still providing access to all of the benefits of the cloud. The instance gives you direct access to the processor and other hardware, and has the following specifications:

Processing – Two Intel Xeon E5-2686 v4 processors running at 2.3 GHz, with a total of 36 hyperthreaded cores (72 logical processors).
Memory – 512 GiB.
Storage – 15.2 terabytes of local, SSD-based NVMe storage.
Network – 25 Gbps of ENA-based enhanced networking.

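After ten years they have finally gotten here... Though they probably spent a lot of that time solving security problems, such as network isolation and preventing customers from reflashing the firmware XD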

Data structure and algorithm questions in interviews

In "500 Data structures and algorithms interview questions and their solutions" I found a set of questions (with solutions) compiled on Quora.

Each question also has a comment section below it, so the content should get richer over time?

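Another nice thing is how the questions are categorized. For example, if you want to practice Backtracking, you can pull up the questions in that category and work through them.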
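For a sense of what a Backtracking exercise looks like, here is a minimal sketch in Python. The N-Queens problem is my own pick as a classic example and is not taken from the linked list; the function name `count_n_queens` is hypothetical.

```python
def count_n_queens(n: int) -> int:
    """Count the ways to place n queens on an n x n board so none attack each other."""
    used_cols = set()    # columns that already hold a queen
    used_diag = set()    # "\" diagonals, identified by row - col
    used_anti = set()    # "/" diagonals, identified by row + col

    def place(row: int) -> int:
        # Every row has a queen: one complete, valid placement found.
        if row == n:
            return 1
        total = 0
        for col in range(n):
            if col in used_cols or (row - col) in used_diag or (row + col) in used_anti:
                continue  # this square is attacked, try the next column
            # Choose: put a queen on (row, col).
            used_cols.add(col)
            used_diag.add(row - col)
            used_anti.add(row + col)
            # Explore: continue with the next row.
            total += place(row + 1)
            # Un-choose: remove the queen so other columns can be tried.
            used_cols.remove(col)
            used_diag.remove(row - col)
            used_anti.remove(row + col)
        return total

    return place(0)


if __name__ == "__main__":
    print(count_n_queens(8))
```

The choose / explore / un-choose pattern in the loop is the part that carries over to most other problems in this category.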