Common Media Client Data (CMCD)

Amazon CloudFront 的公告看到的東西:「Amazon CloudFront now supports Common Media Client Data (CMCD) fields in real-time logs」。

CMCD (Common Media Client Data) 是一個私有的 log format,可以讓 client 透過 HTTP(S) header 回報狀態,所以 AWS 這邊有提到你可以設定讓 CloudFront 記錄下來,然後透過其他軟體分析:

Previously, CloudFront logged CMCD parameters as part of the full query string log field, or the HTTP headers log field. Now, you can simply select to include specific CMCD parameters in your real-time logs and save on compute needed to search and extract the CMCD key-value pairs, and reduce the data set for your log analysis.

現在 CloudFront 則是支援判讀做一些處理:

Starting today, you can enable Common Media Client Data (CMCD) fields in your CloudFront real-time logs. You can select key client-side performance parameters and CloudFront delivery performance parameters in the same log record. This can help you correlate variations in Quality of Experience (QoE) for your viewers to CloudFront performance at the granularity of single viewer sessions, simplifying the troubleshooting of QoE issues that impact your viewers engagement.

看了一下就是定義一些欄位,用 JSON 包起來後讓 client 打回伺服器端,可以利用這些資訊動態從伺服器端調整策略,而不像以前主要的調整策略都放在 client 端。

Mozilla 實做百度發表的 Speech-To-Text 引擎 Deep Speech

Hacker News 上看到 MozillaGitHub 上的 mozilla/DeepSpeech 這個專案,用 TensorFlow 實做了百度的「Deep Speech: Scaling up end-to-end speech recognition」論文:

A TensorFlow implementation of Baidu's DeepSpeech architecture

語音轉文字的方案,Mozilla 開專案實做出來了...

這程式碼需要安裝 Git Large File Storage 才能完整下載包含訓練資料的部份:

Manually install Git Large File Storage, then clone the repository normally:
git clone https://github.com/mozilla/DeepSpeech

而目前已經有的資料來自於 Mozilla 另外一個專案「Common Voice」:

The Common Voice project is Mozilla's initiative to help teach machines how real people speak.

Common Voice 這個專案目前只有英文,網頁上就可以參與 validation 過程...

Mac 上的 Cleartext

看到 Mac 上的「Cleartext」這個軟體:

A text editor that only allows the 1,000 most common words in English

限制你使用比較簡單的英文,這樣可以讓讀的人比較容易了解 (尤其是非母語的人)。

有種跟 Simple English Wikipedia 的想法很像的感覺:

The project uses around 2,000 common English words, and is based on Basic English, an 850-word auxiliary international language created by Charles Kay Ogden in the 1920s.

另外還有提供 Trump mode XDDD:

Trained with a few of Trump's best known speeches, the app is now ready to help you write like a billionaire.

這好壞 XDDD