Home » Computer » Archive by category "Programming" (Page 2)

基於 RNN 的無損壓縮

Hacker News 上看到「DeepZip: Lossless Compression using Recurrent Networks」這篇論文,利用 RNN 幫助壓縮技術壓的更小,而程式碼在 GitHubkedartatwawadi/NN_compression 上有公開讓大家可以測試。

裡面有個比較特別的是,Lagged Fibonacci PRNG 產生出來的資料居然有很好的壓縮率,這在傳統的壓縮方式應該都是幾乎沒有壓縮率...

整體的壓縮率都還不錯,不過比較的對象只有 gzip,沒有拿比較先進的壓縮軟體進行比較) 像是 xz 之類的),看數字猜測在一般的情況下應該不會贏太多,不過光是 PRNG 那部份,這篇論文等於是給了一個不同的方向讓大家玩...

Windows 10 將支援 AF_UNIX (Unix Socket)

在「Unix sockets come to Windows」這邊看到微軟的說明文「AF_UNIX comes to Windows」,Windows 10 將要引入 AF_UNIX 了:

Beginning in Insider Build 17063, you’ll be able to use the unix socket (AF_UNIX) address family on Windows to communicate between Win32 processes. Unix sockets allow inter-process communication (IPC) between processes on the same machine.

所以這讓跨 process 溝通的方式又多了一種,而 Unix 的程式如果要移植到 Windows 上,至少這塊有相容... (相容性與 bug 還不知道情況 XD)

用 Composer 的 require 限制,擋掉有安全漏洞的 library...

查資料的時候查到的,在 GitHub 上的 Roave/SecurityAdvisories 這個專案利用 Composerrequire 條件限制,擋掉有安全漏洞的 library:

This package ensures that your application doesn't have installed dependencies with known security vulnerabilities.

看一下 composer.json 就知道作法了,裡面的 description 也說明了這個專案的用法:

Prevents installation of composer packages with known security vulnerabilities: no API, simply require it

這方法頗不賴的 XDDD

兩個 gperf...

翻資料的時候覺得怎麼跟印象中的不太一樣,多花些時間翻了一下,發現原來有兩個東西同名...

一個是 GNUgperf,給定字串集合,產生 C 或 C++ 的 perfect hash function (i.e. no collision):

GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.

另外一個是 Google 弄出來的 gperftoolsmalloc() 的替代品以及效能分析工具:

gperftools is a collection of a high-performance multi-threaded malloc() implementation, plus some pretty nifty performance analysis tools.

CPU 成為現代網站的速度瓶頸

在「Tracking CPU with Long Tasks API」這邊提到的現象,雖然是在提新的 API,不過裡面提到了很重要的問題。

以前的網站因為 js 都沒有用的那麼多,所以主要的瓶頸在於網路速度。所以大家最佳化的方向都是往「如何讓傳輸量變小」的方式進行,像是各類 js 的 minify,甚至是對 Gzip 演算法的暴力改善 (維持相容的 Zopfli,以及新的 Brotli):

In the old days, delivering a fast user experience depended primarily on download speed. One reason why the network was the main bottleneck back then is that JavaScript and CSS weren’t used as much as they are now, so CPU wasn’t a critical factor.

而現代網站使用 js 的情況已經是來到了新的境界 (甚至很多網站是沒有 js 就不會動),於是對於 CPU 的能力就愈來愈要求:

According to the HTTP Archive, the top 1000 websites download five times more JavaScript today compared to seven years ago.

而手機也愈來愈普及,CPU 的能力相較起來就更嚴峻了...

新出的 RFC 8259:The JavaScript Object Notation (JSON) Data Interchange Format

JSON 的規格書又被更新了 XD

在「The Last JSON Spec」這邊,Tim Bray 寫了這篇關於新的 RFC 8259 跟之前的差異,以及大家對於雙重標準的顧慮。

最大的差異在於,在 RFC 8259 規定了「如果 JSON 被用在非封閉的系統交換資料,必須使用 UTF-8」:

8259 con­tains one new sen­tence: “JSON text ex­changed be­tween sys­tems that are not part of a closed ecosys­tem MUST be en­cod­ed us­ing UTF-8 [RFC3629].” Giv­en that, by 2017, an at­tempt to ex­change JSON en­cod­ed in any­thing but UTF-8 would be ir­ra­tional, this hard­ly needs say­ing; but its ab­sence felt like an omis­sion.

而關於 ECMA-404 與 RFC 8259 都定義了 JSON 的問題他也說明了,因為很多人花了很多力氣在確保這兩份文件的正確性上,所以應該不會有問題 (i.e. 衝突):

The rea­son 8259 ex­ists is that the ECMAScript gang went and wrote their own ex­treme­ly min­i­mal spec, Stan­dard ECMA-404: The JSON Da­ta In­ter­change Syn­tax, and there was rea­son for con­cern over du­el­ing stan­dard­s. But, af­ter a cer­tain amount of standards-org elephant-gavotte, each of ECMA 404 and RFC 8259 nor­ma­tive­ly ref­er­ences the oth­er and con­tains a com­mit­ment to keep them con­sis­tent in case any er­rors turn up. Which is a good thing, but this text has been re-examined and re-polished so many times that I doubt ei­ther side will ev­er re­vis­it the ter­ri­to­ry, thank good­ness.

另外他也提到了對於不同情境下可以看不同的文件。像是要了解 JSON 的話,可以看當初發明 JSON 的 Doug Crockford 所設立的網站 (在「JSON」這邊);而在交換時應該參考 I-JSON (Internet JSON,RFC 7493):

Which spec should you use? · If you want to un­der­stand JSON syn­tax, you still can’t beat Doug Crockford’s orig­i­nal for­mu­la­tion at JSON.org. If you want to use an RFC as foun­da­tion for a REST API or some oth­er In­ter­net pro­to­col, I ac­tu­al­ly don’t rec­om­mend 8259, I rec­om­mend I-JSON, RFC 7493, which de­scribes ex­act­ly the same syn­tax as all the oth­er specs (by ref­er­enc­ing 7159), but ex­plic­it­ly rules out some legal-but-dumbass things you could do that might break your pro­to­col, for ex­am­ple us­ing any­thing but UTF-8 or hav­ing du­pli­cate mem­ber names in your ob­ject­s.

I-JSON 是 JSON 的子集合,比較重要的:

  • (MUST) 使用 UTF-8。
  • (SHOULD NOT) 浮點數的部份,不得超過 IEEE 754-2008 binary64 (double precision) 的範圍。
  • (SHOULD NOT) 整數的部份,不得超過 [-(2**53)+1, (2**53)-1]) 的範圍。
  • (RECOMMEND) 有超過的需求使用字串表示。
  • (MUST NOT) JSON object 內不得有重複的 name。
  • (SHOULD NOT) 最上層的型態不得使用字串,只能使用 object 或是 array。
  • (MUST NOT) 遇到先前沒有定義過的元素不得視為錯誤。(像是新版 API 內會在 object 裡增加元素)
  • (RECOMMEND) 時間使用 ISO 8601 表示 (在 RFC 3339 有提到),英文字的部份全部使用大寫,一定要標上時區,而秒數的 0 一定要加上去 (也就是 00 秒)。
  • (RECOMMEND) 時間長度也建議依照 RFC 3339 處理。
  • (RECOMMEND) Binary 資料用 base64url 傳 (RFC 4648)。

微軟在考慮讓 Excel 支援 Python...

在「Excel team considering Python as scripting language: asking for feedback」這邊看到微軟正在考慮要不要讓 Excel 支援 Python,出自 UserVoice 上的:「How can we improve Excel for Windows (Desktop Application)?」。

比較感覺到有可能性應該是因為微軟做了一個問卷收集資訊:「Python and Excel」。

不過本來的功能就已經可以用到很出神入化了... XD (想到最近提到的「LINE 將內部的座位表由 Excel 改成 Web 界面...」)

Amazon Aurora (MySQL) 的 Stored Procedure 可以跑 AWS Lambda...

查了資料才發現去年十月 Amazon Aurora (MySQL-Compatible Edition) 就支援用 AWS Lambda 當 stored procedure 了,只是當時只支援 async mode,能做的事情比較有限:「Amazon Aurora New Features: AWS Lambda Integration and Data Load from Amazon S3 to Aurora Tables」。

Now you can invoke Lambda functions directly from within an Aurora database via stored procedures or user-defined functions. Lambda integration allows you to extend the capabilities of the database and invoke external applications to act upon data changes. For example, you can create a Lambda function that sends emails to customers whenever their address in the database is updated.

前幾天發表的則是支援 sync mode,可以等到:「Amazon Aurora with MySQL Compatibility Natively Supports Synchronous Invocation of AWS Lambda Functions」。

Starting with version 1.16, we are extending this feature to be able to able to synchronously invoke Lambda functions.

Use the native function lambda_sync when you must know the result of the execution before moving on to another action.

這解掉了 MySQL 的 stored procedure 一直很殘的問題...

AWS CodeBuild 也可以產生 Badge 給網頁用 (像是掛在 GitHub 的 README 裡)

在「Build Badges Sample with AWS CodeBuild」這邊看到 AWS CodeBuild 支援 Badge 的方法。最常用的 GitHub 是給 AWS CodeBuild 授權後取得:

If you chose GitHub, follow the instructions to connect (or reconnect) with GitHub. On the GitHub Authorize application page, for Organization access, choose Request access next to each repository you want AWS CodeBuild to be able to access. After you choose Authorize application, back in the AWS CodeBuild console, for Repository, choose the name of the repository that contains the source code. Select the Build Badge check box to make your project's build status visible and embeddable.

有四種狀態:

  • PASSING The most recent build on the given branch passed.
  • FAILING The most recent build on the given branch timed out, failed, faulted, or was stopped.
  • IN_PROGRESS The most recent build on the given branch is in progress.
  • UNKNOWN The project has not yet run a build for the given branch or at all. Also, the build badges feature might
    have been disabled.

不過我還是偏愛 3rd party 的組合,不是很愛用 AWS CodeXXX 系列的服務就是了... 唯一一個用的是 AWS CodeCommit 因為有永久的免費額度可以用。

Googlebot 的 Web rendering service 的細節

在「Polymer 2 and Googlebot」這邊文章裡面才看到 Google 官方在今年八月就有公開 Googlebot 所使用的 Web rendering service (WRS) 的細節:「Rendering on Google Search」。可以想像到是基於 Google Chrome 的修改:

Googlebot uses a web rendering service (WRS) that is based on Chrome 41 (M41). Generally, WRS supports the same web platform features and capabilities that the Chrome version it uses — for a full list refer to chromestatus.com, or use the compare function on caniuse.com.

裡面提到一些值得注意的事情,像是不支援 WebSocket,所以對於考慮 Google 搜尋結果的頁面來說,就要注意錯誤處理了...

Archives