JSON 的 Object 裡 Key 重複的問題

tl;dr:不要亂來啦... 這是 UB (Undefined behavior) 的一種。

因為看到這則 tweet,所以去查一下 JSON 的資料:

首先是找標準是什麼。在維基百科的 JSON 條目裡提到了有兩份標準,一份是 RFC 7159,一份是 ECMA-404

Douglas Crockford originally specified the JSON format in the early 2000s; two competing standards, RFC 7159 and ECMA-404, defined it in 2013. The ECMA standard describes only the allowed syntax, whereas the RFC covers some security and interoperability considerations.

ECMA-404 裡面就真的只講語法沒講其他東西,而在 RFC 7159 內的 Object 則是有提到 (重點我就用粗體標起來了):

An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object SHOULD be unique.

   object = begin-object [ member *( value-separator member ) ]
            end-object

   member = string name-separator value

An object whose names are all unique is interoperable in the sense that all software implementations receiving that object will agree on the name-value mappings. When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.

JSON parsing libraries have been observed to differ as to whether or not they make the ordering of object members visible to calling software. Implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences.

粗體有描述唯一性,但尷尬的地方在於他用 SHOULD 而非 MUST,所以 library 理論上都要能接受。但後面提到如果不唯一時,行為無法預測 (會到 rm -rf / 嗎?XDDD 最像的應該還是 crash?),所以還是不要亂來啦...

不過如果真的會 crash 的話,應該也會因為 DoS issue 而被發 CVE,所以實務上應該是不會 crash 啦...

PChome 修正了問題,以及 RFC 4074 的說明

早些時候測試發現 PChome 已經修正了之前提到的問題:「PChome 24h 連線會慢的原因...」、「PChome 24h 連線會慢的原因... (續篇)」,這邊除了整理一下以外,也要修正之前文章裡的錯誤。

在 RFC 4074 (Common Misbehavior Against DNS Queries for IPv6 Addresses) 裡面提到了當你只有 IPv4 address 時,DNS server 要怎麼回應的問題。

在「3. Expected Behavior」說明了正確的作法,當只有 A RR 沒有 AAAA RR 的時候,應該要傳回 NOERROR,而 answer section 裡面不要放東西:

Suppose that an authoritative server has an A RR but has no AAAA RR for a host name. Then, the server should return a response to a query for an AAAA RR of the name with the response code (RCODE) being 0 (indicating no error) and with an empty answer section (see Sections 4.3.2 and 6.2.4 of [1]). Such a response indicates that there is at least one RR of a different type than AAAA for the queried name, and the stub resolver can then look for A RRs.

在「4.2. Return "Name Error"」裡提到,如果傳回 NXDOMAIN (3),表示查詢的這個名稱完全沒有 RR,而不僅僅限於 AAAA record,這就是我犯的錯誤 (在前面的文章建議傳回 NXDOMAIN):

This type of server returns a response with RCODE 3 ("Name Error") to a query for an AAAA RR, indicating that it does not have any RRs of any type for the queried name.

With this response, the stub resolver may immediately give up and never fall back. Even if the resolver retries with a query for an A RR, the negative response for the name has been cached in the caching server, and the caching server will simply return the negative response. As a result, the stub resolver considers this to be a fatal error in name resolution.

Several examples of this behavior are known to the authors. As of this writing, all have been fixed.

PChome 這次的修正回應了正確的值 (而不是我提到的 NXDOMAIN):

$ dig shopping.gs1.pchome.com.tw aaaa @ns1.gs1.pchome.com.tw

; <<>> DiG 9.9.5-3ubuntu0.16-Ubuntu <<>> shopping.gs1.pchome.com.tw aaaa @ns1.gs1.pchome.com.tw
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<<<- opcode: QUERY, status: NOERROR, id: 40767
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;shopping.gs1.pchome.com.tw.    IN      AAAA

;; AUTHORITY SECTION:
gs1.pchome.com.tw.      5       IN      SOA     ns1.gs1.pchome.com.tw. root.dns.pchome.com.tw. 20171123 3600 3 3600 5

;; Query time: 16 msec
;; SERVER: 210.242.216.91#53(210.242.216.91)
;; WHEN: Fri Nov 24 01:44:52 CST 2017
;; MSG SIZE  rcvd: 134

另外 RFC 也有一些其他的文件可以參考,像是 RFC 2308 (Negative Caching of DNS Queries (DNS NCACHE))、RFC 4697 (Observed DNS Resolution Misbehavior) 以及 RFC 8020 (NXDOMAIN: There Really Is Nothing Underneath),這些文件描述了蠻多常見的問題以及正確的處理方法,讀完對於現在愈來愈複雜的 DNS 架構有不少幫助。

Mozilla 的提案「HTTP Immutable Responses」

狀態已經是 Category: Standards Track 了,RFC 8246 的「HTTP Immutable Responses」:

The immutable HTTP response Cache-Control extension allows servers to identify resources that will not be updated during their freshness lifetime. This ensures that a client never needs to revalidate a cached fresh resource to be certain it has not been modified.

Cache-Control 介紹了 immutable,像是這樣:

Cache-Control: max-age=31536000, immutable

依照 MDN 上的資料 (Cache-Control - HTTP | MDN),目前只有 EdgeFirefox 支援,不過既然成為標準了,後續其他瀏覽器應該都會支援 (吧):

AES-GCM-SIV

在「AES-GCM-SIV: Specification and Analysis」這邊看到 AES-GCM-SIV 的作者自己投稿上去的資料,是個已經被放進 BoringSSL 並且在 QUIC 上使用的演算法:

We remark that AES-GCM-SIV is already integrated into Google's BoringSSL library \cite{BoringSSL}, and its deployment for ticket encryption in QUIC \cite{QUIC} is underway.

在 RFC 上的說明解釋了這個演算法的目的是希望當 nonce 沒有被正確實作時仍然可以有比 AES-GCM 強的保護:

This memo specifies two authenticated encryption algorithms that are nonce misuse-resistant - that is that they do not fail catastrophically if a nonce is repeated.

在 128 bits 的情況下,加密的速度大約是 AES-GCM 的 2/3 (在都有硬體加速的情況下),但解密的速度則與 AES-GCM 相當:

For encryption, it is slower than AES-GCM, because achieving nonce-misuse resistance requires, by definition, two (serialized) passes over the data. Nevertheless, optimized implementations run GCM-SIV (for 128-bit keys) at less than one cycle per byte on modern processors (roughly 2/3 of the speed of nonce-respecting AES-GCM). On the other hand, GCM-SIV decryption runs at almost the same speed as AES-GCM.

不過這就是 trade-off 了,如果 nonce 有正確被實作的話,其實不需要這個...

舊 bug 新名字:httpoxy

依照慣例,security issue 都會取個名字,這次叫做 httpoxy:「A CGI application vulnerability for PHP, Go, Python and others」。

事情發生在兩個命名變數上的衝突:

  • RFC 3875 (The Common Gateway Interface (CGI) Version 1.1) 定義了 CGI 環境會把 Header 裡的 Proxy 欄位放到環境變數裡的 HTTP_PROXY
  • 而很多程式會拿環境變數裡的 HTTP_PROXY 當作 proxy 設定。

這件事情 2001 年在 libwww-perl 就有發生過 (並且修正),curl 也發生過 (然後修正),2012 年在 Ruby 的 Net::HTTP 也發生過 (也修正了)。

然後在 2016 年還是被發現有很多應用程式會中獎... 這頭好痛啊 :o

RFC 7763:text/markdown

Markdown 的 RFC:「The text/markdown Media Type」。

This document registers the text/markdown media type for use with Markdown, a family of plain-text formatting syntaxes that optionally can be converted to formal markup languages such as HTML.

雖然是 Category: Informational,但有個標準後是不是有機會在瀏覽器裡面原生支援?

GitHub 支援 HTTP Code 451 了...

GitHub 宣佈支援 HTTP Code 451 了:「The 451 status code is now supported」。也就是 RFC 7725 的「An HTTP Status Code to Report Legal Obstacles」。

目前會把因為 DMCA takedown notice 下架的內容以 HTTP Code 451 標出:

The GitHub API will now respond with a 451 status code for resources it has been asked to take down due to a DMCA notice.

HTTP Code 451 的點子出自「華氏 451 度」這本書,表示紙的燃點。

奇怪的 RFC:Naming Things with Hashes

看到「RFC 6920: Naming Things with Hashes」這個,看日期是 April 2013,就在想是不是四月一號發的... 但內容看起來還頗有用的,有種 distributed web 的味道?文件裡給的範例長這樣:

<html>
 <head>
   <title>ni: relative URI test</title>
   <base href="ni://example.com">
 </head>
 <body>
   <p>Please check <a href="sha-256;f4OxZX...">this document</a>.
     and <a href="sha-256;UyaQV...">this other document</a>.
     and <a href="sha-256-128;...">this third document</a>.
   </p>
 </body>
</html>

目前是 Propsed Standard,所以是怎樣呢...

HTTP Status Code 451

前陣子送出的 HTTP Status Code 451 要通過成為標準了:「Why 451?」。

Today, the IESG approved publication of "An HTTP Status Code to Report Legal Obstacles". It'll be an RFC after some work by the RFC Editor and a few more process bits, but effectively you can start using it now.

取自「華氏451度」這部講出版物言論自由的作品 (紙的燃點是華氏 451 度),在 Internet 時代,451 剛好在 HTTP Status Code 4xx 的範圍,被拿來用做「因法令限制而服法提供內容的 Status Code」。

在文件開頭說明了這個代碼的用途:

This document specifies a Hypertext Transfer Protocol (HTTP) status code for use when resource access is denied as a consequence of legal demands.

RFC7686:保留 .onion 給 Tor 的 Hidden Services 使用

看到 Tor Project 很高興的宣佈 .onion 這個 TLD 在 RFC 7686 成為 Standards Track:「Landmark for Hidden Services: .onion names reserved by the IETF」。

而且也因為成為 IETF 的標準,在 CA/Browser Forum 上更有依據討論在上面的 CA 架構:

With this registration, it is should also be possible to buy Extended Validation (EV) SSL/TLS certificates for .onion services thanks to a recent decision by the Certification Authority Browser Forum.