NIST P-curve 的 Seed Bounty Program

Filippo Valsorda 發起了 seed bounty program,針對 NIST P-curve 裡 seed 的部分尋找 SHA-1 的 pre-image:「Announcing the $12k NIST Elliptic Curves Seeds Bounty」。

先講一下這次的 bounty program,希望找出下面這些 SHA-1 的 pre-image input (也就是找出 input,使得 SHA1(input) 會等於下面的東西):

3045AE6FC8422F64ED579528D38120EAE12196D5
BD71344799D5C7FCDC45B59FA3B9AB8F6A948BC5
C49D360886E704936A6678E1139D26B7819F7E90
A335926AA319A27A1D00896A6773A4827ACDAC73
D09E8800291CB85396CC6717393284AAA0DA64BA

金額是 US$12288,但是要五個都找到。

話說在寫這篇時,查資料發現 P-384 有獨立條目,但 P-256P-521 都是重導指到 Elliptic-curve cryptography 這個條目,但 P-384 看起來也沒什麼特別的,不知道當初編輯的人是怎麼想的...

回來原來的問題,要從一些背景開始講,橢圓曲線的表示法有多種,像是:

y^2 = x^3 + ax + b (Weierstrass form) y^2 = x^3 + ax^2 + bx (Montgomery form)

而這些常數 ab 的選擇會影響到計算速度,所以通常會挑過,但畢竟是密碼學用的東西,挑的過程如果都不解釋的話,會讓人懷疑是不是挑一個有後門的數字,尤其 NIST (NSA) 後來被證實在 Dual_EC_DRBG 裡面埋後門的醜聞,大家對於 NIST 選擇或是設計的密碼系統都有很多疑慮。

舉個例子來說,2005 年時 djb 發明了 Curve25519 (論文「Curve25519: new Diffie-Hellman speed records」則是記錄 2006),選擇的橢圓曲線是:

y^2 = x^3 + 486662x^2 + x

他就有提到這邊的 486662 是怎麼來的:他先在前一個段落說明,這邊數字如果挑的不好的話,會有哪些攻擊可以用,接下來把最小的三個值列出來,然後說明原因:

To protect against various attacks discussed in Section 3, I rejected choices of A whose curve and twist orders were not {4 · prime, 8 · prime}; here 4, 8 are minimal since p ∈ 1+4Z. The smallest positive choices for A are 358990, 464586, and 486662. I rejected A = 358990 because one of its primes is slightly smaller than 2^252, raising the question of how standards and implementations should handle the theoretical possibility of a user’s secret key matching the prime; discussing this question is more difficult than switching to another A. I rejected 464586 for the same reason. So I ended up with A = 486662.

而 P-192、P-224、P-256、P-384 與 P-521 的值都很怪,這是十六進位的值,在正式的文件或是正式的說明上都沒有解釋,屬於「magic number」:

3045AE6FC8422F64ED579528D38120EAE12196D5 # NIST P-192, ANSI prime192v1
BD71344799D5C7FCDC45B59FA3B9AB8F6A948BC5 # NIST P-224
C49D360886E704936A6678E1139D26B7819F7E90 # NIST P-256, ANSI prime256v1
A335926AA319A27A1D00896A6773A4827ACDAC73 # NIST P-384
D09E8800291CB85396CC6717393284AAA0DA64BA # NIST P-521

依照 Steve Weis 說,這些值當初是 Jerry Solinas 是隨便抓個字串,再用 SHA-1 生出來的:

Apparently, they were provided by the NSA, and generated by Jerry Solinas in 1997. He allegedly generated them by hashing, presumably with SHA-1, some English sentences that he later forgot.

這是 Steve Weis 的敘述,出自「How were the NIST ECDSA curve parameters generated?」:

[Jerry] told me that he used a seed that was something like:
SEED = SHA1("Jerry deserves a raise.")
After he did the work, his machine was replaced or upgraded, and the actual phrase that he used was lost. When the controversy first came up, Jerry tried every phrase that he could think of that was similar to this, but none matched.

如果可以證實當初的字串,那麼 NIST 在裡面埋後門的疑慮會再降低一些,這就是這次發起 bounty program 的原因。

Perl 的 Regular Expression 的強度:NP-complete

這篇稍微偏 CS 理論一些...

以前在學校學 Formal language 的時候會帶出 Grammer、Language、Automaton 三個項目,就像是維基百科上的條列:

裡面可以看到經典的 Regular expression 會被分到 RG/RL/FSM 這三塊。

前幾天看到 gugod 寫的「[Perl] 以正規表示式來定義文法規則」這篇,裡面試著用 Perlregular expression (perlre) 建構「遞歸下降解析器」 (Recursive descent parser)。

Recursive descent parser 可以當作是 CFG 的子集合,而 CFG 對應到的語言是 CFL,另外他對應到的自動機是 PDA

我們已經知道 perlre 因為支援一堆奇怪的東西 (像是 backreference 或是 recursive pattern),所以他能接受的 language 已經超過 RL,但我很好奇他能夠做到什麼程度。

用搜尋引擎翻了翻,查到對 PCRE 的分析 (這是一套與 Perl regular expression 語法相容的 library):「Which languages do Perl-compatible regular expressions recognize?」。

在裡面有人提到「The true power of regular expressions」這篇文章,裡面給了一個在 PTIME 演算法,將 3SAT 轉換到 PCRE 裡解,這證明了 PCRE 是 NP-hard;另外也很容易確認 PCRE 是 NP,所以就達成了 NP-complete 的條件了...

本來一直以為 PCRE 只是 CFG/CFL/PDA 而已,沒想到這麼強,NPC 表示大多數現有的演算法都可以轉成 PCRE 形式放進去跑... XD