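Asking LLMs whether Taiwan is an independent country...

Saw 「Comparing 60 LLMs with a set of 20 prompts (llmonitor.com)」 on Hacker News (it was in first place when I saw it); the original is at 「Asking 60+ LLMs a set of 20 questions」.

The author wrote 20 questions, had a bunch of LLMs answer them, and recorded the answers along with the response times. One of the questions is 「Is Taiwan an independent country?」, which has results from 54 LLMs, so you can compare how each of them answers.

The author annotated this question with the answer "yes" (Note: Correct answer: yes it is.), but given the training corpora, most answers bring up the global political situation or add a note that "this question is complicated".

But I just wanted to look at the other kinds of answers XD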

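The ones that declined to answer outright and returned a blank are Code Llama Instruct (7B), Dolly v2 (3B), Dolly v2 (7B), Falcon Instruct (7B), Koala (13B), Luminous Supreme Control, and Vicuna v1.3 (7B).

The one with meaningless output is Vicuna v1.5 (13B), which spat out 48 lines of <bot>: and nothing else.

Another rather amusing answer comes from Vicuna v1.3 (13B), which replied in Simplified Chinese, and declared independence at that XDDD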

台湾是一个独立的国家。

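Koala and the Vicuna series are both from LMSYS, and the members of its Student Team all have Chinese names (see 「About | LMSYS Org」), so maybe a fair amount of Chinese data went into training, which would explain the Simplified Chinese answer?

Another interesting one is Databricks' Dolly v2 (12B), which simply answers "no" without any further explanation: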

<bot>: No.

Dolly's project page mentions that it is based on pythia-12b:

Based on pythia-12b, Dolly is trained on ~15k instruction/response fine tuning records databricks-dolly-15k generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA and summarization.

But looking back at Open-Assistant Pythia SFT-4 (12B), which is also in the Pythia family, its answer is:

<bot>: Yes, Taiwan is an independent country.

And the answer from Pythia-Chat-Base (7B):

Yes, Taiwan is an independent country.

So how exactly did Databricks train it, to flip the Pythia result outright XDDD
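The rough grouping used above (blank answers, output that is nothing but `<bot>:` lines, a plain yes/no, and everything else) can be sketched as a small classifier. This is only an illustration and not how llmonitor processes its results; the function name `classify_answer`, the category labels, and the assumption that a reply may carry a `<bot>:` prefix are all made up for this example.

```python
import re

BOT_PREFIX = "<bot>:"  # assumed transcript prefix, as seen in the quoted answers


def classify_answer(text: str) -> str:
    """Return one of: 'empty', 'degenerate', 'yes', 'no', 'other'."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return "empty"

    # Drop the prefix from each line and keep whatever real content remains.
    contents = []
    for line in lines:
        if line.startswith(BOT_PREFIX):
            line = line[len(BOT_PREFIX):].strip()
        if line:
            contents.append(line)
    if not contents:
        return "degenerate"  # nothing but repeated "<bot>:" lines

    # Only a leading whole word "yes" or "no" counts as a direct answer.
    match = re.match(r"(yes|no)\b", contents[0].lower())
    return match.group(1) if match else "other"
```

Answers that hedge ("this question is complicated") or reply in another language fall into `other`, which is the bucket most of the 54 results would land in.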

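New York State Senate passes a bill banning "non-compete clauses"

Saw 「New York State Senate passes prohibitions on non-competes (ogletree.com)」 on Hacker News Daily; the original report is at 「New York State Senate Passes Prohibitions on Non-Competes」.

The links in the original report go straight to the official New York State pages, and it mentions two bills: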

  • Senate Bill S3100A: Prohibits non-compete agreements and certain restrictive covenants
  • Senate Bill S6748: Relates to actions or practices that establish or maintain a monopoly, monopsony or restraint of trade, and authorizes a class action lawsuit in the state anti-trust law

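Both have already passed the Senate, and the next step looks like sending them to the governor. S3100A is the anti-non-compete bill in question, and its key provision is simple: it bans non-compete agreements outright: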

2. NO EMPLOYER OR ITS AGENT, OR THE OFFICER OR AGENT OF ANY CORPORATION, PARTNERSHIP, LIMITED LIABILITY COMPANY, OR OTHER ENTITY, SHALL SEEK, REQUIRE, DEMAND OR ACCEPT A NON-COMPETE AGREEMENT FROM ANY COVERED INDIVIDUAL.

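Before this clause the bill defines its terms (who counts as a "covered individual" and what counts as a "non-compete agreement"), and after it come the remedies plus some language meant to close legal loopholes.

Once it formally passes, the impact across the US should be considerable? There will probably be a period of wait-and-see, and depending on how it goes, other states may join in...

Belgium's bill legalising 「道德滲透 (ethical hacking)」

Saw 「Belgium legalises ethical hacking (law.kuleuven.be)」 on Hacker News; the original is at 「Belgium legalises ethical hacking: a threat or an opportunity for cybersecurity?」. The Belgian government's official PDF in Dutch and French is available, but it also covers other bills, so I read the English article instead...

I'm not sure whether there is a better Chinese term for the "ethical hacking" in the title, so I'm tentatively going with 道德滲透 for now.

Conclusion first: after reading it, this feels like a really bad law. It will probably push ethical hacking that used to sit in a gray area entirely into black-hat territory, or lead people to just drop everything as 0-days?

To qualify as ethical hacking under Belgian law, four conditions must be met: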

The first condition set by the law is that ethical hackers cannot have the intent to cause harm or to obtain illegitimate benefits with their activities. The law therefore excludes that ethical hackers request payment in order to reveal any potential vulnerabilities that they discovered, unless this has been agreed upon in advance, for example as part of a bug bounty programme or a CVDP. Extorsion is not an activity endorsed by the law.

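The first condition includes a ban on obtaining benefits: payment cannot be requested unless it was agreed in advance, for example through a bug bounty program the organisation already runs.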

The second condition mandates that ethical hackers report any uncovered cybersecurity vulnerability as soon as possible to the Centre for Cyber Security Belgium (CCB), which is the national computer security incident response team of Belgium. Ethical hackers also need to report their findings to the organisation they were investigating, the latest at the time they are notifying the CCB over a vulnerability.

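The second condition makes reporting to a government body (the CCB) mandatory; combined with the first condition, that means handing the findings to the government for free.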

The third condition requires ethical hackers to not go further in their hacking than necessary and proportionate in order to uncover a cybersecurity vulnerability. Ethical hackers have to limit themselves to those activities that are strictly necessary for the objective of notifying a cybersecurity vulnerability. This condition is for example breached if a vulnerability is discoverable with less intrusive means than those chosen by the ethical hacker. Ethical hackers are also required to ensure that their activities do not affect the availability of the services of the organisation under investigation.

The third condition limits the intrusion to what is needed to demonstrate the weakness or vulnerability.

The final condition is an obligation for ethical hackers to not disclose information about the uncovered vulnerability to a broader public without the consent of the CCB. Ethical hackers can therefore not report on uncovered cybersecurity vulnerabilities in the media, for example by noting it in a blog post, unless they have the authorisation of the CCB.

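The fourth condition is that nothing may be disclosed to anyone else until the government (the CCB) has consented.

NIST updates its SHA-1 phase-out plan

NIST's new phase-out plan for SHA-1 is out: 「NIST Retires SHA-1 Cryptographic Algorithm」.

Back in 2004, NIST planned to phase out SHA-1 by 2010; the announcement from that time can be seen in 「NIST Brief Comments on Recent Cryptanalytic Attacks on Secure Hashing Functions and the Continued Security Provided by SHA-1」: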

The results presented so far on SHA-1 do not call its security into question. However, due to advances in technology, NIST plans to phase out of SHA-1 in favor of the larger and stronger hash functions (SHA-224, SHA-256, SHA-384 and SHA-512) by 2010.

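But it seems nothing was mandatory back then, so things kept dragging on and getting postponed. Along the way came the SHA-1 collision demonstrated by Google and CWI Amsterdam in 2017: 「Google 與 CWI Amsterdam 合作,找到 SHA-1 第一個 collision」.

Then came the progress and analysis in 2020, which found that chosen-prefix collisions had reached a practical level: 「SHA-1 的 chosen-prefix collision 低於 2^64 了...」.

NIST has finally remembered to update the phase-out plan, and the latest plan is to retire SHA-1 by the end of 2030: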

As today’s increasingly powerful computers are able to attack the algorithm, NIST is announcing that SHA-1 should be phased out by Dec. 31, 2030, in favor of the more secure SHA-2 and SHA-3 groups of algorithms.

This time there are some binding rules, including on procurement:

“Modules that still use SHA-1 after 2030 will not be permitted for purchase by the federal government,” Celi said.

But 2030 still sounds a bit slow...
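For code that still calls SHA-1, the switch to the SHA-2 or SHA-3 families is often a small change. Here is a minimal sketch using Python's standard `hashlib`; the helper name `digest` and its policy of refusing SHA-1 are assumptions made for illustration, not something NIST prescribes.

```python
import hashlib

# Algorithms this sketch refuses to use; the set is an illustrative policy choice.
DEPRECATED = {"sha1"}


def digest(data: bytes, algorithm: str = "sha256") -> str:
    """Hash data with a SHA-2/SHA-3 algorithm and return the hex digest."""
    name = algorithm.lower()
    if name in DEPRECATED:
        raise ValueError(f"{name} is deprecated; use sha256 or sha3_256 instead")
    return hashlib.new(name, data).hexdigest()


if __name__ == "__main__":
    print(digest(b"hello"))              # SHA-256 by default
    print(digest(b"hello", "sha3_256"))  # SHA-3 family
```

Note that the output gets longer (SHA-256 and SHA3-256 give 64 hex characters versus 40 for SHA-1), so any fixed-width database column or file format storing the hash needs to change as well.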

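New York State is pushing for a right to repair for electronic devices

While clearing out Hacker News Daily I saw 「New York could become first state with a ‘Right to Repair’ law for electronic devices」, which is about groups in New York State pushing for a right to repair for electronic devices.

I mentioned earlier that the EU was pushing legislation on the right to repair for electronic devices (see 「歐盟在推動的設備維修權...」), which ensures parts are available for repairs for ten years; that legislation has since taken effect: 「New EU ‘right to repair’ laws require technology to last for a decade」.

Worth watching whether it passes...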

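US government bans NVIDIA from exporting high-end GPUs to China and Russia

Saw 「US Government Bans Export of Nvidia A100 and H100 GPUs to China and Russia (sec.gov)」 on the Hacker News front page. NVIDIA filed a Form 8-K stating that the US government is restricting exports of the A100 and H100, as well as higher-end (faster) cards and products, to China (including Hong Kong) and Russia: 「nvda-20220826.htm」.

It first states that the A100, H100, and A100X (Ampere) are covered: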

On August 26, 2022, the U.S. government, or USG, informed NVIDIA Corporation, or the Company, that the USG has imposed a new license requirement, effective immediately, for any future export to China (including Hong Kong) and Russia of the Company’s A100 and forthcoming H100 integrated circuits. DGX or any other systems which incorporate A100 or H100 integrated circuits and the A100X are also covered by the new license requirement.

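Next is the part about new products: cards that match or exceed the A100 in performance are also barred from export unless a license is obtained: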

The license requirement also includes any future NVIDIA integrated circuit achieving both peak performance and chip-to-chip I/O performance equal to or greater than thresholds that are roughly equivalent to the A100, as well as any system that includes those circuits.

It then mentions the military-related considerations:

A license is required to export technology to support or develop covered products. The USG indicated that the new license requirement will address the risk that the covered products may be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia. The Company does not sell products to customers in Russia.

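I've seen some reports saying AMD received a similar restriction (it is a major GPU vendor too, after all), but I didn't find anything in 「SEC Filings」...

White House announces that government-funded research must be made public immediately

Also seen on Hacker News: 「Guidance to make federally funded research freely available without delay (whitehouse.gov)」; the White House announcement is at 「OSTP Issues Guidance to Make Federally Funded Research Freely Available Without Delay」.

The key point is right at the start: no embargo and no cost. So paywalls are definitely out, and requiring registration to read is also a kind of restriction, so it will presumably have to change under this policy as well: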

In a memorandum to federal departments and agencies, Dr. Alondra Nelson, the head of OSTP, delivered guidance for agencies to update their public access policies as soon as possible to make publications and research funded by taxpayers publicly accessible, without an embargo or cost.

As for the timeline, in the short term the policies are to be updated by mid-2023, with full implementation by the end of 2025:

In the short-term, agencies will work with OSTP to update their public access and data sharing plans by mid-2023. OSTP expects all agencies to have updated public access policies fully implemented by the end of 2025.

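This one is policy on the government side; at least these papers will have somewhere they can be downloaded publicly.

I looked through what I had written before about open access. There has been plenty of pressure from the university side too, though what I noted down is mostly about contracts with Elsevier being terminated.

It looks like there is some progress from several different angles...

EU passes the Digital Markets Act and the Digital Services Act

Big news spotted while browsing Hacker News Daily: the EU has passed the Digital Markets Act (DMA) and the Digital Services Act (DSA): 「EU Approves Landmark Legislation to Regulate Apple and Other Big Tech Firms」. These two acts directly hit the monopoly position of large companies.

I looked for Chinese-language coverage, and iThome has a report: 「歐洲議會通過《數位服務法》與《數位市場法》!傳訊服務必須互通,不得禁止使用者採用第三方App Store」.

The MacRumors article summarises it quite clearly. The DMA includes: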

  • Allow users to install apps from third-party app stores and sideload directly from the internet.
  • Allow developers to offer third-party payment systems in apps and promote offers outside the gatekeeper's platforms.
  • Allow developers to integrate their apps and digital services directly with those belonging to a gatekeeper. This includes making messaging, voice-calling, and video-calling services interoperable with third-party services upon request.
  • Give developers access to any hardware feature, such as "near-field communication technology, secure elements and processors, authentication mechanisms, and the software used to control those technologies."
  • Ensure that all apps are uninstallable and give users the ability to unsubscribe from core platform services under similar conditions to subscription.
  • Give users the option to change the default voice assistant to a third-party option.
  • Share data and metrics with developers and competitors, including marketing and advertising performance data.
  • Set up an independent "compliance function" group to monitor its compliance with EU legislation with an independent senior manager and sufficient authority, resources, and access to management.
  • Inform the European Commission of their mergers and acquisitions.

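As you can see, apart from the last two items, which concern EU oversight, the rest (installing software from third parties, using third-party payment systems, integrating with system services, integrating with hardware features, using third-party voice assistants, uninstalling any app, and sharing data collected by the platform with developers) all target restrictions currently imposed by Apple's App Store and Google Play.

The DMA also prohibits the following behaviours: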

  • Pre-install certain software applications and require users to use any important default software services such as web browsers.
  • Require app developers to use certain services or frameworks, including browser engines, payment systems, and identity providers, to be listed in app stores.
  • Give their own products, apps, or services preferential treatment or rank them higher than those of others.
  • Reuse private data collected during a service for the purposes of another service.
  • Establish unfair conditions for business users.

The DSA, on the other hand, deals with the handling of illegal content online:

The Digital Services Act (DSA), which requires platforms to do more to police the internet for illegal content, has also been approved by the European Parliament.

As for the DMA, it looks like it will take effect around mid-2023? Presumably that is six months plus another six months...

Once formally adopted, the Act, which takes the legal form of a Regulation, will enter into force 20 days after publication in the EU Official Journal and will apply six months later. The designated gatekeepers will have a maximum of six months after the designation decision by the Commission to ensure compliance with the obligations laid down in the Digital Markets Act.

The DSA, meanwhile, will not apply until 2024 at the earliest:

Once adopted, the DSA will be directly applicable across the EU and will apply fifteen months or from 1 January 2024, whichever later, after entry into force.
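The date arithmetic in the two quotes above can be sketched as follows. The publication date used in the example is a placeholder rather than the real Official Journal date, the helper names are made up for this sketch, and it assumes gatekeepers are designated on the very day the DMA starts to apply (in practice designation would come later, which pushes the compliance deadline back).

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def dma_timeline(published: date) -> dict:
    """DMA: in force 20 days after publication, applies six months later."""
    in_force = published + timedelta(days=20)
    applies = add_months(in_force, 6)
    # Assumption: designation happens on the day the Act starts to apply,
    # so this is the earliest possible compliance deadline.
    earliest_compliance = add_months(applies, 6)
    return {
        "in_force": in_force,
        "applies": applies,
        "earliest_compliance": earliest_compliance,
    }


def dsa_applies(in_force: date) -> date:
    """DSA: fifteen months after entry into force or 1 January 2024, whichever is later."""
    return max(add_months(in_force, 15), date(2024, 1, 1))


if __name__ == "__main__":
    hypothetical_publication = date(2022, 10, 1)  # placeholder, not the real date
    for step, when in dma_timeline(hypothetical_publication).items():
        print(step, when.isoformat())
```

With any publication date in the second half of 2022, the DMA starts to apply in the first half of 2023, while the gatekeeper obligations only bite about six months after designation.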

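The EU market is big enough that this should have a sufficiently large impact...

The BBC brings out shortwave broadcasting this time...

Seen on Hacker News Daily: for this war the BBC has brought back shortwave broadcasts, so that people in Ukraine, as well as some people in Russia, can receive BBC news: 「BBC resurrects WWII-era shortwave broadcasts as Russia blocks news of Ukraine invasion」.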

The BBC says its shortwave broadcasts will be available on frequencies of 15735 kHz from 4PM to 6PM and 5875 kHz from 10PM to midnight, Ukraine time. News will be read in English, which the BBC says will be available in Kyiv as well as “parts of Russia.”

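The main reason is that shortwave travels very far and is hard to block, whereas the internet is easy to wall off, which is why it is being used here...

The BBC also provides an Onion version, so that people in Russia can get around the blocking and read BBC news: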

The BBC’s current onion domain is: https://www.bbcnewsd73hkzno2ini43t4gblxvycyac5aw4gnv7t2rccijh7745uqd.onion.

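But if you have Tor you can basically just read the site through an exit node, so there doesn't seem to be much need for the Onion address specifically...

France's CNIL finds that Google Analytics transferring data to the US violates the GDPR

I mentioned earlier that a German court found that using Google Fonts on a website without informing users violates the GDPR (see my earlier post 「德國的地方法院說使用 Google Fonts 服務沒有告知使用者違反 GDPR」). This time France's CNIL (English Wikipedia entry: 「Commission nationale de l'informatique et des libertés」; it is an independent body of the French government) has determined that Google Analytics sending data back to the US violates the GDPR: 「Use of Google Analytics and data transfers to the United States: the CNIL orders a website manager/operator to comply」.

The article's summary covers most of it: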

Google Analytics provides statistics on website traffic. After receiving complaints from the NOYB association, the CNIL, in cooperation with its European counterparts, analysed the conditions under which the data collected through this service is transferred to the United States. The CNIL considers that these transfers are illegal and orders a French website manager to comply with the GDPR and, if necessary, to stop using this service under the current conditions.

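This sparked a lively discussion on Hacker News, which I won't spoil here: 「Use of Google Analytics declared illegal by French data protection authority (cnil.fr)」. When reading it, keep in mind that Hacker News is a site with a very American point of view (leaning toward the Y Combinator or VC crowd).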