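In the LLM world, "open" does not follow the definition used for open-source licenses. It is closer to "free to use," and it usually comes with restrictions.
Even under that looser "free to use" standard, LLaMA 65B had held the lead for more than three months since it was released (or rather, "got released") in February, until last week, when I saw the news that Falcon 40B had overtaken it: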
LLaMa is dethroned 👑 A brand new LLM is topping the Open Leaderboard: Falcon 40B 🛩
*interesting* specs:
- tuned for efficient inference
- licence similar to Unity allowing commercial use
- strong performances
- high-quality dataset also released

Check the authors' thread 👇 https://t.co/vojobBXFQT pic.twitter.com/BuOLnHebhU
— Thomas Wolf (@Thom_Wolf) May 26, 2023
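On the "Open LLM Leaderboard" benchmarks, Falcon 40B leads on everything except TruthfulQA (0-shot), and it also leads on the overall average.
Scrolling further down, the 7B version performs well too, and there should be room to tune it further later on.
More importantly, I just saw the news that the model's license has been changed to Apache License 2.0, so a real alternative to LLaMA is finally taking shape: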
The license of the Falcon 40B model has just been changed to… Apache-2 which means that this model is now free for any usage including commercial use (and same for the 7B) 🎉 https://t.co/LZcmejPdf5
— Thomas Wolf (@Thom_Wolf) May 31, 2023
I also looked into it a bit: this model was trained on AWS SageMaker. I then looked up the Technology Innovation Institute, and sure enough, it is a well-funded organization:
Falcon-40B was trained on AWS SageMaker, on 384 A100 40GB GPUs in P4d instances.
The Technology Innovation Institute (TII) is an Abu Dhabi government funded research institution that operates in the areas of artificial intelligence, quantum computing, autonomous robotics, cryptography, advanced materials, digital science, directed energy and secure systems. The institute is a part of the Abu Dhabi Government’s Advanced Technology Research Council (ATRC).
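To get a feel for the scale of "384 A100 40GB GPUs in P4d instances", here is a rough back-of-the-envelope sketch. It assumes the instances are p4d.24xlarge with 8 GPUs each, which the quote does not state explicitly; the function names are made up for illustration.

```python
# Rough sizing of the training cluster described in the quote above.
# Assumption: each P4d instance is a p4d.24xlarge carrying 8 A100 40GB GPUs.

GPUS_TOTAL = 384          # from the quote
GPUS_PER_INSTANCE = 8     # assumed: p4d.24xlarge
GPU_MEMORY_GB = 40        # from the quote (A100 40GB)


def instance_count(gpus_total: int, gpus_per_instance: int) -> int:
    """Number of instances needed to host the given number of GPUs."""
    # Ceiling division, in case the total is not an exact multiple.
    return -(-gpus_total // gpus_per_instance)


def aggregate_gpu_memory_gb(gpus_total: int, gpu_memory_gb: int) -> int:
    """Total GPU memory across the whole cluster, in GB."""
    return gpus_total * gpu_memory_gb


if __name__ == "__main__":
    print("instances:", instance_count(GPUS_TOTAL, GPUS_PER_INSTANCE))
    print("aggregate GPU memory (GB):",
          aggregate_gpu_memory_gb(GPUS_TOTAL, GPU_MEMORY_GB))
```

Under that assumption this comes out to 48 instances and 15,360 GB of GPU memory in total.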
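Someone on Hacker News already has it running, using the instruction-tuned (InstructGPT-style) version: "Falcon 40B LLM (which beats Llama) now Apache 2.0 (twitter.com/thom_wolf)". Reportedly the 4-bit quantized version can run on a single 40GB A100 or on two 24GB 3090/4090 cards.
The ggml folks will probably get moving on this within the next few days too, so it may be worth waiting a little longer to see how things settle...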
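The hardware claim is easy to sanity-check with a rough estimate of weight memory alone. The sketch below assumes "40B" means exactly 40 billion parameters and uses decimal gigabytes (1 GB = 10^9 bytes). It ignores the KV cache, activations, and the per-block scale factors that real quantization formats store, so actual usage will be somewhat higher. The function name is made up for illustration.

```python
# Back-of-the-envelope estimate of memory needed for model weights only.
# Assumptions: 40e9 parameters, 1 GB = 1e9 bytes, no quantization overhead.

N_PARAMS = 40e9  # assumed parameter count for "40B"


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB at the given precision."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_memory_gb(N_PARAMS, bits)
        print(f"{bits:>2}-bit weights: about {size:.0f} GB")
```

At 16-bit the weights alone are about 80 GB, which does not fit on a single 40GB card. At 4-bit they drop to about 20 GB, which leaves headroom on a 40GB A100 or on two 24GB cards (48 GB combined), consistent with what was reported.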