LLaMa is dethroned 👑 A brand new LLM is topping the Open Leaderboard: Falcon 40B 🛩
*interesting* specs: - tuned for efficient inference - licence similar to Unity allowing commercial use - strong performances - high-quality dataset also released
The license of the Falcon 40B model has just been changed to… Apache-2 which means that this model is now free for any usage including commercial use (and same for the 7B) 🎉 https://t.co/LZcmejPdf5
Falcon-40B was trained on AWS SageMaker, on 384 A100 40GB GPUs in P4d instances.
The Technology Innovation Institute (TII) is an Abu Dhabi government funded research institution that operates in the areas of artificial intelligence, quantum computing, autonomous robotics, cryptography, advanced materials, digital science,[4] directed energy and secure systems. The institute is a part of the Abu Dhabi Government’s Advanced Technology Research Council (ATRC).
雖然 Matt Mullenweg 在文章裡都沒提到,但 WordPress 的興起其實跟當年 2004 年最大的 blog 軟體 Movable Type 自己出的包有很大的關係:
With the release of version 3.0 in 2004, there were marked changes in Movable Type's licensing, most notably placing greater restrictions on its use without paying a licensing fee. This sparked criticism from some users of the software, with some moving to the then-new open-source blogging tool WordPress. With the release of Movable Type 3.2, the ability to create an unlimited number of weblogs at all licensing levels was restored. In Movable Type 3.3, the product once again became completely free for personal users.
當年 hlb 在社團主機上用 Movable Type 架了服務讓大家寫,結果後來發生了 license 問題,大家就都順勢跑到 WordPress 上了;而等到 Movable Type 再次想放寬 license 的時候已經來不及了,大家都已經搬完了。
I've seen this question asked repeatedly in many LLaMa threads, currently the best models that are truly open are the released models from the Flan family by Google, which includes Flan-T5[0] and Flan-UL2[1]. According to its paper, Flan-UL2 performs slightly better than Flan-T5-XXL.
These models perform slightly better than GPT-3 under some tasks[2], but they're still far from achieving the results from GPT-3.5 and GPT-4. This becomes evident when you try to use them in the real world; they're not "good enough" for general use cases, unlike ChatGPT models. However, if you can restrict your use case to one particular domain, you can achieve pretty good results by further fine-tuning these models.
另外一則回覆有提到一些其他的 model:
The ones I saw mentioned so far were Flan, Cerebras, GPT-J, and RWKV.
Not yet mentioned:
* Pythia https://github.com/EleutherAI/pythia
* GLM-130B https://github.com/THUDM/GLM-130B - see also ChatGLM-6B https://github.com/THUDM/ChatGLM-6B
We’re making GitHub Copilot, an AI pair programmer that suggests code in your editor, generally available to all developers for $10 USD/month or $100 USD/year. It will also be free to use for verified students and maintainers of popular open source projects.
In 2015, ISC announced they would release their Kea DHCP Software under the Mozilla Public License 2.0, stating, "There is no longer a good reason for ISC to have its own license, separate from everything else". They also preferred a copyleft license, stating, "If a company uses our software but improves it, we really want those improvements to go back into the master source". Throughout the following years, they re-licensed all ISC-hosted software, including BIND in 2016 and ISC DHCP Server in 2017.
The run time libraries were dual licensed under the UIUC and MIT license; the rest of the code only under the UIUC license. Therefore, we could not easily move code to run time libraries from other parts. The reason run time libraries were dual licensed was to enable linking to run time library binaries without requiring attribution to LLVM.
As an exception, if, as a result of your compiling your source code, portions of this Software are embedded into an Object form of such source code, you may redistribute such embedded portions in such Object form without complying with the conditions of Sections 4(a), 4(b) and 4(d) of the License.
In addition, if you combine or link compiled forms of this Software with software that is licensed under the GPLv2 ("Combined Software") and if a court of competent jurisdiction determines that the patent provision (Section 3), the indemnity provision (Section 9) or other Section of the License conflicts with the conditions of the GPLv2, you may retroactively and prospectively choose to deem waived or otherwise exclude such Section(s) of the License, but only in their entirety and only with respect to the Combined Software.