The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
open source 不能限制使用,而 LLaMA 的授權限制了商業使用:
2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
This PR adds GPU acceleration for all remaining ggml tensors that didn't yet have it. Especially for long generations this makes a large difference because the KV cache is still CPU only on master and gets larger as the context fills up.
蠻多人有不同測試的結果,要注意這次不是把 CPU 搬到 GPU 上面做,而是把本來因為比較 light 而還沒搬上 GPU 的部分搬上去,所以不會是數量級的加速,但看起來改善也已經很不賴了:
Early attempt this morning we're getting ~2.5-2.8x perf increase on 4090s and about 1.8-2x on 3090Ti.
From a fun side project just a few months ago, ggml has now become a useful library and framework for machine learning with a great open-source community
LLaMa is dethroned 👑 A brand new LLM is topping the Open Leaderboard: Falcon 40B 🛩
*interesting* specs: - tuned for efficient inference - licence similar to Unity allowing commercial use - strong performances - high-quality dataset also released
The license of the Falcon 40B model has just been changed to… Apache-2 which means that this model is now free for any usage including commercial use (and same for the 7B) 🎉 https://t.co/LZcmejPdf5
Falcon-40B was trained on AWS SageMaker, on 384 A100 40GB GPUs in P4d instances.
The Technology Innovation Institute (TII) is an Abu Dhabi government funded research institution that operates in the areas of artificial intelligence, quantum computing, autonomous robotics, cryptography, advanced materials, digital science,[4] directed energy and secure systems. The institute is a part of the Abu Dhabi Government’s Advanced Technology Research Council (ATRC).
This was accomplished by changing the file format so we can mmap() weights directly into memory without having to read() or copy them thereby ensuring the kernel can make its file cache pages directly accessible to our inference processes; and secondly, that the file cache pages are much less likely to get evicted (which would force loads to hit disk) because they're no longer competing with memory pages that were needlessly created by gigabytes of standard i/o.
另外 Justine Tunney 在這邊的 comment 有提到一個意外觀察到的現象,他發現實際在計算的時候用到的 model 內容意外的少:他用一個簡單的 prompt 測試,發現 20GB 的 30B model 檔案在他的 Intel 機器上實際只用到了 1.6GB 左右:
If I run 30B on my Intel machine:
[...]
As we can see, 400k page faults happen, which means only 1.6 gigabytes ((411522 * 4096) / (1024 * 1024)) of the 20 gigabyte weights file actually needed to be used.
這點他還在懷疑是不是他的修改有 bug,但目前他覺得不太像,也看不太出來:
Now, since my change is so new, it's possible my theory is wrong and this is just a bug. I don't actually understand the inner workings of LLaMA 30B well enough to know why it's sparse. Maybe we made some kind of rare mistake where llama.cpp is somehow evaluating 30B as though it were the 7B model. Anything's possible, however I don't think it's likely. I was pretty careful in writing this change, to compare the deterministic output of the LLaMA model, before and after the Git commit occurred. I haven't however actually found the time to reconcile the output of LLaMA C++ with something like PyTorch. It'd be great if someone could help with that, and possibly help us know why, from more a data science (rather than systems engineering perspective) why 30B is sparse.
> Explain FAANG.
FAANG refers to five of America's largest technology companies - Facebook, Apple Inc., Amazon, Netflix and Google (Alphabet). These are some of the most valuable brands in today’s marketplace with a combined value that exceeded $3 trillion as at 2018.
以及 30B 的版本:
> Explain FAANG.
FAANG stands for Facebook, Amazon, Apple, Netflix and Google - five of the most powerful technology companies in today's world. These tech giants have become increasingly influential over recent years due to their dominance in various markets such as social media platforms (Facebook), e-commerce websites (Amazon) or streaming services (Netflix).