I've seen this question asked repeatedly in many LLaMa threads, currently the best models that are truly open are the released models from the Flan family by Google, which includes Flan-T5[0] and Flan-UL2[1]. According to its paper, Flan-UL2 performs slightly better than Flan-T5-XXL.
These models perform slightly better than GPT-3 under some tasks[2], but they're still far from achieving the results from GPT-3.5 and GPT-4. This becomes evident when you try to use them in the real world; they're not "good enough" for general use cases, unlike ChatGPT models. However, if you can restrict your use case to one particular domain, you can achieve pretty good results by further fine-tuning these models.
另外一則回覆有提到一些其他的 model:
The ones I saw mentioned so far were Flan, Cerebras, GPT-J, and RWKV.
Not yet mentioned:
* Pythia https://github.com/EleutherAI/pythia
* GLM-130B https://github.com/THUDM/GLM-130B - see also ChatGLM-6B https://github.com/THUDM/ChatGLM-6B
In the context of file system TOCTOU race conditions, the fundamental challenge is ensuring that the file system cannot be changed between two system calls. In 2004, an impossibility result was published, showing that there was no portable, deterministic technique for avoiding TOCTOU race conditions.
其中一種 mitigation 是針對 fd 監控:
Since this impossibility result, libraries for tracking file descriptors and ensuring correctness have been proposed by researchers.
然後另外一種方式 (比較治本) 是檔案系統的 API 支援 transaction,但看起來不被主流接受?
An alternative solution proposed in the research community is for UNIX systems to adopt transactions in the file system or the OS kernel. Transactions provide a concurrency control abstraction for the OS, and can be used to prevent TOCTOU races. While no production UNIX kernel has yet adopted transactions, proof-of-concept research prototypes have been developed for Linux, including the Valor file system and the TxOS kernel. Microsoft Windows has added transactions to its NTFS file system, but Microsoft discourages their use, and has indicated that they may be removed in a future version of Windows.
In April 2021, the Supreme Court ruled in a 6–2 decision that Google's use of the Java APIs fell within the four factors of fair use, bypassing the question on the copyrightability of the APIs. The decision reversed the Federal Circuit ruling and remanded the case for further review.
判決全文 PDF 的前面三頁多算是簡介說明這次的重點,Page 44 到 Page 62 則是反對的兩位大法官 (Clarence Thomas 與 Samuel Alito) 所提出的異議,可以看到兩位大法官批評了 copyrightability 與 fair-use analysis 的問題。
You can only use committed use discounts for predefined machine types and custom machine types. Small machine types, such as f1-micro and g1-small, are not eligible for committed use discounts.
目前是 beta:
This is a Beta release of Committed Use Discounts. This feature is not covered by any SLA or deprecation policy and may be subject to backward-incompatible changes.
“Google’s unauthorized digitizing of copyright-protected works, creation of a search functionality and display of snippets from those works are non-infringing fair uses,” U.S. Circuit Judge Pierre Leval wrote on behalf of the court. “The purpose of the copying is highly transformative, the public display of text is limited and the revelations do not provide a significant market substitute for the protected aspects of the originals.”
The Electronic Frontier Foundation (EFF) filed suit against Universal Music Publishing Group (UMPG) asking a federal court to protect the fair use and free speech rights of a mother who posted a short video of her toddler son dancing to a Prince song on the Internet.
Stephanie Lenz's 29-second recording shows her son bouncing along to the Prince song "Let's Go Crazy " which is heard playing in the background. Lenz uploaded the home video to YouTube in February to share it with her family and friends.
In late June 2007, Lenz sent YouTube a counter-notification, claiming fair use and requesting the video be reposted. Six weeks later, YouTube reposted the video. In July 2007, Lenz sued Universal for misrepresentation under the DMCA and sought a declaration from the court that her use of the copyrighted song was non-infringing. According to the DMCA 17 U.S.C. § 512(c)(3)(A)(v), the copyright holder must consider whether use of the material was allowed by the copyright owner or the law.
而環球直接挑明不在意 fair use:
In September 2007, Prince released statements that he intended to "reclaim his art on the internet." In October 2007, Universal released a statement amounting to the fact that Prince and Universal intended to remove all user-generated content involving Prince from the internet as a matter of principle.
The district court held that copyright owners must consider fair use before issuing DMCA takedown notices. Thus, the district court denied Universal's motion to dismiss Lenz's claims, and declined to dismiss Lenz's misrepresentation claim as a matter of law.
同時認為環球濫用 DMCA takedown notification:
The district court believed that Universal's concerns over the burden of considering fair use were overstated, as mere good faith consideration of fair use, not necessarily an in-depth investigation, is sufficient defense against misrepresentation. The court also explained that liability for misrepresentation is crucial in an important part of the balance in the DMCA.
The panel held that the DMCA requires copyright holders to consider fair use before sending a takedown notification, and that failure to do so raises a triable issue as to whether the copyright holder formed a subjective good faith belief that the use was not authorized by law.