Using an AI model to judge whether text was generated by AI

OpenAI has released a new model that can be used to judge whether a piece of text was generated by AI: "New AI classifier for indicating AI-written text".

But it does not work very well yet. Looking only at the English results, the true positive rate is just 26%, while the false positive rate is 9%:

In our evaluations on a “challenge set” of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as “likely AI-written,” while incorrectly labeling human-written text as AI-written 9% of the time (false positives).
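To get a feel for what those two numbers mean in practice, here is a rough sketch that applies Bayes' rule to estimate how often a "likely AI-written" flag would actually be right. The 26% and 9% figures come from the quote above; the share of AI-written text in the input (`ai_share`) is purely my assumption for illustration, as is the function name `precision`.

```python
def precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Probability that a text flagged as AI-written really is AI-written."""
    flagged_ai = tpr * ai_share              # AI-written texts that get flagged
    flagged_human = fpr * (1.0 - ai_share)   # human-written texts that get flagged
    return flagged_ai / (flagged_ai + flagged_human)


TPR = 0.26  # true positive rate quoted by OpenAI
FPR = 0.09  # false positive rate quoted by OpenAI

# ai_share values are assumed for illustration, not taken from the announcement.
for ai_share in (0.1, 0.5):
    print(f"AI share {ai_share:.0%}: precision {precision(TPR, FPR, ai_share):.1%}")
# AI share 10%: precision 24.3%
# AI share 50%: precision 74.3%
```

Under these assumptions, if only one in ten texts is AI-written, roughly three out of four flags would land on human-written text, and 74% of the AI-written text would still go unflagged either way.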

They also mention weaknesses, for example that shorter texts are hard to identify:

The classifier is very unreliable on short texts (below 1,000 characters). Even longer texts are sometimes incorrectly labeled by the classifier.

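Then there is content that has a single correct answer, which is also hard to identify, because the correct answer is almost always the same: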

Text that is very predictable cannot be reliably identified. For example, it is impossible to predict whether a list of the first 1,000 prime numbers was written by AI or humans, because the correct answer is always the same.

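They also mention a technical limitation: the current approach is more like "identifying whether text was produced by a model trained on certain corpora" than general-purpose AI text detection: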

Classifiers based on neural networks are known to be poorly calibrated outside of their training data. For inputs that are very different from text in our training set, the classifier is sometimes extremely confident in a wrong prediction.

It looks like it is not yet at the point of being usable...
