The arms race continues between those attempting to detect GenAI-created content and those who want to keep its origins concealed. One example is detecting whether ChatGPT was used to write content such as academic papers. According to reports, OpenAI has built a subtle watermarking system, based on the words chosen by ChatGPT, that serves as an embedded indicator of AI generation. Although reportedly highly accurate, it only works on text from OpenAI’s ChatGPT and not on AI-generated content from other systems. It can also be intentionally circumvented by running the content through other systems or filters.
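To make the idea of a word-choice watermark concrete, here is a minimal toy sketch. It is not OpenAI's actual method; it assumes a generic "green list" scheme in which each word is pseudo-randomly marked green or not based on a hash of the word before it, the generator prefers green words, and the detector counts how many green words appear. The function names (`is_green`, `pick_watermarked`, `watermark_z_score`) and the 50% green fraction are hypothetical choices made for illustration.

```python
import hashlib
import math


def is_green(prev_word: str, word: str, green_fraction: float = 0.5) -> bool:
    """Pseudo-randomly mark `word` as green, seeded by the previous word.

    Hypothetical rule for illustration: hash the word pair and compare the
    first 8 bytes against a threshold set by `green_fraction`.
    """
    digest = hashlib.sha256(f"{prev_word}|{word}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") < green_fraction * 2**64


def pick_watermarked(prev_word: str, candidates: list[str]) -> str:
    """Toy 'generator' step: prefer a green candidate, else take the first."""
    for candidate in candidates:
        if is_green(prev_word, candidate):
            return candidate
    return candidates[0]


def watermark_z_score(text: str, green_fraction: float = 0.5) -> float:
    """Return how many standard deviations the green-word count sits above
    what unwatermarked text would be expected to produce."""
    if not 0.0 < green_fraction < 1.0:
        raise ValueError("green_fraction must be strictly between 0 and 1")
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    green = sum(is_green(prev, word, green_fraction) for prev, word in pairs)
    n = len(pairs)
    expected = green_fraction * n
    std = math.sqrt(n * green_fraction * (1.0 - green_fraction))
    return (green - expected) / std
```

In this toy, ordinary text should score near zero while text built with `pick_watermarked` scores far above it. Swapping or reordering words changes the hashed pairs, which is why paraphrasing through another system can wash the signal out.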
We have seen many GenAI detection systems come and go. They emerge with promise, only to be undermined quickly. This is not the first AI text detector that OpenAI has created. The previous version was withdrawn due to a rapid decline in accuracy.
With the rise of deepfakes, there has been more focus on consistently detecting fabricated content, but nothing long-lasting has emerged.
I wrote on this topic a few months ago in "Chasing shadows: Is AI text detection a critical need or a fool's errand?" That post links back to an interesting article and YouTube video.
The authors discussed "whitebox" methods, including watermarking like you describe here. In addition to the challenges you mentioned above, they pointed out that watermarking is susceptible to reverse engineering and that there's a tradeoff between the quality of the watermark and the quality of the text itself.
Overall, I'm not optimistic. From my article, here's a summary of my position after reviewing those two sources:
I am not optimistic either. It takes a lot less effort to undermine these fingerprinting mechanisms than it does to create a highly accurate one.
In the end, I believe it will take a societal change of perspective, where we accept that everything we see may be altered or fabricated, just as we already do with images today.
Good luck with trying to keep up!