I wrote on this topic a few months ago in Chasing shadows: Is AI text detection a critical need or a fool's errand?, which links back to an interesting article and YouTube video.
The authors discussed "whitebox" methods, including watermarking like the one you describe here. In addition to the challenges you mentioned above, they also pointed out that watermarking is susceptible to reverse engineering and that there's a tradeoff between the detectability of the watermark and the quality of the text itself (a minimal sketch of the idea follows).
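To make that tradeoff concrete, here's a minimal sketch of a token-level "green list" watermark in the style of Kirchenbauer et al. (2023). Everything here is an illustrative assumption (the names and the constants VOCAB_SIZE, GREEN_FRACTION, and DELTA), not the exact scheme from the article or video:

```python
import hashlib
import numpy as np

VOCAB_SIZE = 50_000
GREEN_FRACTION = 0.5  # fraction of the vocabulary favored at each step
DELTA = 2.0           # watermark strength: higher is easier to detect,
                      # but distorts the model's distribution more

def green_list(prev_token: int) -> np.ndarray:
    """Pseudorandomly partition the vocabulary, seeded by the previous token."""
    seed = int.from_bytes(hashlib.sha256(str(prev_token).encode()).digest()[:4], "big")
    rng = np.random.default_rng(seed)
    return rng.random(VOCAB_SIZE) < GREEN_FRACTION

def watermarked_sample(logits: np.ndarray, prev_token: int) -> int:
    """Bias sampling toward the green list by adding DELTA to its logits."""
    biased = logits + DELTA * green_list(prev_token)
    probs = np.exp(biased - biased.max())  # stable softmax
    probs /= probs.sum()
    return int(np.random.default_rng().choice(VOCAB_SIZE, p=probs))

def green_hit_rate(tokens: list[int]) -> float:
    """Detector side: fraction of tokens landing on their green list.
    Unwatermarked text hovers near GREEN_FRACTION; watermarked text sits above it."""
    hits = sum(green_list(prev)[tok] for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

Raising DELTA makes the green-token statistics easier to detect but pushes the model further from its natural distribution, which is exactly the watermark-versus-text-quality tradeoff. And because the partition depends only on the previous token, anyone who can collect enough output can estimate the green lists, which is the reverse-engineering weakness the authors warn about.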
Overall, I'm not optimistic. From my article, here's a summary of my position after reviewing those two sources:
In the end, I think maybe we're just going to have to understand that there's a human owner of the text, and that person is ultimately responsible for what was said - regardless of whether an LLM was used as an intermediary. Then, the risks of "phishing, disinformation, and academic dishonesty" would be addressed by ethics and laws, not by technology - as they have always been.
I am not optimistic either. It takes far less effort to undermine one of these fingerprinting mechanisms than it does to create a highly accurate one.
In the end, I believe it will take a societal change of perspective, where we accept that everything we see may be altered or fabricated, just as with images today.