this post was submitted on 21 Apr 2025
Technik
the community for everything you could describe as technology
Posts in German or English
Using characters that are displayed identically but encoded differently is one way to watermark text. It was used in a lawsuit a few years ago (SZ report). The suing company eventually lost because it didn't actually own the rights to the texts it had watermarked.
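To make the idea concrete, here is a minimal Python sketch. It is not the scheme from the lawsuit, just an illustration under my own assumptions: an ordinary space (U+0020) stands for a 0 bit and a no-break space (U+00A0) for a 1 bit. The function names `embed`, `extract` and `unusual_whitespace` are made up for this example.

```python
import unicodedata

PLAIN = "\u0020"  # ordinary space, encodes bit 0 (assumption of this sketch)
MARK = "\u00a0"   # no-break space, usually rendered the same, encodes bit 1


def embed(text: str, bits: str) -> str:
    """Hide `bits` in `text`: the i-th space becomes MARK when bits[i] == '1'."""
    out = []
    i = 0
    for ch in text:
        if ch == PLAIN and i < len(bits):
            out.append(MARK if bits[i] == "1" else PLAIN)
            i += 1
        else:
            out.append(ch)
    return "".join(out)


def extract(text: str, n: int) -> str:
    """Read the first n hidden bits back from the spaces in `text`."""
    bits = ["1" if ch == MARK else "0" for ch in text if ch in (PLAIN, MARK)]
    return "".join(bits[:n])


def unusual_whitespace(text: str) -> list[tuple[int, str]]:
    """List (index, Unicode name) of every space separator that is not U+0020."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch != PLAIN and unicodedata.category(ch) == "Zs"
    ]


marked = embed("the quick brown fox jumps", "1010")
print(extract(marked, 4))
print(unusual_whitespace(marked))
```

The last function is the practical part: running it over a suspicious text shows where non-standard spaces sit. Note that a text which already contains no-break spaces for typographic reasons would be misread by `extract`, which is one reason such characters alone do not prove a watermark.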
As @luckystarr@feddit.org points out, these whitespace characters can make quite a difference, so they are unlikely to be a watermark. Methods for watermarking LLM-generated text are more subtle anyway; they work by altering word frequencies.
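For a rough feel of how a frequency-based watermark can be checked, here is a toy Python sketch. It assumes a "green list" style scheme: a hash of the previous word marks a fraction `gamma` of all possible next words as "green", the generator prefers green words, and the detector counts how many word pairs are green. Real schemes work on model tokens with a secret key; the hashing rule and the names `is_green` and `green_z_score` are assumptions for illustration only.

```python
import hashlib
import math


def is_green(prev_word: str, word: str, gamma: float = 0.5) -> bool:
    """Toy rule: hash the word pair; roughly a fraction gamma of pairs is green."""
    digest = hashlib.sha256(f"{prev_word}\x00{word}".encode("utf-8")).digest()
    return digest[0] / 256 < gamma


def green_z_score(text: str, gamma: float = 0.5) -> float:
    """z-score of the green-pair count against the no-watermark expectation."""
    words = text.lower().split()
    n = len(words) - 1  # number of (previous word, word) pairs
    if n < 1:
        return 0.0
    hits = sum(is_green(a, b, gamma) for a, b in zip(words, words[1:]))
    # Without a watermark, hits is approximately Binomial(n, gamma).
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))


print(green_z_score("an ordinary sentence written without any watermark at all"))
```

Ordinary text should score near zero, while text whose words were deliberately picked from the green list scores several standard deviations above it. Nothing in the characters themselves gives it away, which is why this is harder to spot than swapped whitespace.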