Does AI labelling for content actually work yet?

I’m watching the development of AI-generated text labelling with real fascination. OpenAI offers a classifier that tries to determine whether a text was written by a human or by a machine.

I could imagine Google showing a little “This was automatically generated” label in search results. Not sure how well that’ll go down, given content creators are frantically pumping out material with ChatGPT and other AI bots right now.

Made with AI: good thing or not so much?

What happens when content from a respected company or influencer suddenly gets tagged with “AI”? Will readers feel deceived because the content came from a machine instead of a human? Or will it play out differently?

Maybe human-created content is less valuable to readers than we arrogantly assume. Think about “Made in Germany” or “Made in Japan.” These are quality markers now, but they started as warnings: “Made in Germany” was introduced in late-19th-century Britain to flag supposedly inferior imports, and “Made in Japan” carried a similar stigma after World War II. Made in Germany? Better steer clear. Now everyone wants products from these self-proclaimed quality champions.

Voluntary self-regulation won’t work

Will the same happen with AI-generated content? Should creators voluntarily declare what percentage is AI? (Never going to happen.) Or should Google or Microsoft do it automatically and spend their time arguing with content creators who dispute their “mostly AI” tags, claiming they write “structured” and “print-ready” prose?

Strip away the ethical grey zones and problem cases for a moment.

What if a human conceived and wrote text, then used AI to edit and polish it? Should that get marked as “AI”? These are difficult questions nobody’s addressing. We’re too busy generating as much AI text as possible. Feels like Napster — too good to be true; use it while it works before the lawsuit wave kills it.

Early attempts show labelling doesn’t work properly yet

How good is automatic labelling? I ran some quick experiments. Nothing scientific, just enough for a first impression. The verdict: meh. It doesn’t work yet.

I tested by pasting text from my English-language website to see what the classifier would say. I wrote it myself and only cleaned it up with a grammar checker (LanguageTool). Would that trigger suspicion? The result was “highly unlikely.” Relieved, but not surprised.

Are AI-edited texts already AI-generated content?

Then I ran that text through ChatGPT with “make this better and prettier.” It read completely differently afterwards — not necessarily better, but smoother, more boring, without my quirky wordplay. I fed the result into the classifier and again got “highly unlikely.”

Next test: I created text entirely with ChatGPT, refining it over multiple rounds of content instructions. Nothing I’d publish here, but it would serve as a foundation for a post. The interesting part isn’t the text itself, but the surprising impulses the machine throws back, which push you to keep questioning — that genuinely improves the writing. The result: “likely AI written” — even though I hadn’t edited a single word myself.

AI only effective with experienced “AI prompters”

What does this mean for me as a reader? When I know text was automatically generated, it leaves a bad taste. Maybe because I don’t know what intelligence sat behind the AI machine. I still trust human-written content more.

The more I work with ChatGPT, the more I notice it performs far better in the hands of experienced knowledge workers. No wonder publishers and agencies are hunting for “AI prompters.” Let’s hope they don’t just hire young social media specialists, but put experienced old-school journalists and fact checkers in front of the machines too. In that case, an “AI assisted” label might actually become a quality mark. Like “Made in Germany” is now.


From the archives of reinergaertner.de, running since 1997. Translated with AI help and my questionable bilingual proofreading. If you spot a Germanismus — that’s a feature, not a bug.