Targeting LLM-Generated Text: An Analogy

What is this?
In this response on Hacker News, Andrew draws on a driving analogy to explain why it's easier to target LLM-generated text specifically than to rely on general anti-spam rules.

In response to a question about why we should target LLM-generated text specifically, rather than relying on anti-spam rules:

Consider, as an analogy, safety while driving. Driving dangerously is fairly uniformly proscribed, but we also have specific prohibitions on the various mechanisms by which people drive dangerously. It's much easier to prove that someone was intoxicated or using a mobile phone than to show, in each case, that the intoxication was dangerous, or that the mobile phone use was distracting and the distraction was dangerous.

Similarly, use of LLMs to generate and post large quantities of text from a short prompt is inherently spamming. The user could provide exactly the same value by posting their prompt. If the prompt isn't valuable, the output won't be either, as anyone with a copy of the prompt could generate equivalent output for themselves if they wished to.

As an addendum, so as not (I hope) to distract from the point about spam: I don't at all object to using an LLM for inspiration, editing, or summarisation, so long as there's no claim that the output contains more information than the input. Any statement of fact in the output should be present in the prompt or validated by a human before publishing; otherwise it's suspect. And if it's published without disclosing that lack of validation, it's unethical.