Statistical Method Identified to Detect Fake or Computer-Generated Text

Researchers at the Harvard John A. Paulson School of Engineering and Applied
Sciences (SEAS) and IBM have developed a statistical method to distinguish
computer-generated, or fake, text from human-written text.
Researchers Sebastian Gehrmann of SEAS and Hendrik Strobelt of IBM note that
natural-language generators, which are trained on tens of millions of online
texts, mimic human language by predicting the words that most often follow one
another. For example, 'I' is often followed by 'have' or 'am'.
Building on this idea, they developed a method that flags text that is too
predictable instead of flagging errors. Gehrmann and Strobelt’s method,
known as GLTR, is based on the public version of the OpenAI model GPT-2, which
was trained on 45 million texts from websites. Because it uses GPT-2 to detect
generated text, GLTR works best against GPT-2, but it also does well against
other models.
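
The idea of flagging predictable text can be illustrated with a small sketch. This is not GLTR's code: GLTR relies on GPT-2, whereas the example below assumes a toy word-pair frequency table built from a handful of made-up sentences. The function names (`train_bigram_model`, `word_ranks`, `top_k_fraction`) and the sample corpus are hypothetical and chosen only for illustration. For each word, the sketch asks how highly the model ranked it as a follower of the previous word, then reports the share of words that were among the model's top guesses.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the corpus.

    A toy stand-in for a real language model such as GPT-2 (assumption).
    """
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return followers


def word_ranks(model, text):
    """Rank each word among the model's predictions for the word before it.

    Rank 1 means the word was the model's most frequent follower; None means
    the model never saw that word in this context.
    """
    words = text.lower().split()
    ranks = []
    for prev, nxt in zip(words, words[1:]):
        ranked = [w for w, _ in model.get(prev, Counter()).most_common()]
        ranks.append(ranked.index(nxt) + 1 if nxt in ranked else None)
    return ranks


def top_k_fraction(ranks, k=1):
    """Share of words that fall within the model's k most likely guesses."""
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Made-up training sentences (illustrative only).
corpus = [
    "i have a dog",
    "i have a cat",
    "i am happy",
    "i have a plan",
    "we have a dog",
]
model = train_bigram_model(corpus)

for sample in ("i have a dog", "i am a plan"):
    ranks = word_ranks(model, sample)
    print(sample, ranks, top_k_fraction(ranks, k=1))
```

In this toy setting, "i have a dog" consists entirely of the model's top guesses, while "i am a plan" does not, so the first sample scores higher. A high share of top-ranked words is the kind of signal that suggests text was produced by a generator picking likely words. A real detector would need a far stronger model and a calibrated threshold.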
https://news.harvard.edu/gazette/story/2019/07/researchers-develop-a-method-to-identify-computer-generated-text/?utm_medium=Feed&utm_source=Syndication