ChatGPT is full of sensitive private information and spits out verbatim text from CNN, Goodreads, WordPress blogs, fandom wikis, Terms of Service agreements, Stack Overflow source code, Wikipedia pages, news blogs, random internet comments, and much more.
yea this “attack” could potentially sink closedAI with lawsuits.
This isn’t just an OpenAI problem:
If a model uses copyrighten work for training without permission, and the model memorized it, that could be a problem for whoever created it, open, semi open, or closed source.