Nvidia scraped videos from YouTube and several other sources to compile training data for its AI products, internal Slack chats, emails, and documents obtained by 404 Media show.
When asked about legal and ethical aspects of using copyrighted content to train an AI model, Nvidia defended its practice as being “in full compliance with the letter and the spirit of copyright law.” Internal conversations at Nvidia viewed by 404 Media show that when employees working on the project raised questions about potential legal issues surrounding the use of YouTube videos and of datasets compiled by academics for research purposes, managers told them they had clearance to use that content from the highest levels of the company.
I love it when marketing manages to spin Armageddon-level copyright infringement into “spirit of the law” just because a program is magically called “AI”. Machine learning is just pattern recognition software.
Software that runs on data assembled from petabytes of copyrighted information… And then promptly resold to us.
We may decide later on whether it’s okay to do this. But I’m pretty sure that if it weren’t for the “AI” label, we’d have legal WW3 happening right about now.