Section: Business
The artificial intelligence company OpenAI faces allegations that it used works from O'Reilly Media, a prominent US technology publisher, to train its GPT-4o model without authorization. The claim comes from a recent study by the AI Disclosures Project, which includes input from O'Reilly's founder and CEO, Timothy O'Reilly.
According to the research, OpenAI reportedly relied on at least 34 O'Reilly titles during the training of GPT-4o. The study further examined two other models, GPT-3.5 Turbo and GPT-4o mini, but found less conclusive evidence regarding potential copyright infringements associated with these particular models.
In their analysis, the researchers posed a variety of multiple-choice questions to the OpenAI models. One of the answer options contained a direct quote from one of the 34 O'Reilly books, while the other choices were paraphrased versions. The study encompassed nearly 14,000 excerpts from these works. If the AI model correctly identified the verbatim quote, it was interpreted as an indication that the model had been trained using copyrighted material from the O'Reilly collection.
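The multiple-choice test described above can be sketched in a few lines of Python. This is only an illustration, not the study's code: the function names `build_question` and `detection_rate` and the `choose` callback (a stand-in for querying a model) are hypothetical.

```python
import random

def build_question(verbatim, paraphrases, rng):
    """Shuffle one verbatim excerpt in among its paraphrases.

    Returns the shuffled options and the index of the verbatim excerpt.
    """
    options = [verbatim] + list(paraphrases)
    rng.shuffle(options)
    return options, options.index(verbatim)

def detection_rate(questions, choose):
    """Fraction of questions where `choose(options)` returns the verbatim index.

    `choose` is a hypothetical stand-in for asking a model to pick an option.
    """
    hits = sum(1 for options, answer in questions if choose(options) == answer)
    return hits / len(questions)

# Toy data: the "verbatim" text is simply the longest option here.
rng = random.Random(0)
questions = [
    build_question("the exact original sentence", ["short a", "short b", "short c"], rng)
    for _ in range(20)
]

def pick_longest(options):
    return max(range(len(options)), key=lambda i: len(options[i]))

rate = detection_rate(questions, pick_longest)
```

With four options per question, a model that guesses at random would be expected to pick the verbatim excerpt about a quarter of the time, so only a rate well above that would point toward familiarity with the text.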
The researchers summarized the results as an AUROC score (area under the receiver operating characteristic curve), which they use as an indicator of how likely it is that the OpenAI models were trained on O'Reilly's books. A score of 50 percent corresponds to chance. The score for GPT-4o reached 82 percent, suggesting a substantial probability that the copyrighted content was used in training. The researchers also speculated that OpenAI might have obtained the books from the shadow library Library Genesis, which reportedly includes all 34 books in question.
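To make the AUROC figure more concrete, here is a minimal sketch of how such a score can be computed from two groups of per-excerpt scores. It is not the study's implementation. The split into "suspected seen" and "assumed unseen" excerpts, and all numbers, are assumptions for illustration only.

```python
def auroc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. 0.5 means the two groups are
    indistinguishable; 1.0 means they are perfectly separated.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical per-excerpt detection scores (made-up numbers):
suspected_seen = [0.9, 0.8, 0.4]   # excerpts the model may have seen in training
assumed_unseen = [0.7, 0.3, 0.2]   # excerpts assumed not to be in the training data

score = auroc(suspected_seen, assumed_unseen)  # 8 of 9 pairs favor the first group
```

Read this way, a reported score of 82 percent would mean that in roughly four out of five such pairs the suspected excerpt scores higher, while a value near 50 percent means the test cannot tell the two groups apart.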
The study also indicates that non-public data has become more important in training OpenAI's models over time. For non-public excerpts, GPT-3.5 Turbo, which is based on a dataset from 2021, scored 54 percent, and GPT-4o mini, released in 2024, scored 56 percent. Both values are close to the 50 percent chance level, suggesting that these models were not trained on O'Reilly's non-public works.
The authors of the study highlight a broader, systematic issue regarding the use of copyrighted materials in training language models. They advocate for greater transparency and a formal licensing framework for the content used in such training processes. The authors warn that without appropriate compensation, the availability of content suitable for training AI models could diminish significantly. Recently, the New York Times also filed a lawsuit against OpenAI, alleging copyright violations related to the training of its AI systems.