The AI Update | October 25, 2023 – The Artificial Intelligence Blog

#HelloWorld. October swirls with AI headlines, but one senses a running-to-stand-still quality. Like the opening moves in a chess game, players continue to arrange regulatory and litigation pieces on the board, but the first true clash still awaits. Let’s stay smart together. (Subscribe to the mailing list to receive future issues.)

Two more federal copyright lawsuits. Authors and artists deploy more pieces in the burgeoning generative AI litigation wars. On October 17, yet another group of authors, including former Arkansas governor Mike Huckabee, filed yet another copyright class action, in the Southern District of New York, targeting large language models (LLMs) trained in part on the controversial text datasets “Books3” and “The Pile.”

One day later, music publishers Universal Music Group, Concord Music Group, and ABKCO struck in music capital Nashville. This copyright case focuses on Claude, a ChatGPT-like chatbot from leading AI model developer Anthropic. The publishers claim that Claude was trained on copyrighted song lyrics and can now be induced to output literal and derivative copies of them (example: “I got the eye of the tiger, a fighter, dancing through the fire,” from Katy Perry’s “Roar”). They further allege that this threatens existing licensing markets in which “Publishers license their copyrighted lyrics” to “music lyrics aggregators and websites,” thereby “ensuring that the creators of musical compositions are compensated and credited for such uses.” The generative AI battlefronts now span the country, from Northern California through Tennessee to New York.

Will EU AI Act negotiations continue into 2024? We’ve been tracking developments with the EU AI Act since May. Conventional wisdom was that the European Parliament, the Council of the European Union, and the European Commission would complete their “trilogues” and arrive at a final version of the AI Act by the end of 2023. Now that timeline seems under pressure. Reuters reports that negotiations could be pushed into “early next year” and “could then be further de-railed by the European parliament elections in June.” Apparently, EU policymakers continue to disagree over how extensively to regulate the most powerful LLMs and generative AI models, like GPT-4, and how to determine which AI systems will count as the “very capable foundation models” or “high impact models” to be subjected to these more stringent requirements. Nonetheless, Dragos Tudorache, the most prominent of the European Parliament members working on the Act, sounds an optimistic note, saying in an interview yesterday that the EU is within “touching distance” of adopting the Act with a “good 60-70% of the text…already agreed.”

Google promises to indemnify some users. While litigation brews and regulation remains on the horizon, the big technology players continue to publish private-ordering solutions of their own. Google is the latest, posting on October 12 a two-pronged approach to indemnifying Google customers against copyright infringement claims:

- A “training data” indemnity: Any customer of any Google generative AI model in any Google service is protected against a third-party claim alleging that the model used copyrighted material as part of its neural-network training.
- A “generated output” indemnity: A customer of any of seven Google products (for now) is further indemnified against any claim that the synthesized output of a Google generative AI model violates copyright—but only if the customer “didn’t try to intentionally create or use” the infringing output. The seven Google products covered: Duet AI in Workspace, Duet AI in Google Cloud, Vertex AI Search, Vertex AI Conversation, Vertex AI Text Embedding API / Multimodal Embeddings, Visual Captioning/Visual Q&A on Vertex AI, and Codey APIs.

What we’re reading: How much are large AI developers like OpenAI, Stability, Google, Anthropic, and Cohere disclosing to the public about their models? Researchers from Stanford, MIT, and Princeton have tried to answer that question systematically by creating and publishing a new “Foundation Model Transparency Index.” The index assesses whether 10 prominent developers provided public information across 100 indicators, ranging widely from technical concepts (model size and architecture), to data governance (data sources, data licenses), to policy and legal concerns (energy usage, carbon emissions, privacy and copyright risk mitigation). The average developer scored only 37%; that is, it made a “satisfactory” public disclosure on only 37 of the 100 indicators analyzed. Room for improvement or prudent protection of proprietary information? We’ll let you judge for yourselves.
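For readers who want to see the arithmetic behind a figure like 37%, here is a minimal sketch in Python. It assumes each indicator is simply marked satisfied or not and that every indicator carries equal weight; the function names and the two sample developers are hypothetical illustrations, not the researchers' code or their actual results.

```python
from statistics import mean


def transparency_score(indicators: list[bool]) -> float:
    """Percentage of indicators with a satisfactory public disclosure."""
    return 100 * sum(indicators) / len(indicators)


def average_score(results: dict[str, list[bool]]) -> float:
    """Mean of the per-developer percentage scores."""
    return mean(transparency_score(flags) for flags in results.values())


# Made-up data for illustration only: 100 indicators per developer.
example = {
    "developer_a": [True] * 50 + [False] * 50,  # 50 of 100 satisfied
    "developer_b": [True] * 24 + [False] * 76,  # 24 of 100 satisfied
}

print(average_score(example))  # 37.0
```

Under these assumptions, an average of 37% across developers can hide a wide spread between individual scores, as the two made-up developers show.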

What should we be following? Have suggestions for legal topics to cover in future editions? Please send them to AI-Update@duanemorris.com. We’d love to hear from you and continue the conversation.

Editor-in-Chief: Alex Goranin

Deputy Editors: Matt Mousley and Tyler Marandola
