The AI Update | July 27, 2023 – The Artificial Intelligence Blog

#HelloWorld. Copyright suits are as unrelenting as the summer heat, with no relief in the forecast. AI creators are working on voluntary commitments to watermark synthetic content. And meanwhile, is ChatGPT getting “stupider”? Lots to explore. Let’s stay smart together. (Subscribe to the mailing list to receive future issues.)

Big names portend big lawsuits. Since ChatGPT’s public launch in November 2022, plaintiffs have filed eight major cases in federal court (mostly in tech-centric Northern California), accusing the developers of large language models and image generators of copyright infringement, Digital Millennium Copyright Act violations, unfair competition, statutory and common law privacy violations, and other assorted civil torts. (Fancy a summary spreadsheet? Drop us a line.)

Here comes another steak for the grill: This month, on CBS’ “Face the Nation,” IAC chairman Barry Diller said that “leading publishers” were preparing copyright cases against generative AI companies, viewing litigation as a linchpin for arriving at a viable business model: “yes, we have to do it. It’s not antagonistic. It’s to stake a firm place in the ground to say that you cannot ingest our material without figuring out a business model for the future.” Semafor later reported that The New York Times, News Corp., and Axel Springer were all among this group of likely publishing company plaintiffs. The publishers are worried about the loss of website traffic that would come from generative AI answers replacing search engine results, and they are looking for “billions, not millions, from AI.”

Committing to watermarking synthetic content. Moving on from looming conflict to apparent conciliation, on July 21, executives from seven major tech players, including Google, OpenAI, and Anthropic, gathered at the White House to announce their voluntary undertaking of eight commitments to “manage the risks posed by AI.” As published, these commitments are understandably high-level, but it’s the fifth commitment that caught our eye—developing and deploying “robust provenance, watermarking, or both” to distinguish synthetic output generated by AI from human-created content.

In one sense, this commitment is not surprising or new. Back in May, on Capitol Hill, OpenAI CEO Sam Altman readily agreed that “people need to know if they are talking to an AI.” Interestingly, however, July’s restated commitment is more limited in scope. It covers only “audio or visual content”—not the text that is ChatGPT’s and other LLM tools’ primary user-facing output. We chalk this up to the technology for watermarking strings of text lagging behind the tools for watermarking digital images and videos, which predate the current generative AI frenzy. Pixel-level manipulation schemes go back to the DRM (digital rights management) days of the late 1990s and early 2000s.
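For readers curious what a pixel-level scheme looks like, here is a minimal sketch of least-significant-bit (LSB) embedding, one classic approach. It is an illustration only: the function names are hypothetical, the "image" is just a list of 8-bit grayscale values, and nothing here reflects the schemes that any of the companies mentioned above actually use.

```python
def text_to_bits(text):
    """Turn an ASCII string into a flat list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("ascii")
            for shift in range(7, -1, -1)]


def bits_to_text(bits):
    """Inverse of text_to_bits: regroup bits into bytes and decode them."""
    values = []
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        values.append(value)
    return bytes(values).decode("ascii")


def embed_bits(pixels, bits):
    """Overwrite the lowest bit of each leading pixel (0-255) with one watermark bit."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to carry the watermark")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the low bit, then set it
    return marked


def extract_bits(pixels, count):
    """Read the lowest bit back out of the first `count` pixels."""
    return [p & 1 for p in pixels[:count]]


# Hypothetical 64-pixel grayscale "image" and a two-letter mark.
image = [(i * 37) % 256 for i in range(64)]
mark = text_to_bits("AI")
marked = embed_bits(image, mark)
recovered = bits_to_text(extract_bits(marked, len(mark)))
max_change = max(abs(a - b) for a, b in zip(image, marked))
```

Each pixel value changes by at most 1, which is why such a mark is invisible to the eye. A mark this simple is also fragile, since re-compressing or resizing the image will typically erase it, and that is one way to read the word “robust” in the commitment.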

In the meantime, it’s fascinating to follow the development of new digital watermarking tools. One such effort, led by Adobe and involving members like Sony, Shutterstock, and the BBC, is the Coalition for Content Provenance and Authenticity (C2PA). The C2PA has published technical specifications “for certifying the source and history (or provenance) of media content.” At a high level, these standards would have each piece of synthetic content cryptographically signed with credentials issued by a “certification authority,” in a manner analogous to the way “trusted” websites today are issued certificates. Companies like OpenAI and Google have not yet detailed the specific digital watermarking mechanisms they’ll employ, but we should know more in the coming months.
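The signing idea can be sketched in a few lines. To be clear, this is not the C2PA format or any real API: actual provenance systems rely on public-key signatures and certificate chains, while this sketch substitutes a shared-secret HMAC from the Python standard library so that it runs without extra dependencies. The names `make_manifest`, `verify_manifest`, and `ISSUER_KEY`, along with the manifest fields, are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret standing in for a signer's private key.
# Assumption: a real system would use an asymmetric key pair plus a certificate.
ISSUER_KEY = b"demo-key-not-for-real-use"


def _sign(claim: dict) -> str:
    """Compute a keyed hash over a canonical JSON encoding of the claim."""
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()


def make_manifest(content: bytes, generator: str) -> dict:
    """Bind a provenance claim to the exact bytes of the content."""
    claim = {
        "generator": generator,
        "sha256": hashlib.sha256(content).hexdigest(),
    }
    return {"claim": claim, "signature": _sign(claim)}


def verify_manifest(content: bytes, manifest: dict) -> bool:
    """Check that the claim is untampered and still matches the content."""
    signature_ok = hmac.compare_digest(_sign(manifest["claim"]),
                                       manifest["signature"])
    content_ok = (manifest["claim"]["sha256"]
                  == hashlib.sha256(content).hexdigest())
    return signature_ok and content_ok


image_bytes = b"synthetic image bytes"
manifest = make_manifest(image_bytes, "example-image-model")

untouched_ok = verify_manifest(image_bytes, manifest)        # original file
edited_ok = verify_manifest(image_bytes + b"!", manifest)    # file was altered

# Someone rewrites the claim without access to the key.
forged = {"claim": dict(manifest["claim"], generator="a human"),
          "signature": manifest["signature"]}
forged_ok = verify_manifest(image_bytes, forged)
```

In this sketch, verification succeeds only for the untouched file with its original claim. Changing either the content or the claim breaks the check, which is the property that lets a signed credential vouch for where a piece of media came from.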

What we’re reading: Is something up with ChatGPT? A study out of Stanford and UC Berkeley found that recent versions of GPT-3.5 and GPT-4 showed decreases in output quality when solving math problems, generating code, performing visual reasoning, and answering sensitive questions. Results were mixed, but certain decreases in quality were drastic: for example, GPT-4 went from 97.6% to 2.4% accuracy in identifying prime numbers. Without much visibility into how these models were updated over the three-month time span of the study, the reason for the apparent degradation is unclear. But it does seem to confirm users’ anecdotal experience.

What should we be following? Have suggestions for legal topics to cover in future editions? Please send them to AI-Update@duanemorris.com. We’d love to hear from you and continue the conversation.

Editor-in-Chief: Alex Goranin

Deputy Editors: Matt Mousley and Tyler Marandola
