As Google positions its upgraded generative AI as teacher, assistant, and recommendation guru, the company is also trying to turn its models into a bad actor’s worst enemy.

“It’s clear that AI is already helping people,” said James Manyika, Google’s senior vice president of research, technology, and society, to the crowd at the company’s Google I/O 2024 conference. “Yet, as with any emerging technology, there are still risks, and new questions will arise as AI advances and its uses evolve.”

Manyika then announced the company’s latest evolution of red teaming, an industry standard testing process to find vulnerabilities in generative AI. Google’s new “AI-assisted red teaming” trains multiple AI agents to compete with each other to find potential threats. These trained models can then more accurately pinpoint what Google calls “adversarial prompting” and limit problematic outputs.

The process is the company’s new plan for building a more responsible, humanlike AI, but its also being sold as a way to address growing concerns about cyber security and misinformation.

The new safety measures incorporate feedback from a team of experts across tech, academia, and civil society, Google explained, as well as its seven principles of AI development: Being socially beneficial, avoiding bias, building and testing for safety, human accountability, privacy design, upholding scientific excellence, and public accessibility. Through these new testing efforts, and industry-wide commitments, Google’s attempting to putting product where its words are.


Leave a Reply