Making AI safe and fair

by Stephanie Lee


History suggests that progress in technological safety is often preceded by catastrophe. The Chernobyl nuclear power plant accident, in 1986, led to sweeping international changes in nuclear safety standards. The Challenger space shuttle, that same year, fatally broke apart 73 seconds into its flight, spawning an overhaul of NASA’s decision-making processes and safety protocols. A disaster on this scale has not yet been caused by artificial intelligence. But as a technology more powerful and interconnected than systems past, it poses risks unlike any other.

“There are many capabilities in AI that we don’t understand well yet,” said Yolanda Gil, senior director for Strategic Initiatives in Artificial Intelligence and Data Science. “We don’t know how to measure them. We don’t know how to use them properly.” AI systems may cause unintended harms, have unfair biases, or behave in unpredictable ways. Safety challenges grow as AI becomes more autonomous, complex, and embedded within critical applications. Such scenarios raise the question: what can we do to ensure that our AI systems cause no harm?

Building AI safety from the ground up

This concern lies at the core of an emerging research discipline: AI safety. Its scope is yet to be defined. Some experts believe AI safety should prioritize direct threats to human life or critical resources, such as infrastructure or the environment. Others interpret AI safety more broadly, taking into account the ways AI could threaten social stability, the political process, mental health, and more. Around the world, scientists, policymakers, and industry leaders are beginning to discuss how to improve safety measures to match capabilities.

“The point is,” said Adam Russell, director of ISI’s Artificial Intelligence division, “we have to be proactive now.” He recommends an “all hands on deck” approach in which AI safety is built on partnerships between academia, industry, and government. That collaboration must also be a global endeavor, even in the face of economic competition and differing national security interests.

In November 2023, one year after the explosive launch of ChatGPT, Russell participated in the inaugural AI Safety Summit in the United Kingdom. Attendees included representatives from 27 countries, among them U.S. Vice President Kamala Harris and U.K. Prime Minister Rishi Sunak, as well as heads of influential tech companies, including Elon Musk and OpenAI CEO Sam Altman. This high-profile gathering began to scratch the surface of an epochal question: how AI safety and regulation should take shape.

Two main camps of thinking emerged, observed Russell, who is an anthropologist by training. One believed that we needed to slow down AI engineering to avoid existential risks. The other believed that we needed to speed up the development of the science of AI safety. No consensus surfaced. But for Russell, the takeaway was science. “Rather than slowing the engineering down, which is probably impossible, we need to accelerate AI safety as a science,” he said. This means proactively developing AI safety into a mature science now, rather than operating on typical scientific timelines or waiting to react to harmful situations. Though the road is long, planning for AI safety is beginning.

In January 2024, the U.S. established the U.S. Artificial Intelligence Safety Institute (USAISI), headquartered at the National Institute of Standards and Technology (NIST). As a leader in AI research for decades, USC has signed on as a founding member of the consortium, providing technical expertise on creating frameworks for safe and trustworthy AI. “When we talk about AI safety, it’s a very complex endeavor,” said Gil. “There’s many aspects, many technologies, and many applications, in many sectors. We have the challenge of a generation ahead of us.”

Creating responsible systems across sectors

At ISI, scientists across a range of disciplines and expertise are chipping away at this challenge. Some are working on new paradigms, such as Alexander Titus, an AI and life science specialist. His proposed framework, called “violet teaming,” argues for the inclusion of diverse stakeholders—for instance, social scientists, physicians, patients—to improve ethical and social oversight of the AI development process.

Others are developing applications, such as Mohamed Hussein and Wael Abd-Almageed, both computer vision experts. The pair founded a visual intelligence and multimedia forensics lab at ISI that develops AI safety technologies, from improving security to unmasking fake news and deepfakes. Recently, the lab also published a new method to detect malicious Trojan attacks in pre-trained open-source AI models, which provide the starting point for many applications.

Yolanda Gil’s focus is inspired by the rigorous discipline of safety engineering. Her research team works with the USC Aviation Safety and Security Program to use AI to improve aviation safety, while also exploring how the methodologies of safety engineering could be applied to AI. Although these projects vary widely in topic, they are united by a common thread: steering AI toward social good.

Keeping AI fair to prevent harm

Social good is also what inspires the dozens of ISI researchers working on AI bias and fairness, a topic some consider to be part of the larger conversation on AI safety. The goal of this discipline is to measure and mitigate implicit biases within algorithms and data, thereby making systems more equitable and just.

While bias may not be a direct threat to life, the scale of harm grows as AI systems are increasingly embedded into the infrastructure we rely on, from healthcare and finance to our cars. “AI has become such a foundation for so many different companies and things we rely on in our life,” said Keith Burghardt, an ISI computer scientist who teaches a Viterbi course entitled Fairness in Artificial Intelligence. Already, biased AI-powered systems have demonstrated unfair behavior, such as incorrectly predicting a higher likelihood of criminality among Black individuals, facilitating the biometric tracking of ethnic minorities by governments, and contributing to delayed home loan approvals for minorities.

Last year, in contrast, multiple ISI researchers demonstrated progress in making AI systems fairer. Katy Felkner, a Ph.D. student in Computer Science at ISI, and her advisor and co-author Jonathan May, Principal Scientist and Research Associate Professor at ISI, created a benchmark dataset called WinoQueer, aimed at measuring biases against the queer and trans community within existing large language models (LLMs).

Abel Salinas, a Ph.D. student working with advisor and co-author Fred Morstatter, also measured implicit bias in LLMs, this time through the lens of job seeking, gender, and nationality. His study found that models including ChatGPT discriminated against various demographic identities, for instance by recommending low-paying jobs to Mexican workers and secretarial roles to women. “As we are deploying these systems at scale, first we need to make sure that they are safe,” Salinas said.
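
The basic shape of such an audit is simple: hold the prompt fixed, vary only a demographic attribute, and compare what the model recommends. The sketch below is a hypothetical illustration of that idea using the public OpenAI chat API; it is not the code or prompts from Salinas’s study, and the model name and prompt wording are placeholders.

```python
# Hypothetical audit sketch: vary only the demographic attribute in an
# otherwise identical prompt and compare the job recommendations returned.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TEMPLATE = ("Recommend three suitable jobs for {identity} who is looking for work. "
            "List only the job titles.")
identities = ["a Mexican worker", "an American worker", "a man", "a woman"]

for identity in identities:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice, not the one used in the study
        messages=[{"role": "user", "content": TEMPLATE.format(identity=identity)}],
    )
    print(f"{identity}: {response.choices[0].message.content}")
```

Systematic differences in the titles returned across otherwise identical prompts are the signal such audits look for.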

A real-time tech revolution

Thinking further into the future, Emilio Ferrara, an ISI Research Team Leader and USC professor, spent 2023 exploring AI risks through a conceptual lens, for example by examining the Butterfly Effect in AI systems. Borrowed from chaos theory, the Butterfly Effect expresses the notion that even small changes can lead to significant and often unpredictable consequences within complex systems. In the context of AI, this could mean systems that behave in ways we don’t expect, amplifying biases inherent in data or algorithms.

“The idea is best portrayed by the popular saying that the flap of a butterfly’s wings in Brazil could set off a chain of events leading to a tornado in Texas,” Ferrara wrote.
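
For readers unfamiliar with the chaos-theory origin of the metaphor, a tiny numerical experiment makes the point. The sketch below is a generic illustration, not drawn from Ferrara’s work: two starting values of the logistic map, a textbook chaotic system, that differ by one part in a million end up nowhere near each other after a few dozen iterations.

```python
# Two nearly identical starting points diverge under the logistic map
# (illustrative only; r = 3.9 puts the map in its chaotic regime).
def logistic_map(x, r=3.9):
    return r * x * (1 - x)

a, b = 0.500000, 0.500001  # initial values differ by one part in a million
for step in range(1, 41):
    a, b = logistic_map(a), logistic_map(b)
    if step % 10 == 0:
        print(f"step {step:2d}: a={a:.4f}  b={b:.4f}  gap={abs(a - b):.4f}")
```

In an AI context, Ferrara’s point is that comparably small changes, to training data, a prompt, or a deployment setting, can compound inside a large system in ways that are hard to anticipate.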

Ferrara’s conceptual work is a response to living through a technological revolution in real time. On the other side of that revolution, humans will not be the only actors at the center of society; machines will be there too.

Ferrara’s goal is to pave the way for others to start thinking about the problems that may emerge in this new world. More importantly, he hopes that raising awareness of potential risks will allow action to be taken before it’s too late. Yet as overwhelming as the pace of change is, Ferrara believes time remains to put AI on the right course. “We’re still at the inception of this technology,” he said. “Researchers, corporations, and governments have plenty of agency to anticipate risks, change the way we do things, and design systems that are safe.”

Published on April 22nd, 2024

Last updated on April 16th, 2024
