Bias in Artificial Intelligence
By Susan Carr | Senior Writer, ImproveDx
...Inequity and bias are not to be found in a single place, like a bug that can be located and fixed. These issues are systemic.1(p9)
Recent news stories deliver apparently contradictory messages about artificial intelligence (AI): future prospects leading to lucrative business deals on the one hand and disappointing performance prompting protests and lawsuits on the other.2-4 Expectations remain high that AI will continue to transform many aspects of daily life, including health care — thus the business deals, such as Microsoft’s commitment to acquire Nuance Communications for $19.7 billion.5 At the same time, AI algorithms have produced results biased against women and people of color, prompting disillusionment and reassessment.
Racism and Gender Inequities
Troubling examples demonstrate how AI can reflect and even exacerbate existing racism and gender inequities. In 2015, Black software developer Jacky Alciné was shocked to find that Google Photos had automatically sorted selfies he’d taken with a friend, who is also Black, into a default folder labeled “gorillas.”6,7 And in 2017, Joy Buolamwini, a Ghanaian-American computer scientist at the MIT Media Lab, could only get a facial recognition program to see her by putting on a white mask.8,9 Other examples abound.
The source of algorithmic bias is more implicit than explicit. It is inherent in the environment within which the technology was developed. Both of the above examples involve facial recognition, but the underlying problem of faulty, biased, and offensive results occurs across many algorithms and applications of AI.
Machine learning, the process through which algorithms learn to identify patterns in data, relies on vast digital archives of tagged images, records, and other data for continual training and refinement. Most of the online information currently available in the quantities AI craves is biased toward white males, with people of other races, sexes, and gender identities greatly underrepresented.
The bias toward white males is persistent and systemic, inherent in the training data and carried forward in algorithms that can accentuate and intensify the bias. Buolamwini warns, “Algorithms, like viruses, can spread bias on a massive scale, at a rapid pace.”10(np)
Algorithms exert this power because they are embedded in systems everywhere, often generating influential results without revealing how they arrive at their conclusions.9 In his book on artificial intelligence in medicine,11 Eric Topol, MD, casts modern-day algorithms as agents, operating in the world according to human-written instructions, with growing independence:
Algorithms have thus become agents…. Algorithms now do things.… They drive cars. They manufacture goods. They decide whether a client is creditworthy.12
They also help diagnose diseases. Some industries look forward to the day when algorithms act autonomously — in self-driving automobiles, for example. Despite futuristic projections about physicians being replaced by algorithms, in healthcare AI remains an assistive technology, with growing influence but no real prospect for independent decision-making in the foreseeable future.11,13
Another example of bias in healthcare demonstrates how past inequities can play forward in these models. While analyzing hospital data to measure the impact of a managed care program, researchers were surprised to find that among patients with comorbidities, Black patients were on average assigned lower risk scores than white patients who appeared to have similar medical conditions.14 And those lower scores meant the Black patients would receive less personalized care.
Digging deeper, researchers discovered that the algorithm assigned risk scores based on the patient’s annual cost of care. That is to say, the developers used cost of care as a proxy for complexity of medical condition. But Black patients often receive less care for a variety of reasons related to systemic racism: poor access to care, inadequate health insurance, lack of respect, distrust of the health care system, and other barriers to receiving care. It turned out that Black patients whose health care costs were similar to those of white patients had more comorbid conditions and were sicker.
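The proxy problem described above can be illustrated with a toy simulation. This is a minimal sketch, not the algorithm from the study: the `Patient` class, the group labels "A" and "B", the 30% spending gap, and the cost per condition are all hypothetical values chosen only to show how ranking by cost can under-select a group that spends less at the same level of illness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str          # hypothetical group label, "A" or "B"
    conditions: int     # number of chronic conditions (stand-in for true need)
    annual_cost: float  # observed annual spending


# Assumption for illustration only: group B incurs 30% less cost than
# group A at the same level of illness (e.g., because of access barriers).
ACCESS_FACTOR = {"A": 1.0, "B": 0.7}
COST_PER_CONDITION = 1000.0


def make_cohort():
    """Build two groups with identical distributions of true illness."""
    return [
        Patient(g, c, COST_PER_CONDITION * c * ACCESS_FACTOR[g])
        for g in ("A", "B")
        for c in range(10)
    ]


def select_top(patients, key, k):
    """Pick the k patients with the highest score under the given key."""
    return sorted(patients, key=key, reverse=True)[:k]


def share_of_group(selected, group):
    """Fraction of the selected patients who belong to the given group."""
    return sum(p.group == group for p in selected) / len(selected)


cohort = make_cohort()
by_cost = select_top(cohort, lambda p: p.annual_cost, 6)  # cost as proxy
by_need = select_top(cohort, lambda p: p.conditions, 6)   # true need

print("Group B share when ranking by cost:", share_of_group(by_cost, "B"))
print("Group B share when ranking by need:", share_of_group(by_need, "B"))
```

In this toy cohort both groups are equally sick by construction, yet ranking by cost selects fewer group B patients than ranking by need, and the group B patients who are selected by cost have more conditions on average than the selected group A patients.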
Hospitals and insurers in the U.S. use the algorithm involved in this study to manage care for approximately 200 million people annually. After this story became known through an article in Science, U.S. Senators Cory Booker (D-NJ) and Ron Wyden (D-OR) urged the Centers for Medicare & Medicaid Services, the Federal Trade Commission, and other leading health care payers and regulators to address the problem of bias in health care AI.15 The algorithm’s developer is now working with the research team to correct the problem.
Interviewed when the study was published in 2019, lead author Ziad Obermeyer, MD, reflected on the challenge of undoing systemic bias in AI:
Those solutions are easy in a software-engineering sense: you just rerun the algorithm with another variable.… But the hard part is: what is that other variable? How do you work around the bias and injustice that is inherent in that society?16(p608)
If the bias embedded in these systems and the resulting harm were simply caused by glitches or homogeneity in the training data or through a lack of awareness among computer scientists, it would be easier to fix. Fundamentally, the problem reflects long-standing prejudice and discrepancies of power throughout society, including the corporations that develop and benefit from AI services and products, the educational institutions that train scientists, the research community, public and private investors, and so on.1 As with other issues that stem from systemic prejudice related to race, ethnicity, sex, and gender, awareness is the first step in a long journey toward inclusion, equity, and fairness.
AI Bias and Diagnosis
In medicine, specialties that rely on processing visual information and pattern recognition skills — dermatology, radiology, ophthalmology, and pathology — are among early adopters of AI systems designed to assist with diagnosis.
Because skin color affects the presentation of conditions and diseases, dermatology’s experience with AI includes working with racial differences and the potential for bias.17,18
VisualDx, a provider of diagnostic clinical decision support to advance pattern recognition in medicine and dermatology, has been aggregating images of disease presentations in people of color for over 20 years. The images are classified by diagnosis and by Fitzpatrick Skin Type, a standard phototype categorization often used to define pigmentation. VisualDx’s CEO Art Papier, MD, explains, “Erythema (skin redness) and purpura (a sign of blood leakage out of blood vessels), for example, are physical exam clues relatively easy to see on white skin, but they offer a different challenge on darker skin tones.” He adds, “Machine learning is completely dependent on training algorithms on excellent data. In dermatology, as in all the visual specialties, the effectiveness of machine learning requires highly reliable ‘ground truth’ for training” (written communication, April 2021).
In radiology, recent research shows the effect that biased training images can have. Researchers in Argentina evaluated the impact of gender imbalance in imaging archives used to train AI-based systems for computer-aided diagnosis (CAD).19 They found reduced diagnostic performance for female patients when the CAD system had been trained on images from male patients. Running a number of different scenarios for gender balance—0% women/100% men, 25%/75%, 50%/50%—they consistently found that imbalance in the training data negatively impacted the accuracy of results. The best performance for both male and female patients came from a system trained with data balanced 50/50 for gender.
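The experimental design described above, training on archives with controlled gender mixes, can be sketched as a sampling step. This is an illustrative assumption about how such training sets might be assembled, not the researchers' code: the function name `sample_with_ratio`, the record format, and the synthetic `archive` are hypothetical.

```python
import random
from collections import Counter


def sample_with_ratio(records, female_fraction, n, seed=0):
    """Draw n records with a fixed share of female patients.

    records: list of dicts with a "sex" key of "F" or "M" (hypothetical format).
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    females = [r for r in records if r["sex"] == "F"]
    males = [r for r in records if r["sex"] == "M"]
    n_female = round(n * female_fraction)
    n_male = n - n_female
    if n_female > len(females) or n_male > len(males):
        raise ValueError("not enough records to build the requested mix")
    subset = rng.sample(females, n_female) + rng.sample(males, n_male)
    rng.shuffle(subset)  # avoid ordering the training set by sex
    return subset


# Synthetic stand-in for an imaging archive: 200 female, 200 male records.
archive = [{"id": i, "sex": "F" if i % 2 == 0 else "M"} for i in range(400)]

# The three mixes named in the text: 0%, 25%, and 50% women.
mixes = {
    frac: Counter(r["sex"] for r in sample_with_ratio(archive, frac, 100))
    for frac in (0.0, 0.25, 0.5)
}
for frac, counts in mixes.items():
    print(f"{frac:.0%} women requested -> F={counts['F']}, M={counts['M']}")
```

Each subset would then be used to train a separate model, with performance reported separately for male and female test patients so that any gap between the two is visible.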
In 2019, a consortium of organizations, including the American College of Radiology, RSNA, American Association of Physicists in Medicine, and imaging societies in Canada and Europe, developed a joint statement to address bias and other ethical issues in the growing use of AI in radiology. The statement proposes a set of 8 questions that those responsible for AI systems should be able to answer, including:
- What kinds of bias may exist in the data used to train and test algorithms?
- What have we done to evaluate how data are biased, and how it may affect our model?
- What are the possible risks that might arise from biases in the data? 20(p438)
These and the other questions proposed are crucial for the ethical use of these systems, but the ability to supply answers and guidance for future practice is still a work in progress.
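One way a team might begin to answer questions like these is to report, for each patient subgroup, how much of the data it accounts for and how well the model performs on it. The following is a minimal sketch under stated assumptions: the function `audit_by_group`, the group labels, and the example rows are hypothetical and are not part of the joint statement.

```python
from collections import defaultdict


def audit_by_group(rows):
    """Summarize representation and accuracy per subgroup.

    rows: iterable of (group, true_label, predicted_label) tuples.
    Returns {group: {"share": ..., "accuracy": ...}}.
    """
    counts = defaultdict(int)
    correct = defaultdict(int)
    for group, truth, predicted in rows:
        counts[group] += 1
        correct[group] += int(truth == predicted)
    total = sum(counts.values())
    return {
        g: {"share": counts[g] / total, "accuracy": correct[g] / counts[g]}
        for g in counts
    }


# Made-up evaluation results: group "A" dominates the data set.
rows = (
    [("A", 1, 1)] * 7 + [("A", 1, 0)] * 1    # 8 cases, 7 correct
    + [("B", 1, 1)] * 1 + [("B", 1, 0)] * 1  # 2 cases, 1 correct
)
report = audit_by_group(rows)
for group, stats in report.items():
    print(group, f"share={stats['share']:.2f}", f"accuracy={stats['accuracy']:.3f}")
```

A single overall accuracy figure would hide the gap between the two groups in this example; breaking results out by subgroup makes both underrepresentation and uneven performance visible.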
AI supplies systems that are sophisticated, complex, and often inscrutable even to the computer scientists who develop them. Describing Jacky Alciné’s experience with Google Photos, which Google was able to fix only by removing “gorilla” as a category, computer scientist Erik Larson comments:
You have these terrible, insensitive results from the system, and the system just doesn’t know what it’s doing. We don’t know what it was focusing on or how it made that decision. These systems are notoriously opaque…you can’t deconstruct them after the fact.21(31:31 mins)
Beyond the outrage prompted by results that are obviously insulting or subtly discriminatory, AI results that are erroneous and unfair undermine the public’s trust. Given general agreement that AI will ultimately improve the quality and safety of health care,22 maintaining the public’s trust is another reason to address problems with bias.
There are many efforts underway to deal with the problem. Some are calling for AI ethics to be included in medical school education.23 Others suggest that trying to fix bias within AI technology is a “seductive diversion”24 from dealing with questions of racism, corporate power, and data ownership. And there is a movement to create “explainable” artificial intelligence to design transparency and accountability into these systems.
In recent remarks, Micky Tripathi, Ph.D., President Biden’s National Coordinator for Health IT, included AI bias among the agency’s top priorities. Acknowledging a history of policies that have been advanced without understanding the possible effects across all populations, he expressed a desire to take health equity into account upfront. Forecasting that this will continue to be an issue of growing concern, Tripathi said,
I think everyone's familiar with the issues with algorithmic bias. And that is a bigger and bigger issue the more that algorithms are embedded in…all the technologies that we use.25
The pandemic has focused attention on systemic racial injustice and health care disparities. And it has accelerated the adoption of remote technologies often enhanced with artificial intelligence. Although the challenges are formidable, there is a growing movement to address the problem of bias in AI and to harness its potential to improve diagnosis and other aspects of health care without causing further societal division and harm.
Thank you to our reviewers: Jen J. Gong, Ph.D.; Art Papier, MD; Lorri Zipperer, MA