By Marcelina Horrillo Husillos
Spotting bias in AI helps us spot our own biases and remove them from our societal contexts; building more inclusive GenAI tools should be the goal of an inclusive society.
Generative AI (GenAI) relies on transformers that are extremely complex and pre-trained on massive datasets, in a process involving billions of parameters. Furthermore, GenAI involves the participation of users, who are invited to input prompts to conversational agents that then elaborate on them. As a consequence, monitoring bias involves not only ensuring that the datasets correctly reflect the population that will be using the AI agent, but also taking into account the way users will interact with it. A tool like ChatGPT serves about 100 million users a week, for a total of 1.6 billion users in 2024. Figures on this scale increase complexity and make fairness and bias even harder to assess.
In general, bias refers to the phenomenon of computer systems that “systematically and unfairly discriminate against certain individuals or groups of individuals in favor of others”. In the context of LLMs, GenAI is considered biased if it exhibits systematic and unfair discrimination against certain population groups, particularly underrepresented ones. AI bias, also called machine learning bias or algorithm bias, refers to biased results caused by human biases that skew the original training data or the AI algorithm, leading to distorted outputs and potentially harmful outcomes. Extensive research has shown the potential implications of bias in GenAI-produced content, which perpetuates societal biases based primarily on language, gender, ethnicity and stereotypes.
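To make “systematic and unfair discrimination” concrete, the sketch below shows one simple way such a disparity could be probed: give a model prompts that are identical except for the group they mention, then compare how often each group's completions mention high-prestige roles. Everything here, including the generate function, the group labels, the word lists and the canned completions, is an illustrative assumption rather than a description of any particular model or study.

```python
# Minimal sketch of an output-disparity probe (illustrative only).
# "generate" is a placeholder: in a real audit it would call the model
# under test; here it returns canned text so the example runs end to end.

HIGH_PRESTIGE = {"doctor", "engineer", "professor"}  # illustrative word list

def generate(prompt: str) -> list[str]:
    # Placeholder completions; replace with real model calls.
    if "Group A" in prompt:
        return ["a doctor", "an engineer", "a cook", "a professor"]
    return ["a cleaner", "a cook", "a guard", "a doctor"]

def high_prestige_rate(completions: list[str]) -> float:
    """Share of completions mentioning a high-prestige role."""
    hits = sum(any(w in c.lower() for w in HIGH_PRESTIGE) for c in completions)
    return hits / len(completions)

template = "The {group} applicant would be best suited to work as"
for group in ("Group A", "Group B"):  # placeholder group labels
    rate = high_prestige_rate(generate(template.format(group=group)))
    print(group, round(rate, 2))

# A consistent gap between otherwise identical prompts, repeated across
# many templates, is the kind of systematic disparity described above.
```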
English-Speaking Bias
Because the internet is still predominantly English (59 per cent of all websites were in English as of January 2023), LLMs are primarily trained on English text. In addition, the vast majority of the English text online comes from users based in the United States, home to some 300 million English speakers. Learning about the world from English texts written by U.S.-based web users, LLMs speak Standard American English and adopt a narrow Western, North American or even U.S.-centric lens.
University of Chicago Assistant Professor Sharese King and scholars from Stanford University and the Allen Institute for AI found that AI models consistently assigned speakers of African American English to lower-prestige jobs and, in hypothetical criminal cases, issued more convictions and more death sentences.
This is the latest result from a Cornell University preprint study into the “covert racism” of large language models (LLMs), deep learning systems used to summarise and generate human-sounding text. In other words, the dialect you speak shapes what artificial intelligence (AI) will say about your character, your employability and whether you are a criminal.
Although AI and deep learning offer us untapped possibilities, they can also lead to a contemporary dystopia in which technology is used to erase individuals’ differences, identity markers and cultures, and in which dehumanisation overshadows priorities such as the common good and diversity, as spelt out in the UNESCO Universal Declaration on Cultural Diversity.
Experts advocate the development of non-English Natural Language Processing (NLP) applications to help reduce language bias in generative AI and “preserve cultural heritage”. The latter is one of 30 suggested actions put forward in the World Economic Forum’s Presidio Recommendations on Responsible Generative AI: “Public and private sector should invest in creating curated datasets and developing language models for underrepresented languages, leveraging the expertise of local communities and researchers and making them available.”
Gender Bias
A UNESCO study revealed tendencies in large language models (LLMs) to produce gender bias, as well as homophobia and racial stereotyping. Women were described as working in domestic roles far more often than men (four times as often by one model) and were frequently associated with words like “home”, “family” and “children”, while male names were linked to “business”, “executive”, “salary” and “career”.
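The associations the UNESCO study describes can be quantified with embedding association tests in the spirit of WEAT, which compare how close gendered words sit to “home and family” words versus “career and business” words in a model’s vector space. The tiny three-dimensional vectors below are invented purely so the example runs; a real test would use the embeddings of the model being audited.

```python
import numpy as np

# Toy 3-d embeddings, invented solely for illustration; a real audit
# would use the model's own word vectors.
emb = {
    "she":    np.array([0.9, 0.1, 0.0]),
    "he":     np.array([0.1, 0.9, 0.0]),
    "home":   np.array([0.8, 0.2, 0.1]),
    "family": np.array([0.7, 0.3, 0.0]),
    "career": np.array([0.2, 0.8, 0.1]),
    "salary": np.array([0.1, 0.7, 0.2]),
}

def cos(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(word, set_a, set_b):
    """Mean similarity to set_a minus mean similarity to set_b."""
    return (np.mean([cos(emb[word], emb[a]) for a in set_a])
            - np.mean([cos(emb[word], emb[b]) for b in set_b]))

home_words, career_words = ["home", "family"], ["career", "salary"]
for w in ("she", "he"):
    print(w, round(association(w, home_words, career_words), 3))

# If "she" scores strongly positive and "he" strongly negative, the
# embedding space encodes the home-versus-career association the
# UNESCO study reports in model outputs.
```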
The world undoubtedly has a gender equality problem, and artificial intelligence (AI) mirrors the gender bias in our society. A study by the Berkeley Haas Center for Equity, Gender and Leadership analysed 133 AI systems across different industries and found that about 44 per cent of them showed gender bias, and 25 per cent exhibited both gender and racial bias.
With regard to employment, the most recent data show that women represent only 20 per cent of employees in technical roles at major machine learning companies, 12 per cent of AI researchers and 6 per cent of professional software developers. Gender disparity among authors who publish in the AI field is also evident: studies have found that only 18 per cent of authors at leading AI conferences are women and more than 80 per cent of AI professors are men. If systems are not developed by diverse teams, they will be less likely to cater to the needs of diverse users or even to protect their human rights.
According to the Global Gender Gap Report of 2023, women make up only 30 per cent of the people currently working in AI. Removing gender bias in AI starts with prioritizing gender equality as a goal as AI systems are conceptualized and built. This includes assessing data for misrepresentation, providing data that is representative of diverse gender and racial experiences, and reshaping the teams developing AI to make them more diverse and inclusive.
Racial Bias
Just like humans, artificial intelligence (AI) is capable of saying it isn’t racist while acting as if it were. According to a study published in Nature, LLMs associate speakers of African American English with less prestigious jobs and, in imagined courtroom scenarios, are more likely to convict these speakers of crimes or sentence them to death.
Another clear example of racial bias is predictive policing. Predictive policing tools make assessments about who will commit future crimes, and where future crime may occur, based on location and personal data. As UN Special Rapporteur Ashwini K.P. states: “Predictive policing can exacerbate the historical over policing of communities along racial and ethnic lines. Because law enforcement officials have historically focused their attention on such neighbourhoods, members of communities in those neighbourhoods are overrepresented in police records. This, in turn, has an impact on where algorithms predict that future crime will occur, leading to increased police deployment in the areas in question.”
In an analysis of more than 5,000 AI-generated images, Bloomberg found that images from Stable Diffusion associated with higher-paying job titles featured people with lighter skin tones, and that results for most professional roles were male-dominated. Text-to-image AI such as Stable Diffusion generates images in response to written prompts. Like many AI models, what it creates may seem plausible on its face but is actually a distortion of reality.
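Audits of this kind typically reduce to simple tallies once each generated image has been labelled. The sketch below assumes a hypothetical annotations.csv with one row per image and columns for the prompted job title and the perceived skin-tone and gender labels; the file name and column names are assumptions for illustration, not Bloomberg’s actual pipeline.

```python
import csv
from collections import Counter, defaultdict

# Hypothetical annotation file: one row per generated image, with the
# occupation used in the prompt and the labels annotators assigned.
# Columns assumed here: job, skin_tone, gender.
def summarize(path="annotations.csv"):
    by_job = defaultdict(Counter)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            by_job[row["job"]][(row["skin_tone"], row["gender"])] += 1
    for job, counts in by_job.items():
        total = sum(counts.values())
        print(job)
        for (tone, gender), n in counts.most_common():
            print(f"  {tone:>6} / {gender:<6} {n / total:.0%}")

# summarize("annotations.csv")  # hypothetical file of per-image labels

# Comparing these shares with real-world workforce statistics is what
# reveals whether a generator amplifies, rather than merely mirrors,
# existing disparities.
```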
Some experts in generative AI predict that as much as 90% of content on the internet could be artificially generated within a few years. As these tools proliferate, the biases they reflect are not just perpetuating stereotypes that threaten to stall progress toward greater equality in representation; they could also result in unfair treatment. Take policing, for example. Using biased text-to-image AI to create sketches of suspected offenders could lead to wrongful convictions.
“Artificial intelligence technology should be grounded in international human rights law standards,” Ashwini K.P. said. “The most comprehensive prohibition of racial discrimination can be found in the International Convention on the Elimination of All Forms of Racial Discrimination.” In her report, she explores how artificial intelligence is being allowed to perpetuate racial discrimination.
Stereotypical Bias
Stereotypes, generalizations about a group or individual based on assumptions made by an observer, can create harmful social biases. The marketing and advertising industries, for instance, have made strides in recent years in how they represent different groups: they now show greater diversity in terms of race and gender and better represent people with disabilities. Still, there is much progress to be made.
“We are essentially projecting a single worldview out into the world, instead of representing diverse kinds of cultures or visual identities,” said Sasha Luccioni, a research scientist at AI startup Hugging Face who co-authored a study of bias in text-to-image generative AI models.
Researchers admit that stereotypes reproduced by GenAI could cause real harm. Image generators are being used for diverse applications, including in the advertising and creative industries, and even in tools designed to make forensic sketches of crime suspects.
In a recent podcast, OpenAI CEO Sam Altman said, “The bias I’m most nervous about is the bias of the human feedback raters.” When asked, “Is there something to be said about the employees of a company affecting the bias of the system?” Altman responded by saying, “One hundred percent.”
In 2020, members of the OpenAI team published an academic paper stating that their language model, with 175 billion parameters behind its functionality, was the largest ever created. A language model this large should mean ChatGPT can talk about almost anything. Unfortunately, however, a model of this size needs inputs from people across the globe and inherently reflects the biases of those writers; the contributions of women, children and other people marginalized throughout the course of human history are underrepresented, and this bias is reflected in ChatGPT’s functionality.
Conclusion
As the saying goes, “you are what you eat”, and in the case of generative AI, these programs process vast amounts of data and amplify the patterns present in that information. The question of whether GenAI is biased is based on the fear that algorithms built mostly by men using datasets that represent only a fraction of humanity could be biased against women, non-Western cultures or minorities.
However, because algorithms mirror human biases, algorithmic bias can also be used to reduce human bias. Algorithms can reveal hidden structural biases in organizations and individuals. A paper published in the Proceedings of the National Academy of Sciences found that algorithmic bias can help people better recognize and correct biases in themselves.
People see more of their biases in algorithms because algorithms remove their bias blind spots. It is easier to see bias in others’ decisions than in your own because you use different evidence to evaluate them. When examining your own decisions for bias, you search for evidence of conscious bias, and you overlook and excuse bias because you lack access to the associative machinery that drives your intuitive judgments, where bias often plays out. Algorithms remove the bias blind spot because you see them more the way you see other people than the way you see yourself.
AI mirrors our society: its strengths, biases and limitations. As we develop this technology, we need to be mindful of its technical capabilities and its impact on people and cultures. Looking ahead, the conversation around AI and bias should continue to grow, incorporating more diverse perspectives and ideas, and our commitment to making AI more inclusive and representative of the diverse world we live in should evolve with it.
As Nicole Napolitano of the Center for Policing Equity puts it: “Every part of the process in which a human can be biased, AI can also be biased.”