As the discussion surrounding generative AI and the need for regulation intensifies, it is crucial to analyze how these models could perpetuate old biases and barriers to gender equality.
In a 2021 paper titled “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?,” researchers Emily M. Bender and Angelina McMillan-Major of the University of Washington, together with Google ethics researchers Timnit Gebru and Margaret Mitchell, outlined the risks of large language models (LLMs).
However, Google forced out both Gebru and Mitchell, who had led the company’s Ethical AI team, as a result of this work. In contrast, Geoffrey Hinton, a male researcher raising his own concerns, appears to have been given a far more welcoming platform to shape public discourse around AI risks.
Language models such as GPT-3, initially released in June 2020, are trained on text mined from sources that often omit the voices of women, older individuals, and marginalized groups. Those gaps flow directly into the models’ outputs, producing biases that shape how the systems behave. As “stochastic parrots,” these models are likely to absorb the worldviews of the dominant groups represented in their training data. Thus, algorithms trained on datasets that do not represent all people are liable to embed harmful stereotypes against women and minorities.
For instance, lenders have used mortgage approval algorithms that were 40% to 80% more likely to deny borrowers of color than their white counterparts. These disparities arise because the algorithms rely on records of who received mortgages in the past, and in the US that history of racial discrimination remains embedded in the data.
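To illustrate how such a disparity can be quantified, auditors often compute a disparate impact ratio: one group’s approval rate divided by the most-favored group’s approval rate. The sketch below is a minimal, hypothetical example; the figures and group labels are invented for illustration and are not taken from the study referenced above.

```python
# Minimal sketch: computing a disparate impact ratio for loan approvals.
# All figures and group labels below are hypothetical, for illustration only.

approvals = {
    # group: (applications, approvals)
    "group_a": (1000, 720),
    "group_b": (1000, 460),
}

# Approval rate per group.
rates = {g: approved / applied for g, (applied, approved) in approvals.items()}

# Disparate impact ratio: each group's rate relative to the highest rate.
# A common (though contested) rule of thumb flags ratios below 0.8.
best = max(rates.values())
for group, rate in rates.items():
    ratio = rate / best
    flag = "potential disparate impact" if ratio < 0.8 else "ok"
    print(f"{group}: approval rate {rate:.2f}, ratio {ratio:.2f} -> {flag}")
```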
The training datasets of these models are so vast that auditing them for bias becomes difficult. As the ethics researchers noted, “A methodology that uses datasets too large to document is, therefore, inherently risky.” Nuance and evolving language use are critical to promoting social change, as movements like MeToo and Black Lives Matter have shown. An AI trained on text scraped indiscriminately from the internet, without the anti-sexist and anti-racist vocabulary these movements have worked so hard to develop, risks erasing that progress.
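To make concrete what “documenting” a corpus even at the simplest level involves, the sketch below runs a naive term-frequency audit over a text file. The filename (corpus.txt) and the term list are hypothetical placeholders; even this toy approach becomes impractical at web scale, which underlines the researchers’ warning.

```python
# Naive audit sketch: count how often selected identity terms appear in a corpus.
# "corpus.txt" and the term list are hypothetical placeholders; auditing real
# web-scale training data is vastly harder, which is the researchers' point.
import re
from collections import Counter

TERMS = {"women", "men", "she", "he", "black", "white", "disabled", "elderly"}

counts = Counter()
with open("corpus.txt", encoding="utf-8") as f:
    for line in f:
        for token in re.findall(r"[a-z']+", line.lower()):
            if token in TERMS:
                counts[token] += 1

for term in sorted(TERMS):
    print(f"{term}: {counts[term]}")
```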
On a positive note, algorithmic accountability has increasingly been included in proposed regulations and frameworks. In the EU, debate is ongoing over whether the original developers of models like ChatGPT, classified as general-purpose AI models with multiple intended and unintended uses, should be subject to the upcoming AI Act.
Women continue to lag behind men in digital and AI skill development: in OECD countries, more than twice as many young men as young women aged 16 to 24 report being able to program, an essential skill for AI development. On the other hand, female AI talent has grown faster than male AI talent, and more and more women are joining AI research and development.
However, AI research is still primarily male-dominated. In 2022, only one in four researchers publishing on AI worldwide was a woman. This under-representation not only limits women’s economic opportunities as employers hire AI talent across sectors, but also leaves them with less of a voice in AI publications and in the public debate surrounding AI’s risks and potential.
As the conversation around AI risk management continues, it is critical to adequately address AI’s representation, bias, and discrimination problems. Failure to do so puts women’s empowerment and progress at risk.
This research highlights one concerning issue that needs immediate intervention. As generative AI continues to receive robust research and development, it is essential to ensure it has built-in guardrails that prevent historical biases from being encoded into its systems.
See more updates on AI-driven technologies and relevant futuristic endeavors by visiting GPT News Room.