From asking Siri a question to using Alexa at home, voice-to-text shortcuts have become a part of many people’s routines. But research shows the technology doesn’t work equally well for everyone, especially users with certain accents and dialects.
Google found that Black technology users in the United States experience more errors with automatic speech recognition (ASR) technology than white users do.
To address this gap, Google has collaborated with Howard University to create Project Elevate Black Voices (EBV), an initiative to build a high-quality dataset of African American English speech. The goal is to improve ASR technology for Black users.
“Black communities have been historically harmed and also historically excluded, and technology is a major thing where just across the globe, the digital divide is really apparent within Black communities,” said Dr. Lucretia Williams, a senior research scientist in the College of Engineering and Architecture.
Dr. Gloria Washington, associate professor at Howard, is the co-principal investigator for Project EBV. She said the initiative was introduced almost five years ago with Dr. Courtney Heldreth, lead for Google’s Responsible AI Human-Centered Technology UX team, who first identified the problem. Washington emphasized the importance of protecting the data and using a community-informed approach.
She and her team partnered with other HBCUs and Black churches, traveling across the country, from California to the Deep South, to collect more than 600 hours of voice samples.
“We had this focus in mind with capturing all of those unique voices that can be used to help automatic speech recognition. We didn’t get everything, but we started to try to capture it, protect it and safeguard it,” Washington said.
Williams emphasized how widely ASR is used, not just by Black communities but by everyone. She said reducing the error rate could have significant benefits.
“It’s imperative in certain situations where you would need to use voice technology to call for help, and it would be beneficial, but if these systems are not recognizing your voice because you have a thicker accent that you can’t code-switch with, that’s a huge challenge — a life or death situation,” Williams said.
The team is now working to establish a Consortium for African American English Dialect to gather more datasets, including from states with smaller Black populations, such as Arizona.
Washington also noted that, with data protection in mind, the dataset is being used to study deepfake technology, helping protect individuals from harm by distinguishing real voices from fake ones.
She also raised concerns about AI’s impact on Black communities, pointing to Memphis, Tennessee, where data centers have been linked to environmental issues. In Memphis, Elon Musk built an AI supercomputer the size of 13 football fields, powered by 35 methane gas turbines (enough to power an entire city), without legal permits, according to The Guardian.
Washington assured that EBV is not tied to AI data centers and that the dataset is stored on campus servers. She stressed that Howard, not Google, owns the dataset. The university decides which companies can access the information to ensure it is not misused.
“Our data is controlled, and whoever gets access to the dataset from an industry perspective needs an understanding of wanting to also protect the community,” Washington said. “We want to capture these guidelines around the use of the data, and we’re having that discussion with large companies now.”
While only Google is currently using the dataset, Washington said Howard is exploring licensing opportunities to generate funds and add safeguards. As the project continues and more data is collected, Washington said the goal is to ensure the dataset has a lasting positive impact on the Black community.
Josiah Smith, a sophomore broadcast journalism major from Shreveport, Louisiana, shared how his experiences with speech recognition models often left him frustrated; the software frequently failed to accurately capture his words.
“I feel like the technology is probably built to understand or recognize more ‘proper’ English, and so because a lot of Black people speak in different accents or tones — that’s a big reason why they get misconstrued,” Smith said.
Smith’s comments highlight a broader issue: the exclusion of African American English (AAE) from speech recognition and other technology. A 2024 study conducted by Sharese King, an assistant professor of linguistics at the University of Chicago, with students from Stanford University and the Allen Institute for AI, found that AI models often linked AAE speakers with negative attributes like “lazy” and “stupid,” while also assigning them less prestigious jobs and harsher legal outcomes than speakers of standardized English.
Chase Howard, a freshman political science major on the pre-law track from Charlotte, North Carolina, spoke about the importance of technological advancements like Project EBV in representing the African American community.
“Technology has been a place Black people have been barred from forever. With the way Silicon Valley is set up, the boards, the task forces, at places like Google, these places don’t often reflect us,” Howard said.
Howard’s comments reflect a wider concern among students: technology companies, often led by white CEOs and decision-makers, lack meaningful representation of Black voices in innovation spaces.
He also raised questions about accountability, emphasizing that students themselves play a role in ensuring projects like EBV lead to enduring outcomes for students.
“I think the next part is up to us,” Howard said. “How are we putting pressure on these organizations to ensure there is actually something tangible for students that comes out of this study? How are we getting Howard students in these rooms interning so they have more outcomes and opportunities to work on these things?”
Howard stressed that projects like EBV must go beyond data collection and create long-lasting opportunities. For him, the measure of success is not only whether speech recognition models become more accurate, but whether partnerships with institutions like Howard University create pathways for Black students to influence the future of technology from within.
“If you want to make sure African Americans are heard, you have to put young people who go to these universities and are in these communities in the room on these projects,” Howard said.
Copy edited by D’Nyah Jefferson-Philmore