Story in Brief:
- An Ashesi Computer Science faculty and volunteer team – led by Dennis Asamoah Owusu ’12 and David Sampah – has published new tools to help software developers build applications in Twi (Asante, Akuapim, Fante) and Ga, two of Ghana’s most widely spoken languages.
- This release is part of a broader natural language processing research effort at Ashesi focused on African languages.
Dennis Asamoah Owusu (left) and David Sampah (right)
Go Further
The team has been working on the project for several years as part of a broader effort to build language datasets for technology development in Ghana. Other Ashesi faculty researchers on the project were Dr. Ayorkor Korsah, Benedict Quartey ’18, Stephane Nwolley Jnr., David Adjepon-Yamoah, and Lily Omane Boateng. The immediate use case for the language dataset is financial services. However, the bulk of the data can also be useful for software development and artificial intelligence training.
The Lacuna Fund, which supported this project, is a collaborative funding effort with co-founders including the Rockefeller Foundation, Google.org, Canada’s International Development Research Centre, and GIZ’s FAIR Forward programme. The Fund seeks to provide data scientists, researchers, and social entrepreneurs with the resources needed to address an underserved population or problem, augment existing datasets to be more representative, or update old datasets to be more sustainable.
In 2020, Dennis’s start-up, Nokwary Technologies, won the Ecobank Fintech Challenge for developing an AI-powered banking solution for non-English speakers. His solution, selected from over 600 submissions, allows financial transactions through popular apps like WhatsApp. By publishing this new dataset, Dennis and David hope to democratise the language resources Nokwary Technologies used and inspire the open-sourcing of other African language tools.
According to the UN, a language disappears every two weeks, together with its cultural and intellectual heritage. Bringing more languages into the digital world and the public domain increases access to digital tools and education and strengthens linguistic diversity.
“It’s a challenge for some Africans to interact with technology,” shared Dennis. “Most technologies are built around English, even in basic forms. With this dataset being available to the public, we hope to help developers build more applications for Africans, English-speaking or not.”