Jun 5, 2024

Pioneering Galician Text-to-Speech


We are excited to share the recent success of our collaboration with the Universidade de Santiago de Compostela (USC) at the 16th International Conference on Computational Processing of Portuguese (PROPOR 2024), which took place in Santiago de Compostela in March 2024. Our demo paper, "Nós-TTS: a Web User Interface for Galician Text-to-Speech", was awarded Best Demo Paper, a recognition of the work and dedication of our team at Col·lectivaT and our partners at the Proxecto Nós at USC. This award reaffirms our commitment to creating accessible and open-source solutions for low-resource languages like Galician.

Proxecto Nós

The Nós Project is an ambitious initiative funded by the Galician Government and implemented by USC, aiming to elevate the Galician language through advanced language technologies. This project encompasses a wide range of subfields of NLP, including speech synthesis, speech recognition, dialogue systems, and machine translation. By developing openly licensed resources, tools, and demonstrators, the Nós Project strives to strengthen the position of Galician, ensuring it thrives in the digital age.

Col·lectivaT’s contribution to this project was the creation of a state-of-the-art text-to-speech (TTS) voice and an application programming interface (API) that allows the voice to be integrated into other applications.

What is Text-to-Speech (TTS)?

Text-to-Speech (TTS) technology converts written text into spoken words, allowing digital devices to communicate with users in a natural, human-like voice. High-quality TTS systems can produce synthetic speech with various speaker identities, styles, and emotions. TTS is used in applications such as news reading, virtual assistants, and automated translators. It is also essential for making digital content accessible to people with visual impairments, people with reading difficulties, and anyone who prefers to learn by listening.

As an example, you can listen to the Catalan version of this article, read aloud by our TTS system Catotron.

Technical developments by Col·lectivaT

At Col·lectivaT, our involvement in the Nós Project focused on the development of a state-of-the-art text-to-speech (TTS) system for Galician. Here are some key highlights of our contributions:

  • Development of the Sabela Voice: We created the Sabela TTS voice model, trained from scratch on a corpus provided by USC. The corpus consists of 10,000 sentences recorded by a professional radio broadcaster, totaling approximately 14 hours of speech.

  • Phonological Model Integration: We incorporated a phonological model provided by USC and tested its impact on the naturalness and accuracy of the synthesized speech. This collaboration enabled us to refine our models and improve the quality of the speech output.

  • Demo Webpage and API Development: We developed a demo webpage and a ready-to-use Application Programming Interface (API), making our TTS system accessible to developers and end-users. The open-source code for the API is available on GitHub, together with links to Col·lectivaT’s TTS models for Catalan and Judeo-Spanish.
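To give a rough idea of what integrating a TTS API looks like from the developer's side, here is a minimal client sketch in Python. It is only an illustration and not the documented interface of our API: the base URL, the `/api/tts` path, the `text` and `voice` parameter names, and the helper functions `build_tts_url` and `fetch_speech` are all assumptions made for this example. Please check the repository on GitHub for the actual endpoints and parameters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values for illustration only: the real service may use a
# different host, path, and parameter names.
BASE_URL = "http://localhost:8000"
ENDPOINT = "/api/tts"


def build_tts_url(text: str, voice: str = "sabela", base_url: str = BASE_URL) -> str:
    """Build a GET URL asking the (assumed) TTS service to synthesize `text`."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # urlencode percent-encodes accents, spaces, and punctuation safely.
    query = urlencode({"text": text, "voice": voice})
    return f"{base_url.rstrip('/')}{ENDPOINT}?{query}"


def fetch_speech(text: str, voice: str = "sabela",
                 base_url: str = BASE_URL, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw audio bytes from the response."""
    with urlopen(build_tts_url(text, voice, base_url), timeout=timeout) as response:
        return response.read()
```

With a server running at the assumed address, `fetch_speech("Ola, mundo!")` would return the synthesized audio as bytes, which can then be written to a file or streamed to a player. The audio format depends on the server and is not assumed here.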

During this project, USC provided essential data and assisted in the evaluation of the models. Our collaborator, Carmen Magariños, played a pivotal role in ensuring the robustness and accuracy of our TTS system through comprehensive evaluations.

Official demo of Nós-TTS

The official demo of Nós-TTS, which includes the voices Celtia and Icía in addition to Sabela, is built upon the foundational work developed by our team. You can experience it yourself on the demo page.

Screenshot of the official Nós-TTS demo

Recognition at PROPOR 2024

We’re proud that our demo paper won the Best Demo Paper award at PROPOR 2024. This recognition showcases our commitment to creating accessible, open-source solutions for low-resource languages like Galician.

Language technology at Col·lectivaT

At Col·lectivaT, we specialize in a diverse range of language technologies, including machine translation, text-to-speech, and speech recognition. Our expertise in working with low-resource languages enables us to create impactful solutions that enhance digital accessibility and inclusion.

For more information about our technological portfolio, please visit our resources page. If you’re interested in partnering with us to develop cutting-edge language technologies that empower communities and bridge digital divides, please reach out to us at info@collectivat.cat. We look forward to hearing from you.