9 Top Text to Speech APIs

Text to Speech (TTS) services convert text into spoken-word audio. The technology is useful for making content accessible to people with visual impairments, reading impairments such as dyslexia, or speech impairments, and for language study, video games, language translation, and other uses.

Developers wishing to enhance applications with TTS services can tap into APIs for help.

What is a Text to Speech API?

A Text to Speech Application Programming Interface, or API, enables users to connect to TTS services to add speech synthesis functions into their applications.

The easiest place to find these APIs is in the Text to Speech category on ProgrammableWeb. There, developers can see dozens of resources including APIs, programming language SDKs, source code samples, and how-to guides. In this article, we highlight the nine most popular TTS APIs according to page views on the ProgrammableWeb website.

1. Twilio API

Twilio provides a hosted API and markup language (TwiML) for businesses to build voice and SMS communications applications. Twilio provides a Text To Speech (TTS) service for converting text into a human-sounding voice for its messaging and voice platform. The TwiML <Say> verb enables three separate voice engines, including Amazon Polly for dozens of voice choices and different languages. Twilio's TTS Console enables developers to test different voices. The documentation includes quickstart guides for .NET, Java, Node.js, PHP, Python, Ruby, iOS, and Android.
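
As a rough illustration of the <Say> verb described above, the following Python sketch builds a small TwiML document with the standard library only. The function name `build_say_twiml` is hypothetical, and the default `Polly.Joanna` voice and `en-US` language are assumptions chosen for the example; check Twilio's documentation for the voices available on your account.

```python
import xml.etree.ElementTree as ET


def build_say_twiml(text: str, voice: str = "Polly.Joanna",
                    language: str = "en-US") -> str:
    """Return a TwiML string containing a single <Say> verb.

    The voice and language defaults are assumptions for illustration.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"voice": voice, "language": language})
    # ElementTree escapes special characters such as "&" when serializing.
    say.text = text
    return ET.tostring(response, encoding="unicode")


twiml = build_say_twiml("Hello & welcome to our support line.")
print(twiml)
```

An application would typically return a string like this from the webhook that Twilio calls when a voice call comes in.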


2. CloudPronouncer API

The CloudPronouncer API enables applications to convert text to speech, supporting 175 (and growing) standard and premium voices in 31 languages and variants. The service may be used by any device that can connect to the Internet and send POST requests to the API. It uses cloud infrastructure and artificial intelligence to process the requests.

3. Oddcast Text to Speech API

Oddcast offers a suite of APIs for building rich media applications. The Oddcast Text to Speech API allows developers to integrate text to speech functionality into any web or mobile application. The API supports 20 languages, including emotive cues and special audio effects, and offers a library of over 185 voices. It is compatible with dynamic web applications, supporting Flash and JavaScript. It also offers profanity filtering and admin reporting to track usage.

4. IBM Speech to Text API

The IBM Speech to Text API automatically transcribes English speech to text. Developers can use this API to add speech transcription capabilities to their applications. Speech recognition accuracy is highly dependent on the quality of input audio, and the service can only transcribe words that it knows. Thus, the conversion of speech to text may not be perfect. IBM Speech to Text is part of the Watson Developer Cloud.

5. ResponsiveVoice Text To Speech API

The ResponsiveVoice Text-To-Speech API is a cross-platform, HTML5-based library that supports 51 languages. It is open-sourced for non-commercial and non-profit use. It includes speech synthesis and speech recognition with lifelike human digital voices and is designed to voice-enable websites and applications.

6. VoiceForge API

VoiceForge is an online speech service that provides users with high-quality TTS audio. With the VoiceForge API, developers can simply send text and a voice type, and receive the equivalent audio content.

7. Amazon Polly API

Amazon Polly provides the tools required to build speech-enabled applications. Amazon Polly uses deep learning technologies to synthesize human-like voices. Neural TTS by Amazon Polly supports a Newscaster reading style for news narration. The Amazon Polly API can be used to access voices in a variety of languages.
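
To make the Newscaster style mentioned above more concrete, here is a minimal Python sketch that prepares the parameters for Polly's `synthesize_speech` operation. The helper names `newscaster_request` and `synthesize_to_file` are hypothetical, the `Joanna` voice is an assumption (only some neural voices support this style), and the actual call assumes the third-party `boto3` package plus configured AWS credentials.

```python
from xml.sax.saxutils import escape


def newscaster_request(text: str, voice_id: str = "Joanna") -> dict:
    """Build keyword arguments for Polly's synthesize_speech call.

    The text is wrapped in SSML that asks for the news reading style.
    The default voice is an assumption chosen for illustration.
    """
    ssml = (
        '<speak><amazon:domain name="news">'
        f"{escape(text)}"
        "</amazon:domain></speak>"
    )
    return {
        "Engine": "neural",      # the Newscaster style needs the neural engine
        "OutputFormat": "mp3",
        "TextType": "ssml",
        "Text": ssml,
        "VoiceId": voice_id,
    }


def synthesize_to_file(params: dict, path: str) -> None:
    """Send the request to Polly and save the audio (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder above runs without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**params)
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


params = newscaster_request("Markets rose & then fell today.")
print(params["Text"])
```

Keeping the request builder separate from the network call makes it easy to inspect or test the SSML before spending any API quota.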



8. Google Text to Speech API

The Google Cloud Text-to-Speech API converts text input into audio data of human-like speech in more than 180 voices across more than 30 languages and variants. With the API, developers can create interactions with users that are intended to feel more lifelike. This API uses RESTful calls, although a gRPC version of the API is also available.
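
The RESTful call described above might look roughly like the following Python sketch, which uses only the standard library. The function names are hypothetical, the `en-US` language code and `MP3` encoding are example choices, and the sketch assumes you already hold a valid OAuth access token for a project with the API enabled; some setups may need additional headers.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_body(text: str, language_code: str = "en-US",
                          encoding: str = "MP3") -> dict:
    """Build the JSON body for a text:synthesize request."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": encoding},
    }


def decode_audio(response_json: dict) -> bytes:
    """The API returns the audio as base64 text in the audioContent field."""
    return base64.b64decode(response_json["audioContent"])


def synthesize(text: str, access_token: str) -> bytes:
    """POST the request and return raw audio bytes (needs network and a valid token)."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_synthesize_body(text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return decode_audio(json.load(response))


body = build_synthesize_body("Hello from the API.")
print(json.dumps(body, indent=2))
```

Calling `synthesize(...)` and writing the returned bytes to a file opened in binary mode would produce a playable MP3 under these assumptions.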

9. Voicepods API

Voicepods provides automated human-like TTS services. The Voicepods API returns JSON responses with voice and text-to-speech narration features. Voicepods includes 16 international voices, covering languages including Dutch, French, German, Italian, Korean, Japanese, Turkish, Spanish, Hindi, and English. The service is primarily used to create natural-sounding podcasts from written text.


Find more resources on the TTS category page, including 79 APIs, 73 SDKs, and 76 Source Code Samples.
