Google has announced updates to its Cloud Speech to Text and Text to Speech APIs that include additional languages, voices, and more affordable pricing models. The goal of these updates is to help developers build better intelligent voice applications and make the products more widely available.
The Cloud Speech to Text API, available in both RESTful and gRPC flavors, is relied upon for creating voice applications where the accuracy of speech recognition is crucial, such as phone call and video transcription tools. The first update is that Google is making its premium video and enhanced phone models, previously in beta, generally available to all developers. According to the provider, the video model has been improved and now has 64 percent fewer transcription errors, while the phone model has 62 percent fewer errors.
Google also announced that multi-channel recognition is now generally available. This feature helps the Speech to Text API differentiate between multiple audio channels, for example during a conference call or in a call center.
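As a rough illustration of multi-channel recognition, the sketch below builds the JSON body for a REST recognize request with per-channel recognition switched on. It only constructs the request and does not send it. The field names follow the v1 REST API as we understand it, and the helper name, bucket path, encoding, and language code are assumptions made for this example, so check them against the current documentation.

```python
import json


def build_multichannel_request(audio_uri, channel_count, language_code="en-US"):
    """Build a recognize request body that asks for per-channel transcripts.

    Hypothetical helper for illustration; it does not call the API.
    """
    if channel_count < 2:
        raise ValueError("multi-channel recognition needs at least two channels")
    return {
        "config": {
            # Assumed audio format: uncompressed 16-bit PCM (e.g. a WAV file).
            "encoding": "LINEAR16",
            "languageCode": language_code,
            # Number of channels present in the input audio.
            "audioChannelCount": channel_count,
            # Ask the service to transcribe each channel separately.
            "enableSeparateRecognitionPerChannel": True,
        },
        # Placeholder Cloud Storage path, not a real file.
        "audio": {"uri": audio_uri},
    }


if __name__ == "__main__":
    body = build_multichannel_request("gs://example-bucket/call.wav", 2)
    print(json.dumps(body, indent=2))
```

In a two-party call recording where each speaker is on a separate channel, a request like this should yield results labeled by channel, which makes it easier to tell the speakers apart.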
Google is trying to entice users to opt in to its data logging program by offering a 33 percent discount on the standard and premium video models for transcribing videos. For those not comfortable taking part in the program, both models are still available, and the premium video model is now offered at a 25 percent discount.
The biggest update for the Text to Speech API, also offered as REST or gRPC, is support for seven new languages, bringing its total to 21 supported languages. The API has also received 31 new WaveNet voices and 24 new standard voices. Lastly, Google has made the Device Profiles feature generally available, allowing customers to optimize the quality of audio playback across various types of hardware.
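To give a sense of how a device profile might be selected, the sketch below builds the JSON body for a REST synthesize request. It only constructs the request and does not send it. The field names follow the v1 REST API as we understand it. The helper name, the voice name, and the profile ID are illustrative assumptions, and the available values should be confirmed in the current documentation.

```python
import json


def build_synthesis_request(text,
                            language_code="en-US",
                            voice_name="en-US-Wavenet-D",
                            device_profile="headphone-class-device"):
    """Build a synthesize request body targeting one playback device class.

    Hypothetical helper for illustration; it does not call the API.
    """
    return {
        "input": {"text": text},
        # Voice name is an example; the set of available voices changes over time.
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {
            "audioEncoding": "MP3",
            # Device profiles are passed as a list of effects profile IDs.
            "effectsProfileId": [device_profile],
        },
    }


if __name__ == "__main__":
    body = build_synthesis_request("Your order has shipped.")
    print(json.dumps(body, indent=2))
```

Swapping the profile ID, for example to one intended for phone lines, is how the same text could be tuned for a different class of hardware.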