Just about every developer knows that speech recognition will play a bigger role in their applications in the near future. The challenge has been finding a way to make that happen given the complexity associated with building speech-enabled applications.
To address that issue, Speaktoit is releasing api.ai, a free API that makes it possible for developers to add speech recognition, natural language understanding and text-to-speech capabilities to their applications. The Speaktoit API is based on a natural language platform that Speaktoit created to drive Assistant, an application that Speaktoit says has processed more than 1 billion speech-related requests.
When it comes to voice recognition, Speaktoit CEO Ilya Gelfenbeyn says developers need to be aware that there are multiple flavors that are appropriate for different use cases. Rather than developing all the logic required to speech-enable their applications, Gelfenbeyn says api.ai allows developers to leverage a cloud service to make a broad range of speech capabilities available to users of their applications.
In the near future, Gelfenbeyn says Speaktoit also plans to make available an SDK for Apple iOS and Google Android devices that will enable speech recognition to run locally. With the rise of multicore processors, there is now enough raw compute power available to support those applications running on, for example, a mobile device.
But even then, Gelfenbeyn notes, the complexity of many speech-recognition algorithms will still require developers to invoke an external cloud service to process them effectively. The Speaktoit service is based on machine learning algorithms that can be applied to a variety of conversation scenarios, and it enriches natural language requests using prebuilt knowledge bases developed by Speaktoit.
Gelfenbeyn says the “REST-like” API developed by Speaktoit will enable developers to embed much richer voice-recognition capabilities inside their applications than what Apple delivers via Siri on the iPhone. That’s critical, says Gelfenbeyn, because for voice recognition to gain broad acceptance, the user experience needs to be as human as possible. That issue, adds Gelfenbeyn, will only become more pressing as a new generation of wearable devices, which will rely on speech recognition to one degree or another, comes to market.
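The article doesn’t document the API itself, but a “REST-like” natural language query from an application might look something like the following sketch. The endpoint URL, field names, and token here are illustrative assumptions, not Speaktoit’s published interface:

```python
import json

# Hypothetical endpoint and credential, for illustration only.
API_BASE = "https://api.api.ai/v1/query"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"

def build_query_request(text, session_id, lang="en"):
    """Assemble the pieces of a hypothetical natural-language query.

    Returns (url, headers, payload) so the caller can send it with any
    HTTP client, e.g. requests.post(url, headers=headers, json=payload).
    """
    headers = {
        "Authorization": "Bearer " + ACCESS_TOKEN,
        "Content-Type": "application/json",
    }
    payload = {
        "query": text,            # the user's spoken or typed request
        "sessionId": session_id,  # keeps conversation context across turns
        "lang": lang,
    }
    return API_BASE, headers, payload

url, headers, payload = build_query_request("What's the weather tomorrow?", "session-123")
print(json.dumps(payload))
```

The service would be expected to respond with structured JSON (recognized text, detected intent, extracted parameters) that the application acts on, which is what lets developers skip building the recognition and language-understanding logic themselves.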
Speech recognition is a major industry challenge that individual developers should be wary of taking on by themselves. There’s no doubt that demand for speech-recognition applications exists. But speech recognition is one of those things developers are better off not doing at all than doing poorly: a substandard experience winds up annoying end users more than it helps them.