IBM this week announced the general availability of three Watson APIs — IBM Watson Language Translation, IBM Speech to Text and IBM Text to Speech — bringing the total number of Watson APIs that are generally available to eight.
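From a developer's perspective, consuming one of these services amounts to an authenticated REST call. A minimal sketch of what a translation request might look like follows; the endpoint path, credential placeholders, and payload field names here are illustrative assumptions, not IBM's documented interface.

```python
# Hypothetical sketch of calling a Watson-style translation service.
# The URL, credentials, and JSON fields are assumptions for illustration.
import json


def build_translate_request(text, source="en", target="es"):
    """Build the JSON body for a hypothetical translate call."""
    return json.dumps({"text": text, "source": source, "target": target})


# Actually sending the request would look roughly like this (requires
# the third-party `requests` package and real service credentials):
#
#   import requests
#   resp = requests.post(
#       SERVICE_URL + "/v2/translate",          # assumed endpoint path
#       auth=(USERNAME, PASSWORD),              # service credentials
#       headers={"Content-Type": "application/json"},
#       data=build_translate_request("Hello, world"),
#   )
#   print(resp.json())
```

The point is less the specific payload shape than the model: the machine learning heavy lifting stays in the cloud, and the application only exchanges small JSON documents with it.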
Jerome Pesenti, IBM's vice president of Watson Core Technology, says IBM currently has another 12 Watson APIs under development, for a total of 20 in various states of readiness for developers.
In addition, IBM has revealed that it is now working with the Montreal Institute for Learning Algorithms (MILA), which focuses on deep learning for language, speech and images. IBM has already been working on adding support for images and video to the Watson platform, but the alliance with MILA and the recent acquisition of AlchemyAPI are expected to significantly advance those efforts. For example, IBM is working with Memorial Sloan Kettering to use the IBM Watson platform to analyze dermatological images of skin lesions, with the goal of assisting clinicians in identifying various disease states.
In general, Pesenti says speech APIs are critical for exposing Watson services to a broad range of end users. IBM, he says, has been fine-tuning those APIs to better understand not only language nuances, but also the different accents and dialects found within any given language, in order to make interacting with Watson applications more conversational.
IBM Watson itself, says Pesenti, is an instance of machine learning software that over time not only retains information, but can also correlate relationships between different sets of data. While once viewed as a somewhat intimidating advance in artificial intelligence, exposing Watson as a cloud service that developers can call via APIs has gone a long way toward making machine learning technologies more accessible to developers.
Of course, IBM isn't the only company investing heavily in machine learning software and algorithms. The race is on to develop APIs that enable applications not only to talk and hear, but also to see and visualize. As those machine learning technologies mature, it's only a matter of time before developers can invoke a broad range of machine learning APIs that bring their applications to life in ways many of them never thought possible.