IBM has announced three additions to the Watson Cloud platform: real-time speaker diarization (beta) support through the Watson Speech to Text API, Visual Recognition tagging with a built-in set of visual labels, and the Watson Discovery Service. Developers can use these new capabilities to add intelligent visual recognition and speech-to-text features to Web and mobile applications.
Speaker diarization is described by IBM as “the algorithms used to identify and segment speech by speaker identity.” It is used primarily for speech transcription and, until now, has been applied successfully to pre-recorded conversations. With speaker diarization added to the Watson Speech to Text API, developers can build applications that analyze a conversation between two people and take action in real time, while the conversation is still happening.
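As a rough illustration of what an application might do with diarized output, the Python sketch below merges word timings with speaker labels into speaker turns. The input shapes (a list of `[word, start, end]` entries and a list of labels with `from`, `to`, and `speaker` fields) are assumptions made for this example, as are the function name and the sample data; the article does not describe the actual response format.

```python
from itertools import groupby


def group_words_by_speaker(timestamps, speaker_labels):
    """Merge word timings and speaker labels into (speaker, text) turns.

    Assumed input shapes (hypothetical, for illustration only):
      timestamps:     [[word, start_seconds, end_seconds], ...]
      speaker_labels: [{"from": start, "to": end, "speaker": id}, ...]
    """
    # Index each speaker label by the time span it covers.
    speaker_at = {(lbl["from"], lbl["to"]): lbl["speaker"] for lbl in speaker_labels}

    # Attach a speaker id (or None if no label matches) to every word.
    labelled = [
        (speaker_at.get((start, end)), word) for word, start, end in timestamps
    ]

    # Collapse consecutive words from the same speaker into one turn.
    turns = []
    for speaker, items in groupby(labelled, key=lambda pair: pair[0]):
        turns.append((speaker, " ".join(word for _, word in items)))
    return turns


# Hypothetical sample data for a two-person conversation.
timestamps = [
    ["hello", 0.0, 0.4], ["there", 0.4, 0.8],
    ["hi", 1.0, 1.2],
    ["how", 1.5, 1.7], ["are", 1.7, 1.8], ["you", 1.8, 2.0],
]
speaker_labels = [
    {"from": 0.0, "to": 0.4, "speaker": 0},
    {"from": 0.4, "to": 0.8, "speaker": 0},
    {"from": 1.0, "to": 1.2, "speaker": 1},
    {"from": 1.5, "to": 1.7, "speaker": 0},
    {"from": 1.7, "to": 1.8, "speaker": 0},
    {"from": 1.8, "to": 2.0, "speaker": 0},
]

turns = group_words_by_speaker(timestamps, speaker_labels)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```

In a real-time setting, an application would presumably rerun this kind of grouping as new words and labels arrive, and act on each completed turn.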
Visual Recognition tagging capabilities have been significantly updated and now include a built-in library of tens of thousands of visual labels, which allows the platform to recognize numerous visual concepts. The new library covers a variety of concepts such as objects, people, places, activities, and scenes. Watson Visual Recognition can recognize broad visual concepts and objects in photos and understand visual scenes based on context. It also features custom training and classification capabilities.
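To give a sense of how an application might consume such labels, the Python sketch below keeps only the labels above a confidence threshold and orders them from most to least confident. The label structure (a `class` name and a `score` between 0 and 1), the function name, and the sample values are all assumptions for illustration, not the service's documented output.

```python
def top_labels(labels, min_score=0.5):
    """Return labels scoring at least min_score, highest score first.

    Assumed label shape (hypothetical): {"class": name, "score": 0.0-1.0}
    """
    kept = [label for label in labels if label["score"] >= min_score]
    return sorted(kept, key=lambda label: label["score"], reverse=True)


# Hypothetical labels for a single photo.
sample_labels = [
    {"class": "beach", "score": 0.92},
    {"class": "person", "score": 0.61},
    {"class": "surfing", "score": 0.48},
    {"class": "ocean", "score": 0.87},
]

confident = top_labels(sample_labels, min_score=0.6)
for label in confident:
    print(label["class"], label["score"])
```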
The Watson Discovery Service converts, normalizes, and enriches streams of data so that the content can be contextualized and analyzed to gain insights and discover patterns. Developers can upload their own data sets to the service or use publicly available ones. The service enriches data using integrated Watson APIs such as the AlchemyLanguage API and the Document Conversion API.
For more information about the IBM Watson Cloud platform, visit the official IBM Watson Developer Cloud website.