Google has announced the general availability of Cloud Text-to-Speech and a beta release of Cloud Text-to-Speech Audio Profiles. The company also announced updates to Cloud Speech-to-Text which include the addition of multi-channel recognition, speaker diarization, and language auto-detect.
The following is a list of ProgrammableWeb articles that matched your search term. On a nearly 24/7 basis, ProgrammableWeb publishes new articles ranging from news to opinion to tutorials for both developers and API providers. All of our articles are categorized so that you can find your way to related articles, APIs, SDKs, Libraries, Frameworks, Tutorials and Sample Source Code. If you are interested in contributing any of the aforementioned content to ProgrammableWeb, be sure to read our guidelines for such contributions.
Here, developer-blogger Alex Kras shows how to overcome the 60-second audio file limit of the free tier of Google's Cloud Speech API by breaking a longer audio file into short chunks and then cycling through those chunks to build a complete transcription.
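The chunking approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from Kras's post: it assumes an uncompressed WAV input, uses only Python's standard `wave` module to cut the audio, and takes the speech-recognition call as a caller-supplied function. The names `chunk_bounds`, `iter_wav_chunks`, `transcribe_long_audio` and `transcribe_chunk` are hypothetical, and the 59-second default is simply a value chosen to stay under the 60-second limit.

```python
import wave
from typing import Callable, Iterator, List, Tuple


def chunk_bounds(total_frames: int, frame_rate: int,
                 max_seconds: int = 59) -> List[Tuple[int, int]]:
    """Return (start, end) frame offsets for chunks of at most max_seconds."""
    step = frame_rate * max_seconds
    return [(start, min(start + step, total_frames))
            for start in range(0, total_frames, step)]


def iter_wav_chunks(path: str, max_seconds: int = 59) -> Iterator[bytes]:
    """Yield raw PCM bytes for each consecutive chunk of a WAV file."""
    with wave.open(path, "rb") as wav:
        bounds = chunk_bounds(wav.getnframes(), wav.getframerate(), max_seconds)
        for start, end in bounds:
            wav.setpos(start)
            yield wav.readframes(end - start)


def transcribe_long_audio(path: str,
                          transcribe_chunk: Callable[[bytes], str],
                          max_seconds: int = 59) -> str:
    """Transcribe each chunk in order and join the pieces into one transcript.

    transcribe_chunk is a placeholder (an assumption of this sketch) for
    whatever function sends one short audio chunk to a speech API and
    returns its text.
    """
    parts = [transcribe_chunk(chunk) for chunk in iter_wav_chunks(path, max_seconds)]
    return " ".join(part for part in parts if part)
```

Because the cuts fall at fixed offsets, a word that straddles a chunk boundary may be transcribed poorly; splitting on silence instead would be one possible refinement. A real `transcribe_chunk` would also need to tell the speech service the sample rate and encoding of the raw bytes it receives.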
Using speech-to-text transcription service Tropo with VoiceBase’s audio indexing API, this tutorial shows followers how to create advanced transcription with analytics.
TranscribeMe and MindSwarms, two San Francisco-based technology start-ups, have announced a partnership to provide consumer insights through speech processing of video content and to provide searchable and shareable video consumer feedback.
The CastingWords API allows you to put their 100% human transcription service into your company's workflow. The API's technical webpage describes the API as "RESTish," with responses returned in a variety of formats. The site states that, "The goal is not to follow REST's practices closely, but rather to make an easy API to get up to speed with using a dynamic language or curl."
In this age of automation and separation, it is surprising to find human components of automated services. QuickTate believes that humans are still better than computers at speech recognition, and that’s why the QuickTate API is built around living, breathing typists. The company has taken the old model of a transcription service and revamped it to take advantage of the internet age: typists are distributed rather than co-located, and jobs can be submitted via API from any number of applications.
National Public Radio (NPR) has just opened another means for developers to access content from NPR.org: a new API for transcripts. This API provides access to tens of thousands of transcripts from some of the most popular programs on NPR. As we covered last year and in our NPR API Profile, their APIs open up a range of interesting possibilities.