Here is an interview with Brandon Wirtz, CTO of the very exciting text mining API startup Stremor.com. We have covered Stremor here before. The Liquid Helium API is the core of all of Stremor's products; it extracts information about sentence and paragraph structure, word usage, parts of speech, grammar, writing style, punctuation, and author bias. Stremor also has a new kind of search engine called Samuru, based on its language APIs.
PW- Describe the Automated Summaries API. How is it different from Summly, which was recently acquired by Yahoo?
Brandon- A big difference between our API and Summly is that you can build your own apps on it. But the real difference is that we own the full tech stack. Summly relies on Readability to extract content from a web page, CoreNLP for splitting sentences, and SRI for scoring which sentences are the best based on keywords. We have our own content extraction, sentence disambiguation, and scoring system. Our scores aren’t based just on keywords; we look at factors that tell you Who, What, Why, Where, When, and How. We try hard to make sure our summary includes Who, What, and Why.
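To make the idea of scoring on more than keywords concrete, here is a toy sketch in Python. It is not Stremor's algorithm, which the interview does not describe in detail. The cue-word lists, the function names `cues_in` and `summarize`, and the greedy selection strategy are all assumptions invented for illustration. The sketch tags each sentence with the Who/What/Why cues it appears to contain, then picks sentences until those categories are covered.

```python
import re

# Hypothetical cue words, invented for this sketch. A real system would
# rely on much richer linguistic analysis than word lists.
CUES = {
    "who": {"he", "she", "they", "company", "ceo", "founder"},
    "what": {"announced", "launched", "acquired", "released", "built"},
    "why": {"because", "since", "so", "due"},
}


def cues_in(sentence):
    """Return the set of cue categories whose words appear in the sentence."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return {label for label, vocab in CUES.items() if words & vocab}


def summarize(sentences, max_sentences=2):
    """Greedily pick sentences that add the most not-yet-covered categories."""
    covered, chosen = set(), []
    remaining = list(enumerate(sentences))
    while remaining and len(chosen) < max_sentences:
        # On ties, max() keeps the earliest sentence in the document.
        idx, best = max(remaining, key=lambda p: len(cues_in(p[1]) - covered))
        gain = cues_in(best) - covered
        if not gain:
            break  # no remaining sentence adds a new category
        covered |= gain
        chosen.append(idx)
        remaining.remove((idx, best))
    # Return the picked sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


sentences = [
    "The company announced a new summarization API.",
    "The weather was pleasant that day.",
    "It was built because keyword scores alone miss context.",
]
print(summarize(sentences))
```

With these sample sentences, the sketch keeps the first and third and skips the second, which matches none of the cue categories.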
PW- What are some usage stats for your API: volume, calls, tasks, and customers handled?
Brandon- We haven’t been public for very long, so our Mashape offering doesn’t have impressive numbers yet, but the same API that Mashape users hit is used by our TLDR plugin, TLDR Reader, and the Samuru.com search engine. We are currently summarizing about 800k pages a day.
PW- What are your plans for mobile computing? What are some of your plans for social media summaries?
Brandon- The technology certainly lends itself to mobile and social, but we don’t have big plans in either. We are building tools for others to build on. Along the way we are releasing products that use the tech so others can see how they could use it. Even Samuru.com, our search engine, has an API so others can build something far better than what we would. Our focus is language. There are a million places our tech could help, from email summaries as previews, to generating meta description tags for your blog for SEO, to creating brief abstracts for reports and internal documents. These are all things we would love to see, but we are a small team, and to keep focus we need to do one thing and do it well, and that is process language.
PW- Describe the company philosophy and vision for Stremor. What plans do you have for 2013?
Brandon- 2013 is all about integration. We are working with partners to integrate our technology into theirs. Summarization is really just the tip of the iceberg in terms of what we can do. The ability to extract facts, stats, quotes, and other data from text is invaluable in data mining and research. The ability to detect changes in an author’s vocabulary, optimism, and frustration levels can provide information on employee well-being, plagiarized content, and sponsored posts, and can even help predict market trends.
PW- What is the Liquid Helium API?
Brandon- The Liquid Helium API is a much richer API that is available to premier customers. It takes all of our capabilities and puts them in one place. For example, the summary API doesn’t tell you if the author was for or against the topic, nor does it tell you if it is good news or bad. We don’t typically expose our content classification system, which tells you whether a document is instructional, an opinion piece, a chronology, a factual account, or something else. All of these features, and more, are available in the Liquid Helium API.
PW- What is your target segment for paid customers?
Brandon- Content producers and aggregators are the primary target for the Summarization technology. The full Liquid Helium suite is targeted at Business Intelligence, Data Miners, and Market Researchers.
PW- How are you getting developers excited about creating applications and mashups with your API?
Brandon- We included a free tier for all of our Mashape APIs so users can play around to get started. The RSS summarizer makes building a niche content phone app really simple. The Search API makes building your own custom search engine easy. Combined with data from other sources, there are a lot of simple-to-build projects that would be really interesting. I built, for my own use, an RSS feed that summarizes stock news about my portfolio. I used news search from our search API to get the news, then used the keyword API to find keywords in those stories so that I could see trends that were impacting more than one of my stocks. The Xbox One announcement impacted my Microsoft stock and my Sony stock, but didn’t impact my Tesla stock. In about 2 hours of playing I was able to build a tool that helps me know if I should panic when my stocks move, or if the movements are unrelated to each other. I think putting this tech in the hands of hundreds if not thousands of developers will enable some of them to come up with even better use cases.
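The cross-stock keyword check Brandon describes can be sketched in a few lines of Python. This is a minimal illustration, not his actual tool. It assumes the news search and keyword extraction steps have already run and produced a list of keywords per ticker. The `shared_keywords` function, the ticker symbols, and the sample keywords are all hypothetical, and no Stremor API calls are shown because the interview does not give their request format.

```python
from collections import defaultdict


def shared_keywords(keywords_by_ticker, min_tickers=2):
    """Map each keyword to the tickers whose news mentions it,
    keeping only keywords seen for at least `min_tickers` tickers."""
    tickers_by_keyword = defaultdict(set)
    for ticker, keywords in keywords_by_ticker.items():
        for keyword in keywords:
            # Lowercase so "Xbox One" and "xbox one" count as the same keyword.
            tickers_by_keyword[keyword.lower()].add(ticker)
    return {
        keyword: sorted(tickers)
        for keyword, tickers in tickers_by_keyword.items()
        if len(tickers) >= min_tickers
    }


# Hypothetical keyword-extraction output for three holdings.
news_keywords = {
    "MSFT": ["Xbox One", "console", "Windows"],
    "SNE": ["PlayStation", "console", "Xbox One"],
    "TSLA": ["Model S", "battery"],
}

for keyword, tickers in shared_keywords(news_keywords).items():
    print(keyword, "->", ", ".join(tickers))
```

A keyword that shows up for several tickers suggests one story is moving them together, while a ticker with no shared keywords, like the Tesla example here, is probably moving for unrelated reasons.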