The Diffbot Article API automatically extracts clean article text and other article data (author, date, images, etc.) from news article web pages and blog posts.
The Article API works in any language, automatically concatenates multiple-page articles, and extracts comments where available using functionality integrated from the Diffbot Discussion API. It can also optionally perform sentiment analysis and entity extraction/tag generation on the extracted text.
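As a rough illustration of how a client might ask the Article API to process a page, the sketch below only builds the request URL and sends nothing over the network. The endpoint path (`/v3/article`) and the `token` and `url` parameter names are assumptions not stated in the text above; check them against Diffbot's current documentation. The helper name `build_article_request` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; the text above does not give one.
ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"


def build_article_request(token: str, page_url: str, **options: str) -> str:
    """Return a GET URL asking the Article API to process page_url.

    `token` and `url` are assumed parameter names. Any extra keyword
    arguments are passed through unchanged as query parameters.
    """
    params = {"token": token, "url": page_url}
    params.update(options)
    # urlencode percent-escapes the target URL so its own "?" and "="
    # characters cannot be confused with the API's query string.
    return f"{ARTICLE_ENDPOINT}?{urlencode(params)}"


request_url = build_article_request(
    "YOUR_TOKEN", "https://example.com/blog/post?id=7"
)
print(request_url)
```

The point of the sketch is the escaping step: the article's address travels as a single query parameter, so it has to be percent-encoded before it is appended to the API call.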
The semantic text mining Alchemy API is now a member of the API Billionaires Club. The service, which makes sense of raw unstructured data, averages 65-75 million requests per day, according to Alchemy's Elliot Turner. At that rate, the monthly count comes to roughly 2 billion API requests.
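The monthly figure follows from the daily average by simple multiplication. The short check below assumes a 30-day month, which the source does not specify; the function name `monthly_requests` is only illustrative.

```python
DAYS_PER_MONTH = 30  # assumption: the source does not say how the month was counted


def monthly_requests(daily_requests: int, days: int = DAYS_PER_MONTH) -> int:
    """Scale an average daily request count to a monthly total."""
    return daily_requests * days


low = monthly_requests(65_000_000)   # lower end of the quoted daily range
high = monthly_requests(75_000_000)  # upper end of the quoted daily range
print(f"{low / 1e9:.2f} to {high / 1e9:.2f} billion requests per month")
```

Under that assumption the quoted range works out to between 1.95 and 2.25 billion requests per month, so the 2 billion mark is reached at anything above about 67 million requests per day.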
Diffbot has come out of beta, announcing the API's ability to extract content from sites that fit into two page types: article and front page. The Diffbot engine can determine, just by rendering and looking at a page, what type of page it is. Is it an article or a front page news site? Maybe it's a profile page from a social network. Diffbot’s artificial brain has been trained to know the difference. Developers can make 50,000 calls to the Diffbot API per month for free, with additional calls available for fractions of a cent each. This pricing should encourage wide adoption and experimentation.
Diffbot is one of the coolest new ideas on the internet. The service brings monitoring to the web in a new and interesting way. The company is just about to release a whole slew of projects built on its collection of Diffbot APIs for following changes to web pages and RSS, as well as extracting clean text from websites.