The DigitalOwl Text Similarity API allows developers to programmatically evaluate the semantic similarity of two articles. Similarity is expressed as a number. These articles may be submitted to the API as plain text or URLs. This API is RESTful and requires an API key.
This API is provided by DigitalOwl, a software company that offers natural language processing services.
The underlying algorithm is based on word vectors, one of the recent breakthroughs in the field of machine learning. Word vectors make it possible to find synonyms of words, which is considered the closest we have come to teaching computers to understand human language.
Using this approach, the service can find relations between articles that use words which are close not only syntactically but also in meaning. This drastically improves accuracy compared with matching on wording alone.
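To make the idea of "close by meaning" concrete, here is a minimal sketch of comparing word vectors with cosine similarity. The three-dimensional vectors are invented for illustration, and nothing here is taken from DigitalOwl's actual model or code; real word vectors are learned from large text collections and have far more dimensions.

```python
import math

# Toy word vectors, made up for this example only (an assumption, not real model output).
TOY_VECTORS = {
    "car":        [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "banana":     [0.00, 0.20, 0.95],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Synonyms point in nearly the same direction; unrelated words do not.
synonym_score = cosine_similarity(TOY_VECTORS["car"], TOY_VECTORS["automobile"])
unrelated_score = cosine_similarity(TOY_VECTORS["car"], TOY_VECTORS["banana"])

print(f"car vs automobile: {synonym_score:.3f}")
print(f"car vs banana:     {unrelated_score:.3f}")
```

With these toy values the synonym pair scores close to 1 and the unrelated pair close to 0, even though "car" and "automobile" share no spelling.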
Example: Someone steals an article and, to avoid getting caught, shuffles the paragraphs. That is easy to deal with programmatically. But if they start replacing words with synonyms or rewriting whole paragraphs, things get trickier. Luckily, this is where the service shines: it will determine with ~98% accuracy that the article was stolen.
This is just one example; there are many more.
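The contrast in this example can be sketched with a simple word-count comparison. This is an illustrative baseline only, not the service's method: the function names and sample texts are made up, and the measure used is a plain multiset Jaccard overlap.

```python
from collections import Counter

def word_counts(text):
    """Count lowercase words, ignoring word order and paragraph breaks."""
    return Counter(text.lower().split())

def overlap(a, b):
    """Share of words two texts have in common (multiset Jaccard index)."""
    ca, cb = word_counts(a), word_counts(b)
    shared = sum((ca & cb).values())
    total = sum((ca | cb).values())
    return shared / total if total else 1.0

original = "the quick fox jumps\n\nthe lazy dog sleeps"
shuffled = "the lazy dog sleeps\n\nthe quick fox jumps"   # paragraphs swapped
reworded = "the fast fox leaps\n\nthe idle hound naps"    # synonyms swapped in

print(overlap(original, shuffled))   # order does not matter to word counts
print(overlap(original, reworded))   # synonym replacement lowers the overlap
```

Counting words sees through the shuffled copy completely, but the reworded copy shares few exact words with the original. Closing that gap is what comparing by meaning is for.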
The REST API is designed to be easy to use: all you need is two URLs, or two pieces of text, to evaluate and you are good to go.
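As a rough sketch of what such a call might look like, the snippet below builds a request for two article URLs without sending it. The endpoint address, the `X-Api-Key` header, and the `url1`/`url2`/`text1`/`text2` field names are all placeholders assumed for illustration; the real values must come from DigitalOwl's own documentation.

```python
import json
import urllib.request

# Hypothetical placeholders: endpoint, header name, and field names are assumptions.
API_URL = "https://api.example.com/similarity"
API_KEY = "YOUR_API_KEY"

def build_similarity_request(first, second, by_url=False):
    """Build (but do not send) a POST request comparing two articles."""
    kind = "url" if by_url else "text"
    payload = {f"{kind}1": first, f"{kind}2": second}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": API_KEY},
        method="POST",
    )

request = build_similarity_request(
    "https://example.com/article-a",
    "https://example.com/article-b",
    by_url=True,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the built request to `urllib.request.urlopen` would send it; the response is expected to carry the similarity number, though its exact format is not described here.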