OCR stands for Optical Character Recognition. It reads letters and numbers from images, handwritten notes, invoices and receipts, videos, or other visual media and converts them to machine-readable text.
This is a useful technology for developers of all kinds of applications, including accounting, ecommerce and retail, law, vehicle identification, healthcare, banking, translation, and dozens of other uses. Applications typically integrate OCR services through Application Programming Interfaces, or APIs.
The best place to find suitable APIs that can enable applications to decipher text is in the ProgrammableWeb OCR category.
In this article, we detail the ten most popular OCR APIs based on reader page views on ProgrammableWeb.
1. Free OCR API
The Free OCR API parses images and multi-page PDF documents (PDF OCR) and returns extracted text in JSON format. The API can be used from any internet-connected device, including Android and iOS mobile devices and IoT devices.
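Since this API returns extracted text as JSON, the client's job is mostly to check for an error and pull the text out of each parsed page. The Python sketch below illustrates that step. The field names (`IsErroredOnProcessing`, `ErrorMessage`, `ParsedResults`, `ParsedText`) and the sample payload are assumptions made for illustration, not details given in this article, so verify them against the provider's documentation before relying on them.

```python
import json


def extract_text(response_body: str) -> str:
    """Join the text of every parsed page in an OCR JSON response.

    The field names used here are assumed for illustration; check them
    against the provider's documentation.
    """
    data = json.loads(response_body)
    if data.get("IsErroredOnProcessing"):
        raise ValueError(f"OCR failed: {data.get('ErrorMessage')}")
    pages = data.get("ParsedResults") or []
    return "\n".join(page.get("ParsedText", "") for page in pages)


# Hand-written sample payload, not real API output.
sample = json.dumps({
    "IsErroredOnProcessing": False,
    "ParsedResults": [
        {"ParsedText": "Invoice 001"},
        {"ParsedText": "Total: 42.00"},
    ],
})

print(extract_text(sample))
```

Keeping the parsing in one small function makes it easy to adjust if the real response shape differs from the one assumed here.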
2. Intento API
Intento (Inten.to) provides a single interface to several cognitive AI models and vendors. The Intento API provides JSON responses for translating text, tagging images, sentiment analysis, Optical Character Recognition (OCR), and speech transcription. For OCR, it covers technology from ABBYY, the Google Cloud Vision API, and the Microsoft Computer Vision API.
3. CAPTCHAS.io API
CAPTCHAs.IO is an automated captcha recognition service that supports more than 30,000 image captchas, audio captchas, and reCAPTCHA v2 and v3, including invisible reCAPTCHA. The CAPTCHAs.IO API provides RESTful access to all of CAPTCHAs.IO's captcha-solving methods. Developers can choose to get API responses in either JSON or plain text.
4. Russian Car Number Plate Lookup API
The Russian Car Number Plate Lookup API allows users to capture the number plate (Gos Nomer) of a Russian car with a smartphone camera and receive back the make and model, year, an indicative image of the car, a partial VIN, and whether the vehicle is wanted by the Russian Police (Gibdd). The API is designed for use in automotive websites targeted at the Russian market.
5. Google Cloud Vision API
The Google Cloud Vision API gives developers access to image recognition, processing, and analysis tools. The service classifies images into thousands of categories (e.g., "truck", "tiger", "Empire State Building") and detects individual objects and faces within images. Its OCR component can detect text in more than 50 languages, including mixed-language text, and it also deciphers handwriting. It can recognize text in several image formats, including TIFF, GIF, PDF, JPG, and animated GIF.
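As a rough illustration of how text detection is requested from the Cloud Vision REST interface, the Python sketch below builds the JSON body for an `images:annotate` call. The helper name `build_text_detection_request` and its `dense` parameter are hypothetical. The body layout and the `TEXT_DETECTION` and `DOCUMENT_TEXT_DETECTION` feature types follow Google's public REST documentation as best understood here, so confirm them against the current reference. The sketch only builds the request; sending it and authenticating are left out.

```python
import base64
import json


def build_text_detection_request(image_bytes: bytes, dense: bool = False) -> str:
    """Build a JSON body for a Cloud Vision images:annotate request.

    `dense=True` selects DOCUMENT_TEXT_DETECTION, which Google describes as
    tuned for dense documents and handwriting; otherwise TEXT_DETECTION is
    used. This helper is illustrative and not part of any Google SDK.
    """
    feature_type = "DOCUMENT_TEXT_DETECTION" if dense else "TEXT_DETECTION"
    body = {
        "requests": [
            {
                # Image content is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type}],
            }
        ]
    }
    return json.dumps(body)


# Placeholder bytes stand in for a real image file.
request_body = build_text_detection_request(b"abc")
print(request_body)
```

In a real integration the resulting string would be sent as the body of an authenticated POST request, and the recognized text would be read from the JSON response.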
6. Cloudmersive Optical Character Recognition API
Cloudmersive provides scalable computer vision and natural language APIs. The Cloudmersive Optical Character Recognition API allows users to convert scanned images of pages into recognized text. The API uses machine learning to automatically pre-process images and then recognize text in more than 90 languages. It also deskews and rotates images, and can automatically segment documents and receipts out of photographs.
7. Mathpix API
The Mathpix API enables users to solve mathematical equations via OCR technology. With the API, developers can process images containing systems of equations, matrices, long divisions, problem numbers, graphs, and geometry diagrams. The Mathpix app can be downloaded for Android and iOS. The API supports scientific notation as used in chemistry, math, physics, computer science, economics, and other STEM subjects.
8. Captcha Solutions API
Captcha Solutions is a CAPTCHA decoding web service that charges a flat rate per CAPTCHA solved. Its RESTful API is designed to solve a wide variety of CAPTCHA challenges for a broad spectrum of applications. It operates on a system that is 90% code-based OCR and 10% human team.
9. NewOCR API
The NewOCR API extracts text from JPG, PNG, GIF, BMP, TIFF, PDF, and DjVu files. It offers free and paid OCR services for 122 recognition languages and fonts. Features include selection of an area on the page for OCR, auto-rotation, page layout analysis, and recognition of mathematical equations. It accepts several input formats, including multipage PDFs and multiple images in a ZIP archive, and offers several ways to display and process the resulting text, including copy to clipboard and edit in Google Docs. It also supports poor scans and low-resolution images.
10. API Den OCR
API Den provides APIs for B2B services. The API Den OCR API gives applications OCR capabilities, including support for more than 120 languages. OCR converts images of typed, handwritten, or printed text into machine-encoded text. The service reads photos, documents, scene photos, passports, bank statements, and more, and can return results in plain text or XHTML format.