Get Smart with Image Recognition APIs

Photo APIs have long been a staple of developer applications. There are more than 350 photo APIs in the ProgrammableWeb directory and almost 800 photo mashups. However, most applications integrate with photo sharing services, like the Flickr API and Instagram API, missing the real power of photo APIs. This post identifies four ways that APIs are getting smart by using image recognition technology to find faces, words and more.

Face Detection and Facial Recognition

As humans, we're programmed to like seeing other humans. With face detection and recognition, our applications can also be programmed to notice humans—or at least their faces.

  • Face Detection APIs enable you to locate faces within photos, returning the pixel coordinates of each face.
  • Face Recognition APIs enable you to train your application to not only detect faces but recognize individuals.

In addition to detecting faces and specific people, some of these APIs will also find specific features (mouth, nose, eyes), head orientation (the angle of the head), expression (whether the mouth is open or smiling) and objects like glasses. Some even guess the individual's mood.
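To make the pixel coordinates and extra attributes concrete, here is a minimal sketch of how an application might handle a face detection response. The JSON shape and the field names (`faces`, `x`, `y`, `width`, `height`, `smiling`) are assumptions made up for illustration, not the format of any API named in this post, so check your provider's documentation for the real structure.

```python
import json
from dataclasses import dataclass


@dataclass
class Face:
    """One detected face: a bounding box in pixels plus an optional attribute."""
    x: int        # left edge of the bounding box
    y: int        # top edge of the bounding box
    width: int
    height: int
    smiling: bool

    @property
    def center(self):
        """Pixel coordinates of the middle of the bounding box."""
        return (self.x + self.width // 2, self.y + self.height // 2)


def parse_faces(response_text):
    """Turn a (hypothetical) JSON detection response into Face objects."""
    payload = json.loads(response_text)
    return [
        Face(
            x=item["x"],
            y=item["y"],
            width=item["width"],
            height=item["height"],
            # Attributes may be missing, so fall back to a default.
            smiling=item.get("smiling", False),
        )
        for item in payload.get("faces", [])
    ]


# Hand-written sample response in the assumed format.
SAMPLE_RESPONSE = """
{"faces": [
  {"x": 40, "y": 60, "width": 100, "height": 120, "smiling": true},
  {"x": 300, "y": 80, "width": 90, "height": 110}
]}
"""

if __name__ == "__main__":
    for face in parse_faces(SAMPLE_RESPONSE):
        print(face.center, "smiling" if face.smiling else "not smiling")
```

With boxes in hand, an application could crop each face, draw an outline around it or count how many people appear in a photo.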

One popular face recognition API gained 40,000 developers before it was bought and killed by Facebook. A number of alternatives still exist or have sprung up since, including the Betaface API and LambdaLabs Face API. Three more were added in November alone, most recently the Ceeq API.

Object Recognition

Recognizing everyday objects within photos is an exciting and expanding area of image recognition. It took a hit earlier this year when Yahoo bought and shuttered IQEngines.

Recognizing things in an image has a lot of applications, most notably wearable computing, as GigaOm wrote in September. AlchemyAPI is known for its textual analysis, but it shared an object recognition demo for the GigaOm story, and it's reasonable to expect that the demo might become a full-fledged product.

There are currently two object recognition APIs in the ProgrammableWeb directory: the Dextro API and CamFind API.

Brand Recognition

A subset of object recognition, and perhaps its largest use case, is brand recognition. For example, snap a picture of a wine label and receive a link to purchase the wine. Labels and logos are the low-hanging fruit, but fashion is also a common use case.

There are currently 5 brand recognition APIs specifically called out in the ProgrammableWeb directory.

Text Recognition / OCR

OCR, or optical character recognition, is perhaps the oldest photo recognition technology. Take a photo of a sign, or often an entire document, and get it converted to text. This turns sources that otherwise can't be edited, such as a printed piece of paper, into editable documents.
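The photo-in, text-out round trip can be sketched in a few lines. Everything specific here is an assumption for illustration: the endpoint URL, the base64 `image` field in the request and the `lines` list in the response are hypothetical and do not describe any particular OCR API. The sketch only builds the request and parses a hand-written sample response, without making a network call.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint; substitute the URL from your OCR provider's docs.
OCR_URL = "https://ocr.example.com/v1/recognize"


def build_ocr_request(image_bytes, url=OCR_URL):
    """Wrap raw image bytes in a JSON POST request (assumed request format)."""
    body = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_text):
    """Join the recognized lines from an (assumed) JSON response into one string."""
    payload = json.loads(response_text)
    return "\n".join(line["text"] for line in payload.get("lines", []))


if __name__ == "__main__":
    request = build_ocr_request(b"fake image bytes")
    print(request.get_method(), request.full_url)
    # Sending is left out on purpose: urllib.request.urlopen(request)
    # would perform the actual call against a real service.
    sample = '{"lines": [{"text": "MAIN ST"}, {"text": "NO PARKING"}]}'
    print(extract_text(sample))
```

Real services differ in how they accept images (multipart upload, image URL or base64) and in how much layout detail they return, so treat the shapes above as placeholders.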

There are 10 OCR APIs in the ProgrammableWeb directory, including the Full Contact Card Reader API, which is specifically tuned for processing contact information from business cards.

That's four ways that image APIs are getting smarter. Have any others to add?

Adam DuVander is Developer Communications Director for SendGrid and Contributing Editor of ProgrammableWeb. Previously he edited this site and wrote for Wired. You can follow him on Twitter.
