This article was published as a part of ProgrammableWeb's Sponsored Content Program. The opinions expressed here are those of the underwriter and do not necessarily reflect the views of ProgrammableWeb or its editorial staff. For more information regarding ProgrammableWeb's Sponsored Content Program, please consult our FAQ.
Image recognition is not impossible, but it isn’t easy.
The goal of image recognition is to extract useful information from images. Recognizing images has proven to be one of the most difficult challenges the computer industry has faced. The apparent ease with which we humans recognize images and scenes belies the massive effort required to create a computer program that achieves the same outcome. While image recognition seems easy to us, consider that more of the human brain is devoted to vision than anything else. Vision is an amazing feat of natural intelligence.
Today, with social media and mobile phones with cameras, we are producing snowballing amounts of digital images and videos. This explosive volume creates significant opportunities to recognize and tag digital assets, creating a foundation for a vast array of layered services.
Companies, and especially their clients and users, often do not know how to find and access media they have created or acquired and saved. There is simply too much content to organize and analyze manually. Providers of image recognition solutions can help these companies and their users understand and use that content better, whether the content is images or video.
Understanding the content within an image is incredibly valuable when considering all the potential uses of this knowledge. The tagging and classifying of the data helps with consistency of definitions in semantic relationships. This data helps in providing clarity of relationships and origination, providing additional context and enterprise knowledge. Ultimately, it is important that this data is understandable, sharable and accessible over time.
For example, an important opportunity for developers, and a key application, is the management of self-produced content. Today people have multiple collections of personal digital pictures. These pictures must be stored, shared, sorted and retrieved based on people or items in the images and their context. Pictures may be tagged when saved or indexed in context when users formulate a specific request.
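The idea of tagging pictures when they are saved and retrieving them later by people, items or context can be sketched as a small inverted index. This is only an illustrative sketch: the `PhotoIndex` class, its method names and the sample tags are hypothetical, and the tags are assumed to come from some recognition step that is not shown here.

```python
from collections import defaultdict


class PhotoIndex:
    """Maps tags (people, objects, context) to the photos that carry them.

    Hypothetical helper for illustration; tags are assumed to be produced
    elsewhere, for example by an image recognition service.
    """

    def __init__(self):
        self._photos_by_tag = defaultdict(set)

    def add(self, photo_id, tags):
        # Normalize tags so "Dog" and " dog" index to the same key.
        for tag in tags:
            self._photos_by_tag[tag.strip().lower()].add(photo_id)

    def find(self, *tags):
        """Return the ids of photos that carry every requested tag, sorted."""
        if not tags:
            return []
        matches = [
            self._photos_by_tag.get(tag.strip().lower(), set()) for tag in tags
        ]
        return sorted(set.intersection(*matches))


index = PhotoIndex()
index.add("IMG_001.jpg", ["Alice", "beach", "dog"])
index.add("IMG_002.jpg", ["alice", "birthday cake"])
index.add("IMG_003.jpg", ["Bob", "beach"])

print(index.find("alice", "beach"))  # photos tagged with both "alice" and "beach"
print(index.find("beach"))           # every photo tagged "beach"
```

Indexing at save time keeps each lookup to a few set intersections, which is one reasonable way to support the "sorted and retrieved" requirement as collections grow.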
Object detection and recognition techniques identify faces, houses, trees, buildings or other smaller objects while recognizing human beings, animals or scenes within the overall image. Key object recognition provides useful tagging for retrieving information in relative and logical context.
As image recognition technology has emerged, numerous uses and services have become available across almost every industry. Computer vision is useful in many areas, including safety, health, security, comfort, fun and access. Specific examples include:
- object recognition
- search and e-commerce
- industrial automation and inspection
- surveillance and security
- visual geolocation
- face recognition, people, eye and head tracking
- film and video: sports analysis (sport vision)
- astronomy and outer space applications
- medical and biomedical image analysis
- autonomous vehicles, automotive driver assistance, road surveying and traffic management
- panoramic photography
- mobile, web and cloud applications
- gesture recognition
- games, virtual reality, augmented reality
- and much more …
How you can apply this technology within your business or industry
First, consider some generalized, broadly applicable business processes that could benefit from image recognition. They might include business intelligence and knowledge management, content management, enterprise resource planning, portfolio management, or customer relationship management. These processes span information, operations and management. From moving or processing data to policies and procedures to team and organizational dynamics, visual computing can substantially increase the intelligence available to any organization.
There are also industry-specific business processes that can be enhanced with image recognition. The following industries derive definitive benefits from advancements in image recognition:
- e-commerce, advertising and search (object recognition, content based image retrieval)
- mobile and wearable technologies (eye tracking, object recognition, geocaching, virtual reality, augmented reality)
- gaming (e.g., MS Kinect tracks body movement in real time, and more)
- automotive (wide variety of applications)
- academia/government (grant-funded research)
- financial services (investment banking)
- healthcare (patient health records)
- libraries/archives (digitization)
- pharmaceuticals (clinical drug trials)
- semiconductors (microprocessor design)
- broadcast television (asset management, archival and retrieval)
E-commerce, advertising and search
Image recognition is creating a new realm of mobile commerce. Today, image recognition technology can be used in mobile applications to identify specific products, providing potential customers with a significantly more interactive view of the world around them while making everything they see searchable and therefore buyable. The world at large is your virtual showroom.
The foundational layer for creating unique solutions across any industry involves fundamental image recognition. The company that has achieved the highest accuracy levels, above and beyond all competitors in its space—including Google—is Image Searcher Inc., a hot startup in Los Angeles. Image Searcher’s recently announced CamFind API has received a warm welcome from many developers.
CamFind’s image recognition technology identifies objects like shoes, handbags, sunglasses and watches in photos taken with a tablet or smartphone. This technology makes a new form of mobile commerce possible today: shoppers can see products while they are in use and get search results and e-commerce purchasing options on the spot. Imagine attending a social gathering, walking through an airport, or visiting another country where you see a beautiful handbag or any other item that you’d like to have. You pull out your smartphone and take a picture, and the app returns not only an identification of the object in the photo but also search results, including links to purchase the product.
Developers today can build a layer of services on top of the CamFind image recognition layer to produce their own mobile commerce applications. For example, a retailer of mountain bikes could offer an app that would recognize a particular bicycle out on the trail, or an online retailer of computer accessories might offer an application that recognizes component makes and models so shoppers could order parts, download printer drivers or even see user manuals.
Similarly, the prospective buyer can conduct live price comparisons. For example, say a potential customer in a store wants to find out if a product is a good deal. He or she would simply snap a picture of the product and the developer’s solution will give the user the knowledge necessary to make the best buy.
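A price comparison of this kind might look roughly like the sketch below. It assumes the recognition step has already produced a product name and that some other source supplies a list of offers for it; the `Offer` type, the seller names and the prices are all hypothetical, and nothing here reflects the actual response format of the CamFind API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One seller's price for the recognized product (hypothetical data shape)."""
    seller: str
    price: float           # item price, all offers in the same currency
    shipping: float = 0.0  # shipping cost, zero when not applicable

    @property
    def total(self):
        return self.price + self.shipping


def best_offer(offers):
    """Return the offer with the lowest total cost, or None if there are none."""
    return min(offers, key=lambda offer: offer.total, default=None)


def is_good_deal(shelf_price, offers):
    """True when the in-store price is no higher than the cheapest known offer."""
    cheapest = best_offer(offers)
    return cheapest is None or shelf_price <= cheapest.total


# Made-up offers for a product the shopper has just photographed.
offers = [
    Offer("Seller A", 49.99, shipping=5.00),
    Offer("Seller B", 52.00),
    Offer("Seller C", 47.50, shipping=9.00),
]

print(best_offer(offers).seller)
print(is_good_deal(50.00, offers))
```

Comparing total cost rather than list price is a deliberate choice in this sketch, since a low list price with high shipping is not necessarily the best buy.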
This technology exists today. You can test it with the CamFind mobile app (on both iOS and Android) and access the identification services with the CamFind API. Who will build the first advertising model for search results based on photos taken?
The emerging field of wearables
Google was the first company to prominently announce a wearable computer: Google Glass. Although it has not yet been released for sale to the public, it has grown quite popular and attracted a tremendous amount of publicity. Google Glass has been available to developers who were part of the Explorer Program, and Google has published guidelines on how to develop for Glass. Meanwhile, a variety of hardware manufacturers are planning to introduce their own glasses and other wearable devices.
There is speculation that Microsoft, Sony and Apple are working on their own versions, as well as Oakley and a host of companies you have never heard of. According to The Korea Times, Samsung will show off glasses at this year's IFA trade show in Berlin, which runs from Sept. 5-10.
One very interesting solution, called iPal, is currently featured on IndieGoGo. With iPal, you can capture pictures and videos with your eyes and share them with a blink or a wink. From an imaging perspective, Google Glass is only a head-tracking camera that has no idea what you’re looking at, and pointing it properly is harder than pointing a smartphone with your hands.
By contrast, iPal is the first-ever smart glass with eye tracking and eye gesture controls. iPal sees what your eyes see and do. No other camera offers this convenience. iPal can see a scene in wide or narrow angle. Additionally, it can automatically zoom in on what you pay attention to. Your eyes are the most convenient and efficient tool to capture images or video, and you can also use eye gestures to share what you capture on Instagram, Facebook and other social media.
Gaming
The world of gaming will be revolutionized by image recognition and computer vision technology. In fact, this revolution is already well on its way. The Microsoft Kinect motion sensor tracks the human body in real time and holds the Guinness World Record for the fastest-selling consumer electronics device.
As game play begins to move off the device and into the real world, image recognition holds significant promise for generating new types of user experiences and user interfaces. When image technologies are combined with geotargeting and in-app purchasing, search-based commerce and advertising can also move into the real world, opening the door to AdWords-sized, off-device business opportunities.
The automotive industry
Fully automated cars are becoming a reality. Visual computing and image recognition are central to the artificial intelligence that drives the car. The automotive industry is actively exploring innovative technologies to gain competitive advantage. On June 4-5, Telematics Detroit, the world's largest forum dedicated to the future of connected auto mobility, will address the crucial technologies and business questions shaping telematics as connectivity enables a new paradigm for auto mobility.
Have you test-driven a car with a lane departure warning using white line detection? Perhaps your car provides detection of obstacles in front of it using stereo images or includes pedestrian detection and warning using infrared images? Vehicle safety is continuously being enhanced, preventing potential traffic accidents by analyzing and predicting the motion of pedestrians or other cars. The technology is even capable of reading road signs and stop lights. Don’t be surprised if the car begins to warn you of proximity to guardrails or crosswalks. The possibilities are merely beginning to emerge.
Future uses and benefits
The image recognition industry is growing incredibly fast. According to MarketsandMarkets, a global market research and consulting company based in the U.S., the image recognition market will be worth $25.65 billion by 2019, up from $9.65 billion today.
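To put that forecast in perspective, the implied compound annual growth rate can be worked out with a few lines of arithmetic. The sketch below assumes the forecast spans five years from "today" to 2019; that horizon is an assumption made for illustration, not a figure stated above, and the function name is our own.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of dollars, taken from the forecast quoted above.
# ASSUMPTION: a five-year horizon between "today" and 2019.
rate = compound_annual_growth_rate(9.65, 25.65, years=5)
print(f"{rate:.1%}")  # about 21.6% per year under the five-year assumption
```

A shorter or longer horizon would change the result, so the rate should be read as a rough indication of how fast the market would need to grow to reach the forecast.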
With ever-increasing attention being brought to the world of computer vision, coupled with major leaps in computer processing, the future of image recognition is very interesting. It is certain to open doors for uses which have not yet been imagined.
This industry is still in its infancy, yet the solutions already available today are changing the way humanity interacts with computers. User experience continues to become more and more adaptive to human nature. Every advancement provides an exponential array of new solutions. We can hardly imagine where this technology can take us.