This guest post comes from Knarig Arabshian, Member of Technical Staff at Bell Labs, Alcatel-Lucent in Murray Hill, NJ.
Knarig is interested in creating Semantic Web technologies toward establishing a personalized network of services. She does research work in context-aware computing, service discovery and composition. For more information about her work, you can take a look at her site at http://www.knarig.com or contact her at firstname.lastname@example.org.
Alcatel-Lucent is the parent company of ProgrammableWeb.
Service discovery is not an easy task on today's Web. Discovering an API requires searching through a large number of services on the Internet and then reading pages of documentation to figure out how to use the ones that may match your application. This is the case on ProgrammableWeb as well. The API directory lists over 5000 APIs, which are manually categorized into over 50 service categories.
And that's not all: around 80 new APIs come in per week, and each of these is manually identified and placed in a single service category. So how can we improve search and classification?
One way is to use an ontology to describe the APIs. An ontology is a shared understanding of a domain of interest. It describes the meaning, or semantics, of a domain with a formal model so that machines can understand the data and process it automatically. The main components of an ontology are classes, properties and individuals. A class represents a concept in the domain, properties represent the attributes of that concept, and individuals are the actual instances that belong to the class. Think of how object orientation works: a class contains variables and methods, and instances are concrete objects of that class. The analogy with an ontology would be: OO Class → Ontology Class; OO Variable → Ontology Property; OO Instance → Ontology Individual.
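To make the analogy concrete, here is a minimal Python sketch of the three building blocks. It is only an illustration, not how OWL or any ontology tool stores data; the names OntClass, Individual and Photo_Service are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OntClass:
    """An ontology class: a named concept in the domain."""
    name: str


@dataclass
class Individual:
    """An ontology individual: a concrete instance of one or more classes."""
    name: str
    types: set = field(default_factory=set)         # classes it is asserted to belong to
    properties: dict = field(default_factory=dict)  # property name -> set of values


# Hypothetical class and individual, for illustration only.
photo_service = OntClass("Photo_Service")
flickr = Individual("Flickr", types={photo_service})
flickr.properties.setdefault("hasFeature", set()).add("photo sharing")

print(flickr.name, "is a", [c.name for c in flickr.types])
print("hasFeature:", sorted(flickr.properties["hasFeature"]))
```

One place the analogy stretches: unlike a typical OO instance, an individual here keeps a set of types, so it can belong to several classes at once.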
However, there is something an ontology provides that object orientation and databases do not, and that is reasoning. Reasoning allows new information to be inferred from the information already in the ontology. Since ontologies are based on description logics, which provide a subset of first-order logic axioms, they can represent class relationships such as equivalence, subclass, superclass, or disjointness. An ontology class can also have existential or universal restrictions placed on its properties. A reasoner processes these axioms and infers relationships that may not have been explicitly stated. For example, a reasoner can infer that one class is equivalent to another or that an instance belongs to one or more classes.
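The simplest kind of inference described above, following subclass axioms to find every class an individual belongs to, can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description logic reasoner: it handles only subclass axioms, and the class Social_Photo_Service is hypothetical.

```python
def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass axioms."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted, subclass_of):
    """Combine the asserted classes with all of their inferred superclasses."""
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls, subclass_of)
    return result


# Subclass axioms: child class -> set of direct parent classes.
# Social_Photo_Service is a hypothetical class used only for this sketch.
subclass_of = {
    "Social_Photo_Service": {"Social_Service", "Photo_Service"},
    "Social_Service": {"PWService"},
    "Photo_Service": {"PWService"},
}

# Only one class is stated explicitly; the rest are inferred.
flickr_types = inferred_types({"Social_Photo_Service"}, subclass_of)
print(sorted(flickr_types))
```

Only one membership is asserted for the individual, yet it ends up in four classes. A real OWL reasoner does the same kind of work over a much richer set of axioms, including equivalence, disjointness and property restrictions.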
Taking an example directly from the ProgrammableWeb services, we see that each API is manually placed in a single category, even if it has a set of attributes that matches two categories. However, if we use an ontology, the APIs will be classified in all the categories that they logically belong in. The diagram below shows an example of such a classification. Services such as Flickr, Facebook and Picasa provide both social media and photo services; Tripadvisor provides a travel service with social media capabilities; and so on.
So the idea here is that if we describe ProgrammableWeb APIs with high-level attributes using an ontology, we can enable automatic classification of services as well as improve service discovery because registration and querying will be done based on the service attributes.
The main problem we had to tackle was to create a high-level ontology that described ProgrammableWeb services. Typically, a domain expert creates an ontology for a domain. ProgrammableWeb, however, has APIs from a broad set of domains. We did not have access to experts from each domain, so we took a different approach, creating a tool that uses a combination of web-based domain knowledge, the ProgrammableWeb directory of API listings, and Natural Language Processing (NLP) to identify top terms and interesting phrases for each ProgrammableWeb category and API. We call our tool LexOnt. It is a plugin for Protege, the de facto ontology editor. The result is an ontology specified in the W3C’s Web Ontology Language (OWL). For further information on LexOnt, you can take a look at a paper we published at the AAAI Spring Symposium for Intelligent Services meet Social Web or watch a video of our demo.
With LexOnt, we created a number of high-level features that describe the APIs and, consequently, the service categories. Below are a few snapshots of the PW ontology. The PW service classes are created as subclasses of the PWService class, and each is renamed with “_Service” appended to signify that it is a service class.
Every service category has a hasFeature property that takes values from the Feature class. The Feature class describes high-level features of a given service category. The example below shows a number of high-level features of the Advertising service category.
With this ontology description, we can now represent APIs as ontology individuals. Let's take two API examples from ProgrammableWeb. 140 Proof is categorized in the Advertising class and Badgeville is categorized in the Social class. However, when analyzing their significant terms and phrases, we saw that both had social and advertising features; that is, they were social advertising APIs. So we created a Social_Advertising_Feature class as a subclass of both Advertising_Feature and Social_Feature and set it as the value of each API's hasFeature property. With this assignment, both APIs are automatically categorized in both the Advertising_Service and Social_Service classes.
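The classification of these two APIs can be mimicked with a small Python sketch. The feature hierarchy comes from the paragraph above, but the rule that a service class contains every API with at least one hasFeature value of the matching feature class is our simplifying assumption about how the class definitions work; the dictionary and function names are hypothetical.

```python
# Feature hierarchy: child feature class -> direct parent feature classes.
FEATURE_PARENTS = {
    "Social_Advertising_Feature": {"Advertising_Feature", "Social_Feature"},
    "Advertising_Feature": {"Feature"},
    "Social_Feature": {"Feature"},
}

# Assumed definitions: a service class holds any API that has
# at least one hasFeature value belonging to the given feature class.
SERVICE_DEFINITIONS = {
    "Advertising_Service": "Advertising_Feature",
    "Social_Service": "Social_Feature",
}


def ancestors_and_self(feature):
    """Return the feature class together with all of its superclasses."""
    seen = {feature}
    stack = [feature]
    while stack:
        for parent in FEATURE_PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classify(has_feature):
    """Return the service classes an API falls into, given its hasFeature values."""
    feature_types = set()
    for feature in has_feature:
        feature_types |= ancestors_and_self(feature)
    return {svc for svc, feat in SERVICE_DEFINITIONS.items() if feat in feature_types}


apis = {
    "140 Proof": {"Social_Advertising_Feature"},
    "Badgeville": {"Social_Advertising_Feature"},
}
for name, features in apis.items():
    print(name, "->", sorted(classify(features)))
```

Neither API is assigned to a service class directly. Each gets a single feature value, and membership in both service classes follows from the feature hierarchy.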
We have so far completed the ontology description for five service classes: Advertising, Social, Travel, Real-Estate and Utility. Next, with this description, we plan to generate a form-like interface so that users can discover services with semantic queries. Users will be able to issue queries such as "find me an advertising service for social networks" or "find me a social networking service for book sharing" and get back a filtered, specific set of results.
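As a rough idea of what such a query could look like underneath, the Python sketch below treats a query as a set of required feature classes and returns the APIs whose features satisfy all of them. The planned interface is not built yet, so everything here is an assumption: the find_services function, the registry layout, and the two "Example" APIs are hypothetical.

```python
# Feature hierarchy: child feature class -> direct parent feature classes.
FEATURE_PARENTS = {
    "Social_Advertising_Feature": {"Advertising_Feature", "Social_Feature"},
    "Advertising_Feature": {"Feature"},
    "Social_Feature": {"Feature"},
    "Travel_Feature": {"Feature"},
}


def expand(features):
    """Return the given feature classes plus all of their superclasses."""
    seen = set()
    stack = list(features)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(FEATURE_PARENTS.get(current, ()))
    return seen


def find_services(registry, required_features):
    """Return names of APIs whose features cover every required feature class."""
    required = set(required_features)
    return sorted(
        name for name, features in registry.items() if required <= expand(features)
    )


registry = {
    "140 Proof": {"Social_Advertising_Feature"},
    "Badgeville": {"Social_Advertising_Feature"},
    "Example Banner API": {"Advertising_Feature"},  # hypothetical entry
    "Example Trip API": {"Travel_Feature"},         # hypothetical entry
}

# "Find me an advertising service for social networks."
print(find_services(registry, {"Advertising_Feature", "Social_Feature"}))
```

Asking for both advertising and social features returns only the two social advertising APIs, while asking for advertising alone would also return the plain advertising entry.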