New York Times API Recap

Bill Albright
Apr. 03 2009, 12:45AM EDT

The tagline of the Times Open blog is "All the news that's fit to printf()", and that clever play on the paper's motto gains more credence with each new API released. In February the Times introduced the Newswire API, which "provides an up-to-the-minute stream of published items" from the paper of record, and the New York State Legislature API for tracking the political maneuverings in Albany.

This brings to nine the number of different APIs in the Times Developer Network, and all have been introduced in the last six months. The use of APIs by news organizations has risen rapidly in the past year as media companies have focused on their Internet businesses in the transition from print. The APIs of the Times fall into four basic categories: news remixing, niche topics, civic data, and social activity.

News remixing: stream of recent articles, article search, and tag categorization

The Newswire API is a significant step forward in that it reports each article, with its associated multimedia assets and categorization metadata, as soon as it is published on nytimes.com. It was announced at the first Times Open API conference in February (see Tim O'Reilly's keynote slides here). The stream excludes any content that doesn't originate with the Times, such as wire service stories, and a future version will include blog posts. The API call does not return the full text of an article, but the summary and metadata include the headline, abstract, section, column, and faceted terms as described in the TimesTags API. The Newswire API offers no filtering by keyword, only by how recently the article was published.
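As a rough illustration of recency-only filtering, the sketch below takes a list of already-fetched items and keeps those published within a given window. The field names (`headline`, `section`, `published`), the timestamp format, and the sample data are assumptions made for this example, not the API's documented response schema.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"  # assumed timestamp format

# Hypothetical items, shaped loosely like a summary-plus-metadata response.
SAMPLE_ITEMS = [
    {"headline": "Film Review", "section": "Movies",
     "published": "2009-04-02T10:15:00"},
    {"headline": "Older Story", "section": "World",
     "published": "2009-04-01T09:00:00"},
    {"headline": "Albany Budget Vote", "section": "N.Y. / Region",
     "published": "2009-04-02T11:30:00"},
]

def recent_items(items, now, max_age_hours):
    """Return items published within max_age_hours of `now`, newest first."""
    cutoff = now - timedelta(hours=max_age_hours)
    fresh = [
        item for item in items
        if datetime.strptime(item["published"], TIME_FORMAT) >= cutoff
    ]
    # Parse again for sorting so the order does not depend on string layout.
    return sorted(
        fresh,
        key=lambda item: datetime.strptime(item["published"], TIME_FORMAT),
        reverse=True,
    )
```

Calling `recent_items(SAMPLE_ITEMS, datetime(2009, 4, 2, 12, 0, 0), 2)` keeps only the two items from the last two hours and drops the older story.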

With the Article Search API a developer can mine all the articles published in the paper and website back to 1981, using keyword and faceted search. Like the Newswire API, the Article Search API doesn't return full text, but it does include a portion of the text and a link to the online article, along with a full set of summary and categorization information.
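A keyword-plus-facet query of this kind is usually expressed as a URL with query-string parameters. The sketch below shows one way to assemble such a URL. The base URL is a placeholder, and the parameter names (`query`, `facets`, `api-key`) and the facet name in the usage note are assumptions for illustration; check the Times documentation for the real endpoint and parameters.

```python
from urllib.parse import urlencode

# Placeholder endpoint, not the real Times URL.
BASE_URL = "http://api.example.com/article-search"

def build_search_url(keywords, facets=None, api_key="YOUR_API_KEY"):
    """Build a search URL from keywords and optional facet names.

    Parameter names are hypothetical; only the general shape
    (keywords + facets + key) follows the article's description.
    """
    params = [("query", " ".join(keywords))]
    if facets:
        params.append(("facets", ",".join(facets)))
    params.append(("api-key", api_key))
    # urlencode escapes spaces, commas, and other reserved characters.
    return BASE_URL + "?" + urlencode(params)
```

For example, `build_search_url(["albany", "budget"], ["geo_facet"], "KEY")` produces a single URL that a client could then fetch and parse as JSON.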

The TimesTags API opens the taxonomy of 27,000 tags used to identify Times Topics. This classification system is organized into four dictionaries: descriptive, people, organizations, and geography. Dave Winer's list of recent topics shows a sample of the kinds of individual and topic tags returned. Programmers can use the API standalone to search for terms based on character strings in one or more dictionaries, or as an input to the faceted search of the Article Search API, as described by Ian Kennedy.

[Image: Tim Schwartz's Command Center]

Mashups that remix text include Aaron Straup's topic connections, Taylor Barstow's Times Explorer, and my own test project the Suburbified real estate app. Designers and artists have mined the data in novel ways: Jer Thorp has developed year and keyword Processing visualizations, and a news alert even a dog can respond to. Tim Schwartz's world visualization and very cool sculpture put geography and language into historical context.

Niche topics: books and movies

With the Best Sellers API a developer can query the different Times book bestseller lists for specific titles, authors, or publishers, and get full rank history for any titles.

The Movie Reviews API allows for searching reviews by keyword or reviewer, and limiting results to those that are Times Critics' Picks or on the Times' list of best 1000 films.

In the future more topics will be serviced by their own API, possibly including event listings, weddings, and real estate. Some notable mashups mix movie reviews with Netflix and best-seller info with Amazon.

Civic data: campaigns, votes, and committees

The Campaign Finance API is based on public data from the Federal Election Commission about election contributions and expenditures, aggregated into useful segments by candidate, zip code, state, and individual donor. The API currently covers only presidential campaign data, but future versions will include House and Senate information as well.

The Congress API gives access to member information and voting history for both houses of Congress. The member information includes lists of members, committees, roles, and biographical background. For each legislator a developer can retrieve roll-call votes (not voice or division votes, which Congress doesn't record), missed votes, and the number of times the individual voted with their party.
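To make the missed-vote and voted-with-party figures concrete, here is a minimal sketch of how they could be derived from a list of roll-call records. The record layout (`member_vote`, `party_majority_vote`, with `None` marking a missed vote) is hypothetical and does not reflect the API's actual response format.

```python
def summarize_votes(records):
    """Count missed votes and votes cast with the party majority.

    Each record is a dict with hypothetical keys:
      member_vote         -- "Yes", "No", or None if the vote was missed
      party_majority_vote -- how most of the member's party voted
    """
    cast = [r for r in records if r["member_vote"] is not None]
    missed = len(records) - len(cast)
    with_party = sum(
        1 for r in cast if r["member_vote"] == r["party_majority_vote"]
    )
    # Percentage is taken over votes actually cast, not over all roll calls.
    pct = 100.0 * with_party / len(cast) if cast else 0.0
    return {"missed": missed, "with_party": with_party, "with_party_pct": pct}

# Made-up roll-call records for illustration only.
SAMPLE_VOTES = [
    {"member_vote": "Yes", "party_majority_vote": "Yes"},
    {"member_vote": "No", "party_majority_vote": "Yes"},
    {"member_vote": None, "party_majority_vote": "Yes"},
    {"member_vote": "No", "party_majority_vote": "No"},
    {"member_vote": "Yes", "party_majority_vote": "Yes"},
]
```

With the sample data, one of five roll calls is missed and three of the four votes cast match the party majority, so `summarize_votes(SAMPLE_VOTES)` reports a 75 percent party-line rate.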

The New York State Legislature API extends to the Empire State (once said to be run by "three men in a room") the same type of civic-minded information as the Congress API (member details and committee information for Senate and Assembly) minus the voting data, which will be added in a later release.

The Times site has prototypes of the use of these civic APIs including representation for New York City citizens and campaign finance visualizations.

Social activity: profiles, activities, and comments

TimesPeople is the paper's quasi-social network, where a user can recommend articles, comment, and develop a profile. The TimesPeople API gives programmatic access to the profiles, newsfeed, activities, and follow network of the users.

The Community API returns comments made by online users by date, user, or URL of the article being commented on. A future release will include reviews, suggestions and ratings.

One example of a social mashup is the TimesPeople activity gadget that can be used in Google environments to track recommendations or other activity from a member's network or the general public.

Terms and Technology

Each of the nine Times APIs returns JSON, and some have XML and serialized PHP formats as well. Apps must show proper attribution for the use of the content, and are subject to daily rate limits. The Times has also released under open-source licenses some of the tools behind its API platform.
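Because requests are capped per day, a client may want to track its own usage rather than wait for the server to refuse a call. The sketch below is one simple way to do that. The limit value is whatever the caller passes in; the article does not state the actual limits, and the class name and method are illustrative, not part of any Times library.

```python
from datetime import date

class DailyQuota:
    """Client-side counter that allows at most `limit` calls per calendar day."""

    def __init__(self, limit):
        self.limit = limit
        self.day = None   # day the current count belongs to
        self.used = 0     # calls made so far on that day

    def try_acquire(self, today):
        """Return True and count the call if quota remains for `today`."""
        if today != self.day:
            # First call on a new day: reset the counter.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

A caller would check `quota.try_acquire(date.today())` before each request and skip or defer the request when it returns `False`.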

The civic APIs are the only ones that can be used for commercial purposes. All the rest are for non-commercial use. This contrasts with The Guardian's recently released API, which returns the full text of each article and can be used in a commercial context, with a requirement to join the Guardian's ad network later on. However, the Guardian's API is for now restricted to selected developers, while the Times APIs are open to all. The Times has also indicated that it will develop commercial terms as more information is gathered from the open program.

For more on the use of APIs in the dissemination and monetization of news, see commentary by Jeff Jarvis, Felix Salmon, and the Daylife blog.

With the API initiatives only six months old, there is sure to be continued innovation in the types of data, the usage, and the business models to support the open efforts. On ProgrammableWeb you can follow the news API topic and sign up for watch alerts on individual APIs. Here are our profiles on all the APIs:

Article Search API
Best Sellers API
Campaign Finance API
Community API
Congress API
Movie Reviews API
Newswire API
NY State Legislature API
TimesPeople API

