New York Times Gives You 2.8 Million Articles via an API

John Musser
Feb. 05 2009, 03:26AM EST

The big API day is finally here for the New York Times. After launching a series of interesting and useful APIs since last fall, covering everything from campaign finance to movie reviews, they've now released their most important API to date: their Article Search API. With this new web service, developers now have access to 2.8 million articles from the paper of record, dating from 1981 through today (our Article Search profile). This API should be a terrific source for a wide range of mashups and third-party applications.

Their announcement gives some good details about what you can do:

  • Find recipes that have an associated thumbnail image
  • Find the first occurrence of "internet"
  • See instances of the phrase "job loss" by month for 2008
  • Search for the phrase "stock market" in all articles that are marked as a review in the Books section
  • Find articles that mention "Iraq" in the title and have related multimedia
  • Identify articles that appeared on the front page and mentioned "coffee prices"

It's a RESTful API that returns data as JSON (XML support coming later). It supports approximately 35 searchable fields, many of which are outlined in this illustration:

[Illustration: searchable fields of the New York Times Article Search API]
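
Since the API is RESTful and returns JSON, a search request amounts to an HTTP GET with the search terms in the query string. The sketch below only builds such a URL and makes no network call. The base URL is a placeholder, and the `format`, `query`, and `api-key` parameter names are assumptions for illustration, so check them against the Times' documentation before relying on them.

```python
from urllib.parse import urlencode

# Placeholder endpoint (hypothetical); substitute the documented URL.
BASE_URL = "http://api.example.com/article-search"


def build_search_url(query, api_key, base_url=BASE_URL):
    """Return a GET URL for one search request.

    The parameter names used here are assumed, not taken from the docs.
    """
    params = {"format": "json", "query": query, "api-key": api_key}
    # urlencode escapes spaces and quotes so the query is URL-safe.
    return base_url + "?" + urlencode(params)


url = build_search_url("stock market", "YOUR_API_KEY")
```

The resulting URL could then be fetched with any HTTP client and the body decoded with `json.loads`.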

As the name implies, it's a "search API", so naturally there are many types of queries available:

  • Date range: all articles from X date to Y date
  • Field search: search within any number of given fields, e.g., title:obama byline:dowd
  • Conjunction and negation (AND and NOT) operations, e.g., baseball yankees -"red sox"
  • Ordering by closest (variable ranking algorithms), newest and oldest
  • Faceted searching

The Times is often treated as emblematic of the challenges the newspaper industry is facing. By opening up this new distribution channel, it shows that it is looking at innovative ways to leverage its valuable content in this new platform economy.


Comments


shu

According to their API documentation, complete articles are not available. The API will pull "headlines, abstracts, lead paragraphs and links to associated multimedia." Will users have to pay to see the entire article?