How to Access Any RESTful API Using the R Language

For this API request, one of the pieces of information you received in the original GET request was the number of pages. So, let's initialize the pages variable with that value:

pages <- get_prices_json$total_pages

That code selects the "total_pages" item from the list of data returned by the aforementioned API call. The following code is a for loop that gets each page of data:

# Assumes httr and jsonlite are loaded, and that base, endpoint, stock, username, password, and get_prices_df were created in the earlier steps
for (i in seq_len(pages)[-1]) {  # pages 2 through the last page; the loop is skipped if there is only one page

  # Build an API call that has page_number= at the end. This increments by 1 in each loop until you have all pages
  call_2 <- paste(base, endpoint, "?", "ticker", "=", stock, "&", "page_number=", i, sep = "")

  # Make the API call
  get_prices_2 <- GET(call_2, authenticate(username, password, type = "basic"))

  # Extract the response body as JSON text
  get_prices_text_2 <- content(get_prices_2, "text", encoding = "UTF-8")

  # Convert the JSON text to a list you can use. One item of the list is the data; the rest is information about the API call
  get_prices_json_2 <- fromJSON(get_prices_text_2, flatten = TRUE)

  # Convert the list to a data frame
  get_prices_df_2 <- as.data.frame(get_prices_json_2)

  # Add the new page to the existing data frame and repeat
  get_prices_df <- rbind(get_prices_df, get_prices_df_2)

}
Figure 6. In this example, there are 93 pages of data, so you'll make 93 API calls to get each page.

The loop starts with the second page of data by adding page_number=2 to the end of the request URL, and appends the second page to the first page using rbind. Then it repeats until all of the pages have been returned. This leaves you with a single, tidy data frame containing all of the data you requested, ready for R to analyze.

Figure 7. Notice that the data frame has grown from 100 rows to more than 9,000 as each page of data has been added.
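
Calling rbind inside the loop copies the growing data frame on every pass, which can get slow as the page count rises. One common alternative is to collect the pages in a list and bind them once at the end. The following is only a sketch of that idea, not the method used above: fetch_page() is a hypothetical stand-in for the GET, content, and fromJSON steps, and it returns made-up rows so the example runs without an API key.

```r
# Hypothetical stand-in for the GET -> content -> fromJSON steps.
# It fabricates two rows per page so the sketch runs offline.
fetch_page <- function(page_number) {
  data.frame(
    page = page_number,
    row_in_page = 1:2
  )
}

pages <- 4  # in the article this comes from get_prices_json$total_pages

# Pre-allocate a list with one slot per page, fill it, then bind once
page_list <- vector("list", pages)
for (i in seq_len(pages)) {
  page_list[[i]] <- fetch_page(i)
}
all_pages_df <- do.call(rbind, page_list)

nrow(all_pages_df)
```

With a real API, the body of fetch_page() would be the same three calls used in the loop above, with the page number pasted into the URL.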

Applying the same methodology to other APIs

The basic process for accessing many RESTful APIs is the same: Use the httr and jsonlite packages to make a GET request, parse the results, and page through all of the data. This requires extracting the raw response from the GET request as JSON text and then parsing that text into a data frame. The main difference in methodology across APIs is that some APIs have a different approach to paging.
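
The parsing step can be tried without calling any API. The sketch below runs fromJSON on a made-up response body; apart from total_pages, the field names (data, current_page, date, close) and all of the values are assumptions for illustration, and parse_page() is a hypothetical helper rather than part of httr or jsonlite.

```r
library(jsonlite)

# Made-up response body: a data array plus paging information.
# Field names other than total_pages are assumptions.
sample_text <- '{
  "data": [
    {"date": "2017-01-02", "close": 10.5},
    {"date": "2017-01-03", "close": 11.0}
  ],
  "current_page": 1,
  "total_pages": 93
}'

# Hypothetical helper: parse the JSON text and separate the rows
# from the paging information
parse_page <- function(json_text) {
  parsed <- fromJSON(json_text, flatten = TRUE)
  list(
    data = as.data.frame(parsed$data),
    total_pages = parsed$total_pages
  )
}

page_one <- parse_page(sample_text)
page_one$total_pages
str(page_one$data)
```

In a real call, the text passed to parse_page() would be the text that content() returns for the GET response.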

Intercom's API, for example, has a scroll parameter. Your first API call returns its value as a character string that you can add to subsequent calls to get more data. The Stripe API returns a "has_more" field that works in conjunction with a "starting_after" parameter in your API call. Instead of a for loop, you can write a while loop:

while(get_request[2] == "TRUE"){

That will repeat your API call, passing the last item you received as "starting_after", until has_more is false. As long as you can adapt to the paging methodology of the API you'd like to use, you can use these techniques to access just about any API in R.
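
A fuller version of that while loop might look like the sketch below. It does not call Stripe or any other real API: fetch_after() is a hypothetical stand-in that serves seven made-up records three at a time and reports has_more. It uses integer ids for simplicity, whereas a real API would typically use string identifiers.

```r
# Hypothetical stand-in for a cursor-paged API call. It serves ids 1 to 7
# in chunks of 3 and reports whether more records remain.
fetch_after <- function(starting_after = 0, limit = 3) {
  all_ids <- 1:7
  remaining <- all_ids[all_ids > starting_after]
  ids <- head(remaining, limit)
  list(
    data = data.frame(id = ids),
    has_more = length(remaining) > limit
  )
}

results <- list()
cursor <- 0
has_more <- TRUE

while (has_more) {
  response <- fetch_after(starting_after = cursor)
  results[[length(results) + 1]] <- response$data
  # The next request starts after the last id received in this one
  cursor <- tail(response$data$id, 1)
  has_more <- isTRUE(response$has_more)
}

all_rows <- do.call(rbind, results)
all_rows$id
```

The loop condition reads a plain logical value instead of comparing against the string "TRUE", so it stops cleanly if has_more is ever missing from a response.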

Andrew Carpenter I'm an author, I/O Psychologist, and budding data analyst. When I'm not living in a fantasy world, I'm at Intrinio trying to redefine HR Analytics. Intrinio's mission is to make financial data affordable and accessible so investors can save money and make time to build something meaningful. You can view examples of books I've written here: https://www.amazon.com/Roll-Honey-Andrew-James-Carpenter/dp/1537174711?_encoding=UTF8&psc=1 https://www.amazon.com/Blood-Born-Saga-Book-ebook/dp/B00GM8WYIK/ref=sr_1_1?ie=UTF8&qid=1495128022&sr=8-1&keywords=the+blood+born+andrew+carpenter
 

Comments (3)

srinivas-kasiboyina

Sir, thank you very much. I have learned a lot about APIs in R from your blog post.

Thanks a lot.

Srinivas K

Andrew Carpenter

Srinivas,

Glad you liked it. I spent a lot of time learning how to do it the first time but now that I know how to make 1 API call, every other API is pretty easy. 

It makes me happy that other R users are learning from this. When I was learning, articles and posts were super helpful, so it's nice to give back.

Andrew

Saravana kumar

Hi,

I have web scraper code in R that pulls data from an API and feeds that data to MS SQL Server for further analysis.

Every time I run the code, it retrieves all of the data from the API and feeds it to the database, but I want to pull only newly added data from the API and feed that into the database to avoid excess API transactions.

Is there any way to do so inside the R code, or is there any add-in available?

Your answers would be much appreciated.

Cheers, Saravana, Email: sarajki333@gmail.com