Now, every time a blank value is encountered, the extractor displays the value Unknown in the State column, as shown in Figure 20, callout (2).
Figure 20: Setting a default value for a column avoids having rows of blank data.
Let's review the process for editing an extractor.
You selected an extractor and put it into edit mode using the Import.io Dashboard. You deleted all the columns from the extractor and added new ones for Title, Description, City, and State. You trained the extractor to identify values for each of those columns. You applied regular expressions to the values in the City and State columns so that each column shows only the data relevant to it. You also provided the default value, Unknown, for both the City and State columns.
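To make the regular expression and default value steps concrete, here is a minimal Python sketch of the same idea outside of Import.io. It assumes, purely for illustration, that the raw location text looks like "Austin, TX"; the patterns and the function names `extract_or_default` and `clean_row` are hypothetical and are not the expressions used in the extractor itself.

```python
import re

DEFAULT_VALUE = "Unknown"

# Hypothetical patterns: they assume a raw location such as "Austin, TX".
CITY_PATTERN = re.compile(r"^\s*([^,]+?)\s*,")
STATE_PATTERN = re.compile(r",\s*([A-Z]{2})\b")


def extract_or_default(pattern, raw_value, default=DEFAULT_VALUE):
    """Return the first capture group of pattern in raw_value, or the default."""
    if not raw_value:
        return default
    match = pattern.search(raw_value)
    if match is None:
        return default
    value = match.group(1).strip()
    return value or default


def clean_row(raw_location):
    """Split one raw location string into City and State column values."""
    return {
        "City": extract_or_default(CITY_PATTERN, raw_location),
        "State": extract_or_default(STATE_PATTERN, raw_location),
    }


if __name__ == "__main__":
    for raw in ["Austin, TX", "Remote", ""]:
        print(repr(raw), clean_row(raw))
```

A blank or unmatched value falls back to Unknown, which mirrors the effect of setting a default value on a column.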
Now you'll run the extractor and display the result in a Google spreadsheet.
Creating a Simple Report for Google Sheets Using Import.io
Import.io allows you to render an extractor's data in a variety of formats. One format is a directive you insert into a Google spreadsheet cell that will populate the sheet with the extractor's data in tabular format.
To get the directive, go to the Dashboard for your Import.io account. Select your extractor from the menu on the left side of the Dashboard web page. At the top of the Dashboard you'll see the Integrate button, as shown next in Figure 21 callout (1). Click Integrate to display the variety of rendering formats available for the selected extractor. Copy the directive from the Google Sheets text area, which is on the lower part of the Integrate page as shown in Figure 21 callout (2), and paste it into a cell of your Google spreadsheet.
Figure 21: Import.io allows you to display data in a number of ways (JSON, CSV, and in Google Sheets).
Figure 22: Import.io creates an IMPORTDATA directive that will insert an extractor's data as rows and columns in a Google Sheet.
Press the Enter key to invoke the pasted directive. The spreadsheet will populate with the data from your extractor in tabular format. (Please see Figure 23.)
Figure 23: The IMPORTDATA directive inserts data according to all the fields associated with the given extractor.
The IMPORTDATA directive constructed by Import.io will dump all columns associated with the related extractor, even the "hidden" ones you didn't define when you created the extractor. This is OK. If you want the Google Sheet to conform to your custom specification, you can simply hide the columns you don't need. Do not delete any columns. Just hide them. The internals of Import.io require that all columns imported using the extractor's Google Sheet directive be available in the spreadsheet; however, not all need to be visible. (See Figure 24.)
Figure 24: You hide unnecessary columns to get the Google Sheet to conform to your display specification.
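Google Sheets' IMPORTDATA function reads CSV or TSV data from a URL, so one way to picture what is happening is to treat the imported data as CSV text and choose which columns to display. The following Python sketch does that with made-up data. The extra column names (`url`, `resource_id`) and the sample rows are invented for illustration and are not the actual hidden columns Import.io emits.

```python
import csv
import io

# Hypothetical CSV text. The non-custom column names are invented for illustration.
SAMPLE_CSV = """Title,Description,City,State,url,resource_id
Developer,Builds web apps,Austin,TX,http://example.com/1,abc
Tester,Tests web apps,Unknown,Unknown,http://example.com/2,def
"""

# The columns you want to see; every other column stays in the data but is not shown.
VISIBLE_COLUMNS = ["Title", "Description", "City", "State"]


def visible_rows(csv_text, visible_columns):
    """Return each CSV row reduced to the columns chosen for display."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{name: row[name] for name in visible_columns} for row in reader]


if __name__ == "__main__":
    for row in visible_rows(SAMPLE_CSV, VISIBLE_COLUMNS):
        print(row)
```

As with hiding columns in the sheet, the source data is left whole; only the view is narrowed.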
Once you have your extractor's data in a Google Sheet, you can do things such as create charts (as we did earlier in the article, up at Figure 3).
Referring back to Figure 21, in addition to making extractor data available to Google Sheets, Import.io can emit your extractor's data in JSON format (the third of the four options displayed in Figure 21). Combining the JSON export with the Import.io API makes it easy to use extractor data in your own applications. Those of us who use APIs as the mainstay of our programming activity will really appreciate this.
Let's take a look.
Creating an Aggregation Report Using the Import.io API
Referring back to Figure 21, you can see that another option is the Live Query API. In other words, Import.io provides a REST API that allows you to do everything that's possible through the target web page's UI and more. For example, you can work with the Import.io API to aggregate information from many extractors (each connected to a different website) into a single feed, as shown next. The way you'll do this is to create a Job Listings website named Job-Track-O-Tron. The website will allow a user to view a list of extractors from which to choose. Then, once the user chooses an extractor, he or she can click to show the job details from a list of jobs that are particular to that extractor. A user can also view an aggregated list of the jobs from all the extractors. (Please see Figure 25.)
Figure 25: The sample application, Job-Track-O-Tron, gets job lists and job details using the Import.io API.
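The aggregation idea can be sketched in a few lines of Python. This is a minimal illustration and not the Job-Track-O-Tron code: it assumes each extractor's JSON output has already been retrieved as text and that each one is a simple JSON array of job objects. The real shape of Import.io's JSON may differ, and the extractor names and job data below are made up.

```python
import json

# Hypothetical JSON text per extractor; the real response structure may differ.
SAMPLE_RESPONSES = {
    "Job Site A": '[{"Title": "Developer", "City": "Austin", "State": "TX"}]',
    "Job Site B": (
        '[{"Title": "Analyst", "City": "Unknown", "State": "Unknown"},'
        ' {"Title": "Tester", "City": "Boston", "State": "MA"}]'
    ),
}


def aggregate_jobs(json_by_extractor):
    """Merge the job lists of several extractors into one feed sorted by title."""
    feed = []
    for extractor_name, json_text in json_by_extractor.items():
        for job in json.loads(json_text):
            # Tag each job with the extractor it came from so the origin is not lost.
            feed.append({**job, "Source": extractor_name})
    feed.sort(key=lambda job: job["Title"])
    return feed


if __name__ == "__main__":
    for job in aggregate_jobs(SAMPLE_RESPONSES):
        print(job)
```

Tagging each job with its source keeps the per-extractor view and the aggregated view consistent with each other.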
We've created a short, 4-minute video that describes concepts behind the architecture and components for the Proxy-Client application. You can view the video here.
Getting an API Key
In order to work with the Import.io API, you need to have an API Key. You're assigned an API Key when you create an account with Import.io. Your API Key is displayed in your account profile, as shown next in Figure 26.
Figure 26: Your Import.io API Key is displayed in your account profile.
Once you have an API key in hand, you have complete access to Import.io's REST API. You'll supply the API key every time you make a call to an Import.io endpoint.
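Because the key accompanies every request, it helps to attach it in one place. The Python sketch below shows one way to do that with a small helper. The helper name `with_api_key`, the placeholder host, and the query parameter name `_apikey` are assumptions made for illustration; check the Import.io API documentation for the actual endpoint URLs and parameter name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_api_key(endpoint_url, api_key, param_name="_apikey"):
    """Return endpoint_url with the API key set as a query parameter.

    The default parameter name is an assumption, not a confirmed API detail.
    """
    parts = urlsplit(endpoint_url)
    # Keep existing query parameters, dropping any earlier copy of the key.
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != param_name
    ]
    query.append((param_name, api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder endpoint and key, not a real Import.io URL.
    print(with_api_key("https://api.example.invalid/extractors?page=2", "YOUR_API_KEY"))
```

In a real application you would typically load the key from configuration or an environment variable and avoid hard-coding it in source files.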
Working with the API
All data used by the Job-Track-O-Tron website resides within Import.io. As mentioned previously, the purpose of the application is to list all the Job Site extractors you've created in Import.io and then to let you drill into the details of a site's job listings. Also, the sample application, Job-Track-O-Tron, will allow you to list the data in all of the extractors at once.
To gain a full understanding of how Job-Track-O-Tron works, you need to understand the parts that make up the entire application.