Extracting structured data from web pages, often called "web scraping", is a real need, particularly for people whose job is to prepare and analyze the information available on those pages. Data extraction tools such as Import.io are built to meet this need.
Recent Extraction Articles
Named-Entity Recognition involves identifying an entity in a text and assigning it a class label. These classifications can include People, Locations, and Organisations, among others depending on the tool. This comparison looks at the performance of 10 natural language processing APIs.
We've added twelve APIs to the ProgrammableWeb directory in categories such as Home Automation, Holidays, Internet of Things, Editing, and Air Travel. Featured today is an API from Ontotext S4 that can extract text and images from web pages. Here is a summary of the new additions.
Most Popular Extraction APIs (10)
| API | Description | Category | Date |
|---|---|---|---|
| Webhose.io | Webhose.io provides on-demand access to structured web data that anyone can consume. We empower you to build, launch, and scale big data operations - whether you’re a budding entrepreneur working out... | Search | 11.17.2014 |
| Import.io | Import.io is easy to use web extraction that puts the world’s data in your hands. The import.io API allows developers to access and integrate the functionality of import.io with other... | Data Mining | 03.19.2013 |
| Autodesk Forge Model Derivative | The Autodesk Forge Model Derivative API allows users to share designs in different formats, and to obtain metadata. It features STL and OBJ support, data extraction, and thumbnail creation. The Model... | Design | 06.14.2016 |
| Pinterest Mini | The Pinterest Mini API is an unofficial API that allows developers to extract the URLs of pins on any board or page on Pinterest. Pinterest is a social pinboard service where users can "pin... | Images | 10.30.2015 |
| Parsebot | The Parsebot API allows developers to extract content from a website. The API is useful to embed articles, images, video and audio directly into a developer's applications. Responses in JSON and... | Parsing | 09.18.2015 |
All Extraction APIs (126)
| API | Description | Category | Date |
|---|---|---|---|
| Plasticity Cortex Knowledge Graph | This Plasticity service provides a way to access 180+ million facts on 20+ million entities using the Cortex Knowledge Graph. It allows you to query the graph in natural language or make... | Natural Language Processing | 08.14.2017 |
| Plasticity Sapien Language Engine | This Plasticity Language API extracts entities, relationships, and context from text using the Sapien Language Engine. This natural language understanding API is used for text analytics or for... | Natural Language Processing | 08.14.2017 |
| RiteKit Company Logo | The RiteKit Company Logo API returns a logo based on a registered domain. The API can be used to customize interfaces based on email domains. RiteKit offers a free plan that limits calls to 1,000 a... | Social | 08.07.2017 |
| DigitalGlobe GBDX Authentication | The DigitalGlobe GBDX Authentication API allows you to get access to DigitalGlobe's GBDX platform. GBDX uses OAuth2 for authentication and authorization, and you will need your GBDX username,... | Big Data | 05.09.2017 |
| DigitalGlobe GBDX Web Application | The GBDX web application provides a way to find the tasks that meet your business needs, to define your area of interest, and to organize your work and track your progress. Google Chrome, Firefox,... | Big Data | 05.09.2017 |
Mashups (1)