How to Geocode User Input Postcodes

This is the third part of our series How to Add Postcode-Based Proximity Search With Open Data. In part 2, we explained three ways to geocode your data using Open Postcode Geo.

So far in this series we have seen how to geocode a dataset, which involves finding the position of each record and describing that position as coordinates. We used an example dataset of pubs, found the position of each pub, and used easting and northing as our coordinates.

Perhaps the most common reason to add location functions to mobile and web applications is to help the user find the nearest business: bank ATMs, gas stations, retail stores, pubs. But to “find the nearest” programmatically, we need to know: nearest to what, and to whom?

In this how-to article, I show how to implement a common application: finding the locations nearest to a postcode provided by a user.

In this example application, the user provides the application with a postcode, and we provide her with the nearest pubs to that postcode. (A more complex example would use GPS or other functions to determine the current location, but let’s keep this simple.)

For the sample code, we need to ask the user for a postcode, validate it, determine the format in which it was provided (e.g. “SW1A1AA” or “SW1A 1AA”), check that the postcode is geographical, and then fetch the easting and northing. Conveniently, the Open Postcode Geo API handles everything after the first step for us. Simply make a request to the API endpoint with the URL-encoded user input appended:

http://api.getthedata.com/postcode/[input]

Where [input] is the user input.

For example:

http://api.getthedata.com/postcode/SW1A+1AA
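
As a minimal sketch of that step, the helper below builds the request URL from raw user input. The function name build_postcode_url() is hypothetical, and trimming surrounding whitespace before encoding is an assumption of this sketch rather than something the API is known to require.

```php
<?php
// Hypothetical helper: builds the Open Postcode Geo request URL
// for a piece of raw user input.
function build_postcode_url(string $user_input): string
{
    // Endpoint as given in this article.
    $endpoint = 'http://api.getthedata.com/postcode/';

    // Assumption: strip leading/trailing whitespace the user may have typed.
    $cleaned = trim($user_input);

    // urlencode() turns spaces into "+" and escapes other unsafe characters.
    return $endpoint . urlencode($cleaned);
}

print build_postcode_url('SW1A 1AA') . "\n";
// prints http://api.getthedata.com/postcode/SW1A+1AA
```

Encoding the input matters because we cannot assume the user typed only letters and digits.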

The important fields for validation are status and match_type. A valid postcode should have status=match and match_type=unit_postcode.

The important fields for geocoding are of course easting and northing, but also positional_quality_indicator, which should be 1-6 or 8 if the postcode can be geocoded. You can read more about what these values mean in the documentation.

If a postcode cannot be geocoded, positional_quality_indicator is 9 and easting and northing are null.
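
The rules above can be sketched as a small classifier that takes a decoded response and reports which of the three cases we are in. The function name classify_response() and its return labels are hypothetical, and the sketch assumes positional_quality_indicator arrives as an integer once the JSON is decoded.

```php
<?php
// Hypothetical helper: classifies a decoded API response (associative array)
// using the validation rules described in this article.
function classify_response(array $response): string
{
    // Anything other than a unit postcode match is treated as invalid input.
    $status = $response['status'] ?? null;
    $match_type = $response['match_type'] ?? null;
    if ($status !== 'match' || $match_type !== 'unit_postcode') {
        return 'invalid';
    }

    // 1-6 or 8 means coordinates are available; 9 means the postcode
    // is valid but cannot be geocoded. A missing value is treated as 9.
    $pqi = $response['data']['positional_quality_indicator'] ?? 9;
    $geocodable = in_array($pqi, [1, 2, 3, 4, 5, 6, 8], true);

    return $geocodable ? 'geocodable' : 'not_geocodable';
}

// Cut-down arrays containing only the fields discussed above.
$with_position = [
    'status' => 'match',
    'match_type' => 'unit_postcode',
    'data' => ['easting' => 325066, 'northing' => 673533, 'positional_quality_indicator' => 1],
];
$without_position = [
    'status' => 'match',
    'match_type' => 'unit_postcode',
    'data' => ['easting' => null, 'northing' => null, 'positional_quality_indicator' => 9],
];
$no_match = ['status' => 'no_match'];

print classify_response($with_position) . "\n";    // prints geocodable
print classify_response($without_position) . "\n"; // prints not_geocodable
print classify_response($no_match) . "\n";         // prints invalid
```

Separating "invalid" from "not_geocodable" lets an application show the user a more helpful message than a single generic error.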

Here is an example API response for a valid postcode which can be geocoded:

{
 "status": "match",
 "match_type": "unit_postcode",
 "input": "EH1 2NG",
 "data": {
   "postcode": "EH1 2NG",
   "status": "live",
   "usertype": "small",
   "easting": 325066,
   "northing": 673533,
   "positional_quality_indicator": 1,
   "country": "Scotland",
   "latitude": "55.948965",
   "longitude": "-3.201478",
   "postcode_no_space": "EH12NG",
   "postcode_fixed_width_seven": "EH1 2NG",
   "postcode_fixed_width_eight": "EH1  2NG",
   "postcode_area": "EH",
   "postcode_district": "EH1",
   "postcode_sector": "EH1 2",
   "outcode": "EH1",
   "incode": "2NG"
 },
 "copyright": [
   "Contains OS data (c) Crown copyright and database right 2016",
   "Contains Royal Mail data (c) Royal Mail copyright and database right 2016",
   "Contains National Statistics data (c) Crown copyright and database right 2016"
 ]
}

Here is an example API response for a valid postcode which cannot be geocoded:

{
 "status": "match",
 "match_type": "unit_postcode",
 "input": "WV98 2AA",
 "data": {
   "postcode": "WV98 2AA",
   "status": "live",
   "usertype": "large",
   "easting": null,
   "northing": null,
   "positional_quality_indicator": 9,
   "country": "England",
   "latitude": null,
   "longitude": null,
   "postcode_no_space": "WV982AA",
   "postcode_fixed_width_seven": "WV982AA",
   "postcode_fixed_width_eight": "WV98 2AA",
   "postcode_area": "WV",
   "postcode_district": "WV98",
   "postcode_sector": "WV98 2",
   "outcode": "WV98",
   "incode": "2AA"
 },
 "copyright": [
   "Contains OS data (c) Crown copyright and database right 2016",
   "Contains Royal Mail data (c) Royal Mail copyright and database right 2016",
   "Contains National Statistics data (c) Crown copyright and database right 2016"
 ]
}

Here is an example API response for an invalid postcode:

{
 "status": "no_match",
 "error": "No matching postcode area, postcode district, postcode sector, or unit postcode found.",
 "input": "37188",
 "copyright": [
   "Contains OS data (c) Crown copyright and database right 2016",
   "Contains Royal Mail data (c) Royal Mail copyright and database right 2016",
   "Contains National Statistics data (c) Crown copyright and database right 2016"
 ]
}

We can adapt the code we used earlier in this article series to make a request to the API in order to geocode user input:

// $user_input is the input we got from the user.
// We are expecting (but not assuming) a valid postcode.
$user_input = 'SW1A1AA';

// This is the Open Postcode Geo API endpoint.
$endpoint = 'http://api.getthedata.com/postcode/';

// $user_input should be URL encoded.
$url = $endpoint . urlencode($user_input);

// This fetches the API response and loads it into $response
// as a string of JSON.
$response = file_get_contents($url) or die('Cannot fetch ' . $url);

// Decode the JSON into an associative array.
$arr = json_decode($response, true) or die('Cannot decode ' . $response);

// Check status, match_type, and positional_quality_indicator to
// ensure we have a suitable response for geocoding.
if($arr['status'] == 'match' and $arr['match_type'] == 'unit_postcode' and $arr['data']['positional_quality_indicator'] != 9){
   print 'user_input:' . $user_input . "\n";
   print 'postcode:' . $arr['data']['postcode'] . "\n";
   print 'easting:' . $arr['data']['easting'] . "\n";
   print 'northing:' . $arr['data']['northing'] . "\n";
   print 'positional_quality_indicator:' . $arr['data']['positional_quality_indicator'] . "\n";
}
else {
   die('Input "' . $user_input . '" does not match valid postcode or cannot be geocoded');
}

This code should produce the output:

user_input:SW1A1AA
postcode:SW1A 1AA
easting:529090
northing:179645
positional_quality_indicator:1

So we now have the coordinates of the search centre, and need to retrieve the nearest records from our dataset to that point.

Proximity search: the maths

If you just want to build geolocation applications, rather than understand them algorithmically, feel free to skip this section.

An easting and northing describe a point on a grid. The maths for finding the distance between two points on a flat grid is the Pythagorean theorem: the square of the hypotenuse is equal to the sum of the squares of the other two sides.

Let’s look at an example based on the diagram:

Source: https://www.getthedata.com/images/pythagoras.png

Point A has an easting of 2 and a northing of 3.

Point B has an easting of 8 and a northing of 7.

The distance from A to B is represented by line D3.

D3 forms the hypotenuse of a right-angled triangle whose other two sides are D1 and D2. Therefore the square of the length of D3 is the sum of the squares of the lengths of D1 and D2.

The length of D1 is the difference between the two eastings: 8 - 2 = 6.

The length of D2 is the difference between the two northings: 7 - 3 = 4.

Therefore the square of the length of D3 is 6² + 4² = 36 + 16 = 52.

Therefore the length of D3 (distance from A to B) is √52 ≈ 7.2111.
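
The same calculation can be sketched in PHP. The function name grid_distance() is hypothetical, and the sketch assumes both points use the same grid and the same units.

```php
<?php
// Hypothetical helper: straight-line distance between two grid points,
// each given as an easting and a northing in the same units.
function grid_distance(float $easting_a, float $northing_a, float $easting_b, float $northing_b): float
{
    $d1 = $easting_b - $easting_a;    // difference between the eastings
    $d2 = $northing_b - $northing_a;  // difference between the northings

    // Pythagoras: the hypotenuse is the square root of the sum of the squares.
    return sqrt($d1 ** 2 + $d2 ** 2);
}

// Points A (2, 3) and B (8, 7) from the worked example.
print round(grid_distance(2, 3, 8, 7), 4) . "\n";
// prints 7.2111
```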

We can write this in SQL as follows:

sqrt(pow(abs([easting_a] - [easting_b]),2) + pow(abs([northing_a] - [northing_b]),2))

Where [easting_a] and [northing_a] are one set of coordinates, and [easting_b] and [northing_b] are the other set.

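Note that it doesn’t matter which way round A and B’s coordinates are in the SQL. Squaring a difference always gives a non-negative result, so MySQL’s abs() function is not strictly required here, but taking the absolute value first makes the intent explicit.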
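
The order-independence is easy to check with a short PHP sketch using points A and B from the worked example. It also suggests that squaring on its own already discards the sign of each difference, so abs() mainly documents intent. The variable names are illustrative only.

```php
<?php
// Points A and B from the worked example.
$a = ['easting' => 2, 'northing' => 3];
$b = ['easting' => 8, 'northing' => 7];

// A to B, without abs().
$a_to_b = sqrt(pow($a['easting'] - $b['easting'], 2) + pow($a['northing'] - $b['northing'], 2));

// B to A, without abs().
$b_to_a = sqrt(pow($b['easting'] - $a['easting'], 2) + pow($b['northing'] - $a['northing'], 2));

// A to B, with abs(), mirroring the SQL expression.
$with_abs = sqrt(pow(abs($a['easting'] - $b['easting']), 2) + pow(abs($a['northing'] - $b['northing']), 2));

var_dump($a_to_b === $b_to_a);   // bool(true)
var_dump($a_to_b === $with_abs); // bool(true)
```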

In this article we learned how to take a postcode supplied by the user, submit it to the Open Postcode Geo API, and retrieve the easting and northing for that postcode. In other words, we geocoded the postcode supplied by the user. We also looked at the Pythagorean theorem, the maths which explains how to calculate the distance between two points.

In the next article we bring everything together into a functional application. We take a geocoded dataset and geocoded user input, and find the closest records in the dataset to the position of the user input.

This is part 3 of our series How to Add Postcode-Based Proximity Search With Open Data. In part 4, we will show how to take a geocoded dataset and a postcode, and order the dataset by proximity to the postcode.

Dan Winchester
