Joining Data with the Placekey QGIS Plugin

Placekey is a free address and point-of-interest (POI) matching tool that is now integrated with QGIS via the Placekey Connector Plugin. Placekey does the work of address and POI normalization, validation, and geocoding behind the scenes in order to generate a unique Placekey for every place in your dataset.

A Common Problem: Joining Data from Different Sources

Let’s assume you have data regarding some POIs:

  • State & Local Government Open Data (Address Points, etc.)
  • Business Reviews
  • Real Estate data (rent, property value, tenants, etc)
  • Environmental information

and you would like to join these features with information for the same assets from another source. The most obvious approaches are:

  • Joining by attribute (like a unique identifier or ID)
  • Spatial join: creating a buffer around the points in one dataset and joining intersecting features from the other

Both approaches have downsides. The first method may suffer due to spelling errors, different aliases referring to the same POI, or simply the absence of a common identifier. The second method can be quite difficult as assets might be closer to each other than expected. This would create either misleading joins or, worse still, a many-to-one relationship. So we are dealing with uncertainty here.

An example: note where city, lat, and lon differ between the two layers below: 

different cities, different coordinates, same place in St. Louis

In this example, the first method (joining on a common attribute) would have mismatched the Costco in Missouri because the cities are listed as ‘SAINT LOUIS’ and ‘Concord’, respectively. 
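To make this failure concrete, here is a minimal sketch of an attribute join in plain Python. The two records are made-up stand-ins for the layers above; only the city spellings 'SAINT LOUIS' and 'Concord' are taken from the example, and the helper `attribute_join` is a hypothetical name.

```python
# Hypothetical records standing in for the two layers; only the city
# values are taken from the example above.
layer_a = [{"shop": "Costco", "city": "SAINT LOUIS", "state": "MO"}]
layer_b = [{"shop": "Costco", "city": "Concord", "state": "MO"}]

def attribute_join(left, right, keys):
    """Inner join of two lists of dicts on the given keys (case-insensitive)."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k].lower() for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k].lower() for k in keys), []):
            joined.append((row, match))
    return joined

matches = attribute_join(layer_a, layer_b, keys=("shop", "city", "state"))
print(len(matches))  # 0: the same store is missed because the city differs
```

Dropping the city from the keys would find the match here, but it would also merge every Costco in the same state, which is the opposite error.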

The second method (spatial join) might have worked if the buffer was set at the correct size, but it could have failed if this dataset contained POIs which were closer together. For example, a small Starbucks adjacent to the large Costco might have been mismatched by a spatial join. Another example of where spatial joins can fail is for POIs contained within other geometries, such as stores within shopping malls and individual units within apartment buildings.
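The buffer problem can be sketched in a few lines as well. This is a simplified stand-in for a real spatial join (in QGIS you would use the processing tools instead), and all coordinates are hypothetical values chosen only so that the two candidates sit roughly 67 m and 87 m away.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def buffer_join(left, right, radius_m):
    """For each left point, list the names of right points within radius_m."""
    return {
        a["name"]: [b["name"] for b in right
                    if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= radius_m]
        for a in left
    }

# Hypothetical coordinates, chosen only for illustration.
layer_a = [{"name": "Costco (source A)", "lat": 38.5000, "lon": -90.3000}]
layer_b = [
    {"name": "Costco (source B)", "lat": 38.5006, "lon": -90.3000},
    {"name": "Starbucks (source B)", "lat": 38.5000, "lon": -90.3010},
]

small = buffer_join(layer_a, layer_b, radius_m=50)   # buffer too small: no match
large = buffer_join(layer_a, layer_b, radius_m=100)  # buffer too large: two candidates
print(small)
print(large)
```

No single radius is right for every POI, which is why the buffer size alone cannot resolve the ambiguity.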

Placekey

Placekey’s goal is to create a standard for identifying physical places. It works by assigning every POI a unique identifier that indicates the location through a “WHERE” part (an H3 hexagon) and the place itself through a “WHAT” part, which contains encodings of the address and the POI.

The WHERE and WHAT part of a placekey
A short explanation of the Placekey encoding
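As a small illustration of the two parts, the sketch below splits a Placekey string at the "@" sign. It assumes the textual layout "what@where", with keys that carry no WHAT part starting directly with "@"; the helper name `split_placekey` and the example value are illustrative and not taken from the sample data.

```python
def split_placekey(placekey):
    """Split a Placekey string into its (what, where) parts.

    Assumes the textual form 'what@where'; a key without a WHAT part
    is assumed to start directly with '@'.
    """
    what, sep, where = placekey.partition("@")
    if not sep or not where:
        raise ValueError(f"not a Placekey: {placekey!r}")
    return (what or None), where

# Illustrative values only.
print(split_placekey("227-223@5vg-82n-pgk"))  # ('227-223', '5vg-82n-pgk')
print(split_placekey("@5vg-82n-pgk"))         # (None, '5vg-82n-pgk')
```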

In practice, this means the following variations might occur:

  • Data records referring to the same POI using different names will get the same Placekey
  • Data records referring to the same POI using different addresses will get the same Placekey
  • Similar but distinct addresses will receive different Placekeys (doesn’t over-merge)
  • Different POIs at the same address will receive different Placekeys
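Once both datasets carry a Placekey, the join itself becomes a plain key lookup. The following is a minimal sketch with made-up records; the Placekey strings are placeholders rather than real lookups, and `join_on_placekey` is a hypothetical helper.

```python
# Hypothetical records; the Placekey strings are placeholders, not real lookups.
reviews = [
    {"placekey": "227-223@5vg-82n-pgk", "shop": "Costco", "rating": 4.5},
    {"placekey": "227-224@5vg-82n-pgk", "shop": "Starbucks", "rating": 4.0},
]
real_estate = [
    {"placekey": "227-223@5vg-82n-pgk", "name": "COSTCO WHOLESALE", "rent": 12000},
]

def join_on_placekey(left, right):
    """Left join: merge each left row with the right row sharing its Placekey."""
    by_key = {row["placekey"]: row for row in right}
    return [{**by_key.get(row["placekey"], {}), **row} for row in left]

joined = join_on_placekey(reviews, real_estate)
for row in joined:
    print(row["shop"], row.get("rent"))
```

The differing names ("Costco" vs. "COSTCO WHOLESALE") no longer matter, and the neighbouring shop stays unmatched because its key differs.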

The main advantages are:

  • common identifier across organizations
  • free to use, free to store
  • open specification

But unfortunately this system is currently only fully supported in the US.

Yet these advantages set Placekey apart from closed systems like what3words and Google Place ID.

Compared to Plus Codes, a Placekey encodes not only the spatial information but also the WHAT part.

The QGIS Python Plugin

I wrote the Placekey Connector plugin to facilitate the use of Placekey for joining disparate datasets. The plugin is a processing tool. Therefore we can combine it with different workflows inside QGIS. Along with the processing tool I have created a lookup dockwidget which offers you the option to query single placekeys directly on the map, as shown below:

plugin and processing tool of placekey
The processing Plugin to the right and the dockwidget at the bottom

In short: the tool takes the features of a layer, table, or file and sends the desired fields as payload to the bulk API endpoint, in batches of at most 100 features per request. Additionally, the plugin offers you the option to either copy the original feature and simply add the Placekey, or just return the feature ID and the resulting Placekey.
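The batching can be pictured with a short sketch. This is not the plugin's actual code, just one common way to cut a list into chunks of at most 100 items; `batches` is a hypothetical helper name.

```python
def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

features = list(range(250))  # stand-in for 250 layer features
sizes = [len(batch) for batch in batches(features)]
print(sizes)  # [100, 100, 50]
```

Each batch would then become the "queries" list of one bulk request.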

The main processing tool

Once done, the new layer is part of your layer list. You can store it and/or use it for future joins with other data providers later on.

If you want to use the plugin, simply get yourself an API key at placekey.io and save it to your QGIS installation. Once done, enjoy your data with added Placekeys. Please report any issues on the issue tracker. If you are interested in using Placekey in other “clients”, check the growing number of integrations such as R, Python, Snowflake, and Google Sheets.

If you’re still reading this and asking yourself… where is the magic?! Let me share this minimal example with you:

An example using raw Python

You can use the Placekey API quite easily: provide a placename and an idea of the location and you will get a placekey for this POI. The location information can be one of:

  • lat, lon
  • street_address, city, region, country
  • street_address, region, postal_code, country
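A minimal sketch of how a single query could be assembled and checked against these three options follows. The field names `latitude`, `longitude`, `postal_code`, and `iso_country_code` are the API's query keys as assumed here; `build_query` is a hypothetical helper and the coordinates in the usage line are made up.

```python
# Each set is one acceptable combination of location fields (assumed key names).
LOCATION_OPTIONS = [
    {"latitude", "longitude"},
    {"street_address", "city", "region", "iso_country_code"},
    {"street_address", "region", "postal_code", "iso_country_code"},
]

def build_query(location_name=None, **location):
    """Return a query dict, or raise if no complete location option is given."""
    query = {k: v for k, v in location.items() if v not in (None, "")}
    if not any(option <= set(query) for option in LOCATION_OPTIONS):
        raise ValueError("need lat/lon or a complete address")
    if location_name:
        query["location_name"] = location_name
    return query

# Hypothetical coordinates, for illustration only.
print(build_query("Costco", latitude=38.5, longitude=-90.3))
```

Validating before sending keeps incomplete rows out of the request instead of spending API calls on them.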

The use of the placekey.io API is straightforward. You can download the sample dataset here:

the CSV of costco locations
the sample data
import csv

# Read the sample CSV into a list of dicts (one dict per row).
with open("/Users/riccardoklinger/Desktop/costco_small.csv", newline="") as f:
    dict_list = list(csv.DictReader(f))

Once we have the list of dicts from our CSV, we can prepare the payload:

# Build one query per CSV row for the bulk endpoint.
payload = {"queries": []}
for item in dict_list:
    payload["queries"].append({
        "street_address": item["street"],
        "region": item["state"],
        "city": item["city"],
        "iso_country_code": "US",
        "location_name": item["shop"],
    })

As we have multiple features, we will use the bulk API endpoint to send all data to the API in a single request. The request looks as follows:

import json
import requests

apikey = "YOUR_API_KEY"  # place your API key here
url = "https://api.placekey.io/v1/placekeys"
headers = {
    "apikey": apikey,
    "Content-Type": "application/json",
}
# Serialize the payload dict to JSON and send everything in one bulk request.
response = requests.post(url, headers=headers, data=json.dumps(payload))
print(response.text)

After 1.5s we can see the result:

resulting placekeys
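To use the result for a join, the Placekeys have to be written back to the input rows. The sketch below assumes the response body is a JSON list whose items carry a "query_id" and either a "placekey" or an "error", and that queries sent without explicit IDs come back numbered by position ("0", "1", ...). The sample body is hand-written, not a real API response, and `attach_placekeys` is a hypothetical helper.

```python
import json

def attach_placekeys(rows, response_text):
    """Return copies of rows with 'placekey' and 'error' entries added.

    Assumption: each response item has a 'query_id' equal to the position
    of the query in the request, as a string.
    """
    results = {str(r.get("query_id")): r for r in json.loads(response_text)}
    merged = []
    for i, row in enumerate(rows):
        result = results.get(str(i), {})
        merged.append({**row,
                       "placekey": result.get("placekey"),
                       "error": result.get("error")})
    return merged

# Hand-written sample body, not a real API response.
sample = ('[{"query_id": "0", "placekey": "227-223@5vg-82n-pgk"},'
          ' {"query_id": "1", "error": "Invalid address"}]')
rows = [{"shop": "Costco"}, {"shop": "Unknown"}]
merged = attach_placekeys(rows, sample)
for row in merged:
    print(row)
```

With the real request, the call would be `attach_placekeys(dict_list, response.text)`; rows without a match keep `placekey` set to `None` so they can be reviewed by hand.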

If you want to try out this prototype script for yourself, you can download it here.
