Getting Started with OpenStreetMap Data

Dec 27, 2022

What is OpenStreetMap Data?

OpenStreetMap (OSM) data is a global, community-generated collection of map data. Everything from natural features like coastlines, rivers, and forests to human-made features like roads and buildings has been mapped in the data.

OpenStreetMap is used by a number of games and services that require map features, for example Pokemon Go, Strava, and my little creation, Roman Roads of Britain.

How to Get the OSM Data

There are a few options for downloading the OSM data depending on what you need it for:

  • OSM export tool - extract everything in a small area.
  • overpass-turbo - extract a subset of features, e.g. all roads, by querying the data.
    • This website is also very handy for quickly querying the data and visualising it. For example, the query waterway=river will visualise all rivers.
  • Geofabrik - extract everything in a large area, for example a country.
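
The waterway=river filter mentioned above can also be written out as a full Overpass QL query. The sketch below only builds the query text, which can then be pasted into overpass-turbo. The function name build_overpass_query and the bounding box values are hypothetical, chosen purely for illustration.

```python
def build_overpass_query(key, value, bbox):
    """Return an Overpass QL query for ways tagged key=value inside bbox.

    bbox is (south, west, north, east) in degrees, the order Overpass expects.
    """
    south, west, north, east = bbox
    return (
        "[out:json];"
        f'way["{key}"="{value}"]({south},{west},{north},{east});'
        "out geom;"
    )


# Hypothetical bounding box, used only as an example
query = build_overpass_query("waterway", "river", (51.4, -0.3, 51.6, 0.1))
print(query)
```

Keeping the bounding box small is a good idea, as a query over a large area can return a lot of data.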

Importing OSM Data into a Postgres (with PostGIS) Database

Importing the data into a database allows you to conveniently query for features of a particular type or from a certain area.

The tool osm2pgsql cleans up an OSM file and imports it into a Postgres database.

To import a file (e.g. british-isles.osm) into the Postgres database osmdb running locally, run the command below. The -W flag makes osm2pgsql prompt for the database password.

osm2pgsql -d osmdb -U postgres -W -H localhost -P 5432 british-isles.osm

Osm2pgsql creates several tables, including:

  • planet_osm_point: point features e.g. a bus stop
  • planet_osm_line: line features e.g. a river
  • planet_osm_polygon: area features e.g. the coastline of Great Britain

The geometry itself is stored in a column called way. Along with the geometry data, osm2pgsql imports metadata into columns such as name, place, etc.
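
As a rough sketch of how the tables and tag columns fit together, the helper below builds a parameterised query for one table and one tag column. The function name build_feature_query and the table whitelist are hypothetical, not part of osm2pgsql. The column name is double-quoted because some tag columns, such as natural, clash with SQL keywords.

```python
# Tables described above; used as a whitelist because table names
# cannot be passed as query parameters.
ALLOWED_TABLES = {"planet_osm_point", "planet_osm_line", "planet_osm_polygon"}


def build_feature_query(table, tag_column):
    """Return SQL selecting name and geometry where tag_column equals a parameter."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    if '"' in tag_column:
        raise ValueError("column name must not contain double quotes")
    # %s is the placeholder style used by psycopg2; the value is supplied
    # separately, e.g. cursor.execute(sql, ("river",))
    return f'select name, way from {table} where "{tag_column}" = %s'


sql = build_feature_query("planet_osm_line", "waterway")
print(sql)
```

Passing the tag value as a parameter rather than formatting it into the string avoids having to escape quotes by hand.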

Running Postgres with PostGIS using Docker Compose

It is possible to install and run Postgres with PostGIS locally; however, Docker provides a much easier way of achieving this. The docker-compose.yaml file below specifies an instance of Postgres with PostGIS.

version: "3.3"
services:
  db:
    image: postgis/postgis
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=osmdb
    ports:
      - "5432:5432"
    volumes:
      - ./postgres-data:/var/lib/postgresql/data

To start the database, run:

docker-compose up -d

How to Query the Data

Querying the OSM tables works like any other SQL query, for example select way from planet_osm_polygon where .... The complexity lies in interpreting the result, as the geometry isn’t a simple int or string. PostGIS returns geometry values (point, line, etc.) in a hex-encoded binary format called well-known binary (WKB). The shapely library provides the function shapely.wkb.loads(...) to parse WKB; pass hex=True for the hex-encoded form.
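
To make the hex-encoded format less mysterious, here is a minimal sketch that decodes a single 2D point by hand using only the standard library. It assumes the PostGIS extended form of WKB, where a flag in the type field signals that an SRID follows. The function name decode_wkb_point and the sample value are made up for illustration; in practice shapely handles all of this, including lines and polygons.

```python
import struct

SRID_FLAG = 0x20000000  # set in the type field when an SRID follows


def decode_wkb_point(hex_wkb):
    """Decode a hex-encoded WKB 2D point into (srid, x, y)."""
    data = bytes.fromhex(hex_wkb)
    # The first byte gives the byte order: 1 = little-endian, 0 = big-endian
    endian = "<" if data[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(endian + "I", data, 1)
    offset = 5
    srid = None
    if geom_type & SRID_FLAG:
        (srid,) = struct.unpack_from(endian + "I", data, offset)
        offset += 4
    # After removing the SRID flag, type 1 is a plain 2D point
    if geom_type & ~SRID_FLAG != 1:
        raise ValueError("only 2D points are handled in this sketch")
    x, y = struct.unpack_from(endian + "dd", data, offset)
    return srid, x, y


sample = (
    "01"                # byte order: little-endian
    "01000020"          # type 1 (point) with the SRID flag set
    "110F0000"          # SRID 3857
    "000000000000F03F"  # x = 1.0
    "0000000000000040"  # y = 2.0
)
print(decode_wkb_point(sample))
```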

Here is a full example of connecting to a Postgres database running locally, querying for all islands, parsing the geometry, and rendering it. The full example can be found here: https://github.com/minibuildsio/osm-python-example.

import psycopg2
import shapely.wkb
import matplotlib.pyplot as plt


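def extract_polygons(cursor, query):
  # The query must return the geometry as its first column
  cursor.execute(query)
  ways = cursor.fetchall()

  # Each row is a tuple; the geometry arrives as hex-encoded WKB
  return [shapely.wkb.loads(way[0], hex=True) for way in ways]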


def plot_polygons(polygons):
  for polygon in polygons:
    plt.plot(*polygon.exterior.xy)

  plt.show()


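# Connect to the local database defined in docker-compose.yaml
conn = psycopg2.connect(host="localhost", database="osmdb",
    user="postgres", password="postgres")

cur = conn.cursor()

# Select the geometry of every polygon tagged place=island
query = "select way from planet_osm_polygon where place = 'island'"
islands = extract_polygons(cur, query)
plot_polygons(islands)

# Release the database resources once the plot window is closed
cur.close()
conn.close()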

Notes

Coordinate Reference System (CRS)

osm2pgsql converts the OSM geometry to Web Mercator (EPSG:3857); however, sometimes you want WGS 84 (EPSG:4326), i.e. standard GPS lat/lng. PostGIS provides the function st_transform(geometry, crs) to convert a geometry to a different coordinate reference system (CRS), e.g. select st_transform(way, 4326) from planet_osm_polygon where ....
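
If the coordinates have already been pulled out of the database, the same conversion can be approximated in plain Python. The sketch below uses the standard spherical Mercator inverse formula; the function name web_mercator_to_wgs84 is hypothetical, and letting PostGIS do the transform remains the simpler option.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def web_mercator_to_wgs84(x, y):
    """Convert Web Mercator metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# The origin of Web Mercator sits at 0 degrees longitude on the equator
print(web_mercator_to_wgs84(0.0, 0.0))
```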

Osm2pgsql Configuration

Often you only need some of the information associated with an OSM entry. For example, if you are interested in extracting river waterways you’ll want waterway, natural, etc. but not horse bridleways. This can be configured by adding/removing lines from default.style and passing the edited file to osm2pgsql with the -S option.
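
Editing the style file by hand works fine, but the trimming can also be scripted. The sketch below assumes each entry in the style file has the layout "osm types, tag, data type, flags" separated by whitespace, with # starting a comment. The function name filter_style and the sample lines are illustrative and not copied from the real default.style.

```python
def filter_style(lines, keep_tags):
    """Keep comments, blank lines, and entries whose tag is in keep_tags."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        fields = stripped.split()
        # Assumed layout: <osm types> <tag> <data type> <flags>
        if len(fields) >= 2 and fields[1] in keep_tags:
            kept.append(line)
    return kept


# Illustrative entries only
sample_style = [
    "# OsmType  Tag       DataType  Flags",
    "node,way   waterway  text      polygon",
    "node,way   natural   text      polygon",
    "node,way   horse     text      linear",
]

for line in filter_style(sample_style, {"waterway", "natural"}):
    print(line)
```

It is worth checking the comments in the real default.style before removing entries, as some of them describe columns that are calculated during the import rather than read from tags.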

Tools for Merging Multiple OSM Files

Osm2pgsql doesn’t appear to provide a simple way to import multiple files into the database. Instead, the most convenient way of importing multiple files I’ve found is to merge them into a single file and import that.

Osmosis is a tool that merges a list of osm files into a single file. To merge the files great-britain-latest.osm, ireland-and-northern-ireland-latest.osm, and isle-of-man-latest.osm into a single file called british-isles.osm execute:

osmosis --rx great-britain-latest.osm --rx ireland-and-northern-ireland-latest.osm \
    --rx isle-of-man-latest.osm --merge --merge --wx british-isles.osm

Note: the double --merge is not a mistake; osmosis expects n-1 --merges when merging n files. You’ll get a very cryptic error message if you don’t provide the correct number of --merges.
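
Since counting the --merges by hand is easy to get wrong, the argument list can be generated instead. This is a small sketch that uses only the --rx, --merge, and --wx options shown above; the function name build_osmosis_merge_command is hypothetical.

```python
def build_osmosis_merge_command(input_files, output_file):
    """Build the osmosis argument list for merging input_files into output_file."""
    if len(input_files) < 2:
        raise ValueError("need at least two files to merge")
    command = ["osmosis"]
    for path in input_files:
        command += ["--rx", path]
    # n input files need n-1 --merge tasks
    command += ["--merge"] * (len(input_files) - 1)
    command += ["--wx", output_file]
    return command


command = build_osmosis_merge_command(
    [
        "great-britain-latest.osm",
        "ireland-and-northern-ireland-latest.osm",
        "isle-of-man-latest.osm",
    ],
    "british-isles.osm",
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) to run the merge, assuming osmosis is on the PATH.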

There are other tools, for example osmium, that do the same job, but I have not tried them.

Structure of the Data

Although I wouldn’t recommend working directly with the OSM data in its raw format, here are some notes about its structure. The OSM data is formatted as XML containing the following elements:

  • <node>: A point with coordinates (i.e. latitude and longitude).
  • <way>: A collection of nodes that represents a line-like feature, for example a road, a river, the outline of a field, etc. Ways can be open (the end and beginning are different) or closed (a loop).
  • <relation>: A collection of nodes, ways, and other relations that are related either geographically or logically, for example an island in a lake.
  • <tag>: A key-value pair that adds some information to the entry, e.g. feature type, name, etc. Nodes, ways and relations can have tags.
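
To show how these elements relate, here is a minimal sketch that parses a tiny hand-written OSM XML snippet with Python’s standard library. The sample data is made up for illustration. A way refers to its nodes through <nd ref="..."> children, so the node coordinates have to be looked up by id.

```python
import xml.etree.ElementTree as ET

# Made-up sample: two nodes joined by one way tagged as a river
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="51.5" lon="-0.1"/>
  <node id="2" lat="51.6" lon="-0.2"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="waterway" v="river"/>
    <tag k="name" v="Example River"/>
  </way>
</osm>"""

root = ET.fromstring(SAMPLE_OSM)

# Index node coordinates by id so ways can look them up
nodes = {
    node.get("id"): (float(node.get("lat")), float(node.get("lon")))
    for node in root.findall("node")
}

ways = []
for way in root.findall("way"):
    tags = {tag.get("k"): tag.get("v") for tag in way.findall("tag")}
    coords = [nodes[nd.get("ref")] for nd in way.findall("nd")]
    ways.append((tags, coords))

print(ways)
```

Real extracts are far too large to load this way, which is one reason importing them into a database is the more practical route.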