Getting Started with OpenStreetMap Data
What is OpenStreetMap Data?
OpenStreetMap (OSM) data is a global, community-generated collection of map data. Everything from natural features like coastlines, rivers, and forests to human-made features like roads and buildings has been mapped in the data.
OpenStreetMap is used by a number of games and services that require map features, for example Pokemon Go, Strava, and my little creation, Roman Roads of Britain.
How to Get the OSM Data
There are a few options for downloading the OSM data depending on what you need it for:
- OSM export tool - extract everything in a small area.
- overpass-turbo - extract a subset of features, e.g. all roads, by querying the data.
- This website is also very handy for quickly querying the data and visualising it. For example, the query waterway=river will visualise all rivers.
- Geofabrik - extract everything in a large area, for example a country.
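As a rough sketch of the overpass option, the same kind of query can be sent to the Overpass API from Python. The function names, the bounding box, and the choice of the public overpass-api.de endpoint are assumptions for illustration, not something the tools above prescribe.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass endpoint (assumption: it is reachable and you respect its usage limits).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_query(key, value, bbox):
    """Build an Overpass QL query for ways with key=value inside bbox.

    bbox is (south, west, north, east) in degrees.
    """
    south, west, north, east = bbox
    return (
        "[out:json][timeout:25];"
        f'way["{key}"="{value}"]({south},{west},{north},{east});'
        "out geom;"
    )


def run_overpass_query(query):
    """POST the query and return the list of matching elements (needs network access)."""
    data = urllib.parse.urlencode({"data": query}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=60) as resp:
        return json.load(resp)["elements"]


# Rivers in a small, hypothetical bounding box around central London.
query = build_overpass_query("waterway", "river", (51.4, -0.2, 51.6, 0.1))
print(query)
```

Calling run_overpass_query(query) should then return one dictionary per way, each carrying its tags and geometry.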
Importing OSM Data into a Postgres (with PostGIS) Database
Importing the data into a database allows you to conveniently query for features of a particular type or from a certain area.
The tool osm2pgsql cleans up and imports a provided osm file into a Postgres database.
To import a file (e.g. british-isles.osm) into the Postgres database osmdb running locally, run the command below.
osm2pgsql -d osmdb -U postgres -W -H localhost -P 5432 british-isles.osm
Osm2pgsql creates tables:
- planet_osm_point: point features e.g. a bus stop
- planet_osm_line: line features e.g. river waterway
- planet_osm_polygon: area features e.g. the coastline of Great Britain
The geometry itself is stored in a column called way. Along with the geometry data, osm2pgsql imports metadata into columns such as name, place, etc.
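To make the column layout concrete, here is a minimal sketch that selects the name and way columns for rivers from planet_osm_line. The helper names are hypothetical, and the connection details assume the local osmdb database with the postgres/postgres credentials used elsewhere in this post.

```python
# Select the metadata column (name) alongside the geometry column (way).
WATERWAY_QUERY = """
    select name, way
    from planet_osm_line
    where waterway = %s and name is not null
    limit %s
"""


def fetch_named_waterways(cursor, kind="river", limit=10):
    """Return (name, way) rows for waterways of the given kind."""
    cursor.execute(WATERWAY_QUERY, (kind, limit))
    return cursor.fetchall()


def main():
    # Imported here so the helper above can be reused without psycopg2 installed.
    import psycopg2

    conn = psycopg2.connect(host="localhost", dbname="osmdb",
                            user="postgres", password="postgres")
    try:
        with conn.cursor() as cur:
            for name, _way in fetch_named_waterways(cur):
                print(name)
    finally:
        conn.close()
```

Call main() once the import has finished. The way values come back hex-encoded rather than as readable coordinates.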
Running Postgres with PostGIS using Docker Compose
It is possible to install and run Postgres with PostGIS locally; however, Docker provides a much easier way of achieving this. The docker-compose.yaml file below specifies an instance of Postgres with PostGIS.
version: "3.3"
services:
  db:
    image: postgis/postgis
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=osmdb
    ports:
      - "5432:5432"
    volumes:
      - ./postgres-data:/var/lib/postgresql/data
To start the database run:
docker-compose up -d
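The container can take a few seconds to start accepting connections, so an import script may want to wait for the port first. The helper below is a hypothetical sketch using only the standard library. Note that an open port only shows something is listening, not that Postgres has finished initialising.

```python
import socket
import time


def wait_for_port(host="localhost", port=5432, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough for this check.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 5432)` can be called before running osm2pgsql or connecting with psycopg2.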
How to Query the Data
Querying the OSM tables is the same as any other SQL query, for example select way from planet_osm_polygon where ... The complexity is in interpreting the result, as the geometry isn’t a simple int or string. The PostGIS geometry types (point, line, etc.) are returned in a hex-encoded binary format called well-known binary (WKB). The shapely library provides the function shapely.wkb.loads(...) to load the WKB format.
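To show what is inside such a hex string, here is a small standard-library decoder for the simplest case, a point. It is for illustration only and shapely remains the practical choice. The function name is hypothetical. The SRID flag handled here comes from PostGIS's extended WKB (EWKB), which can carry the coordinate system ID after the geometry type.

```python
import struct


def decode_wkb_point(hex_wkb):
    """Decode a hex-encoded WKB/EWKB point into (srid, x, y)."""
    raw = bytes.fromhex(hex_wkb)
    # Byte 0 is the byte order: 1 = little endian, 0 = big endian.
    order = "<" if raw[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(order + "I", raw, 1)
    offset = 5
    srid = None
    # EWKB sets this flag when a 4-byte SRID follows the geometry type.
    if geom_type & 0x20000000:
        (srid,) = struct.unpack_from(order + "I", raw, offset)
        offset += 4
    if geom_type & 0xFF != 1:
        raise ValueError("only point geometries are handled in this sketch")
    x, y = struct.unpack_from(order + "dd", raw, offset)
    return srid, x, y


# POINT(1 2) in little-endian WKB: byte order, geometry type, x, y.
sample = "01" + "01000000" + "000000000000F03F" + "0000000000000040"
print(decode_wkb_point(sample))  # prints (None, 1.0, 2.0)
```

Lines and polygons follow the same pattern with point counts and rings added, which is the part shapely.wkb.loads takes care of.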
Here is a full example of connecting to a Postgres database running locally, querying for all islands, parsing the geometry, and rendering it. The full example can be found here: https://github.com/minibuildsio/osm-python-example.
import psycopg2
import shapely.wkb
import matplotlib.pyplot as plt


def extract_polygons(cursor, query):
    # Run the query and parse the hex-encoded WKB in the first column of each row.
    cursor.execute(query)
    ways = cursor.fetchall()
    return [shapely.wkb.loads(way[0], hex=True) for way in ways]


def plot_polygons(polygons):
    # Draw the outer ring of each polygon, then show the figure.
    for polygon in polygons:
        plt.plot(*polygon.exterior.xy)
    plt.show()


conn = psycopg2.connect(host="localhost", database="osmdb",
                        user="postgres", password="postgres")
cur = conn.cursor()

query = "select way from planet_osm_polygon where place = 'island'"
islands = extract_polygons(cur, query)
plot_polygons(islands)
Notes
Coordinate Reference System (CRS)
osm2pgsql converts the OSM geometry to Web Mercator (EPSG:3857); however, sometimes you want WGS 84 (EPSG:4326), i.e. standard GPS lat/lng. PostGIS provides the function st_transform(geometry, crs) to convert a geometry to a different coordinate reference system (CRS), e.g. select st_transform(way, 4326) from planet_osm_polygon where ...
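For a feel of what the 3857 to 4326 conversion does, the inverse spherical Mercator formula can be written in a few lines of Python. This is a sketch of the underlying maths for this one pair of systems, not a replacement for st_transform. The function and constant names are my own.

```python
import math

# Sphere radius used by Web Mercator (the WGS 84 semi-major axis, in metres).
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_wgs84(x, y):
    """Convert Web Mercator metres (EPSG:3857) to (lon, lat) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# The projection's origin maps to longitude 0, latitude 0 (up to rounding).
print(web_mercator_to_wgs84(0.0, 0.0))
```

Doing the conversion in the database is usually simpler, but this is handy for checking a coordinate or two by hand.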
Osm2pgsql Configuration
Often you only need some of the information associated with an OSM entry. For example, if you are interested in extracting river waterways you’ll want waterway, natural, etc. but not horse bridleways. This can be configured by adding/removing lines from default.style.
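Editing the style file by hand is fine for a few tags, but it can also be scripted. The sketch below assumes the whitespace-separated layout described in the comments of default.style (OsmType, Tag, DataType, Flags). The sample lines are illustrative rather than copied from the real file, and the function names are hypothetical.

```python
# Illustrative sample only; the real default.style is much longer.
SAMPLE_STYLE = """\
# OsmType  Tag        DataType  Flags
node,way   name       text      linear
node,way   waterway   text      polygon
node,way   natural    text      polygon
node,way   horse      text      linear
"""


def parse_style(text):
    """Return a list of (osm_types, tag, data_type, flags) tuples."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # incomplete entry; skipped in this sketch
        osm_types = fields[0].split(",")
        flags = fields[3].split(",") if len(fields) > 3 else []
        entries.append((osm_types, fields[1], fields[2], flags))
    return entries


def drop_tags(text, unwanted):
    """Return the style text with entries for the unwanted tag keys removed."""
    kept = []
    for raw_line in text.splitlines():
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[1] in unwanted:
            continue
        kept.append(raw_line)
    return "\n".join(kept) + "\n"


trimmed = drop_tags(SAMPLE_STYLE, {"horse"})
print([entry[1] for entry in parse_style(trimmed)])
```

The trimmed text can be written to a new file and passed to osm2pgsql with its --style option, leaving the original default.style untouched.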
Tools for Merging Multiple OSM Files
Osm2pgsql doesn’t appear to provide a simple way to import multiple files into the database. Instead, the most convenient way of importing multiple files I’ve found is to merge them into one file and import that.
Osmosis is a tool that merges a list of OSM files into a single file. To merge the files great-britain-latest.osm, ireland-and-northern-ireland-latest.osm, and isle-of-man-latest.osm into a single file called british-isles.osm, execute:
osmosis --rx great-britain-latest.osm --rx ireland-and-northern-ireland-latest.osm \
--rx isle-of-man-latest.osm --merge --merge --wx british-isles.osm
Note: the double --merge is not a mistake; osmosis expects n-1 --merge options when merging n files. You’ll get a very cryptic error message if you don’t provide the correct number of --merge options.
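Since the number of --merge options depends on the number of inputs, it can be easier to generate the command than to count by hand. The helper below is a hypothetical sketch that only builds the argument list, using the same flags as the command above. Running it is left to subprocess or a copy and paste.

```python
import shlex


def osmosis_merge_command(inputs, output):
    """Build an osmosis command that merges the input files into output."""
    if len(inputs) < 2:
        raise ValueError("need at least two files to merge")
    cmd = ["osmosis"]
    for path in inputs:
        cmd += ["--rx", path]
    # Merging n files needs n-1 --merge options.
    cmd += ["--merge"] * (len(inputs) - 1)
    cmd += ["--wx", output]
    return cmd


files = [
    "great-britain-latest.osm",
    "ireland-and-northern-ireland-latest.osm",
    "isle-of-man-latest.osm",
]
print(shlex.join(osmosis_merge_command(files, "british-isles.osm")))
```

The returned list can be passed straight to subprocess.run(cmd, check=True) if osmosis is on the PATH.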
There are other tools, for example osmium, which do the same job, but I have not tried them.
Structure of the Data
Although I wouldn’t recommend working directly with the OSM data in its raw format, here are some notes about its structure. The OSM data is formatted as XML containing the following tags:
- <node>: A point with coordinates (i.e. latitude and longitude).
- <way>: A collection of nodes that represent a line-like feature, for example a road, a river, the outline of a field, etc. Ways can be open (the end and beginning are different) or closed (a loop).
- <relation>: A collection of ways (and, more generally, nodes or other relations) that are related either geographically or logically, for example an island in a lake.
- <tag>: A key-value pair that adds some information to the entry, e.g. feature type, name, etc. Nodes, ways and relations can have tags.
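To see how these tags fit together, here is a minimal sketch that parses a tiny, made-up OSM XML fragment with the standard library. The sample data and function name are invented for illustration. Real extracts are far too large to load this way, which is part of why importing into a database is the more practical route.

```python
import xml.etree.ElementTree as ET

# Made-up sample: two nodes joined by one way that carries two tags.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="51.50" lon="-0.12"/>
  <node id="2" lat="51.51" lon="-0.11"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="waterway" v="river"/>
    <tag k="name" v="Example River"/>
  </way>
</osm>
"""


def summarise_ways(osm_xml):
    """Resolve each way's node references into coordinates and collect its tags."""
    root = ET.fromstring(osm_xml)
    nodes = {
        node.get("id"): (float(node.get("lat")), float(node.get("lon")))
        for node in root.iter("node")
    }
    ways = []
    for way in root.iter("way"):
        refs = [nd.get("ref") for nd in way.findall("nd")]
        ways.append({
            "id": way.get("id"),
            "coords": [nodes[ref] for ref in refs],
            "tags": {tag.get("k"): tag.get("v") for tag in way.findall("tag")},
            # A way is closed when it ends on the node it started from.
            "closed": len(refs) > 1 and refs[0] == refs[-1],
        })
    return ways


for way in summarise_ways(SAMPLE_OSM):
    print(way["id"], way["tags"], way["closed"])
```

Ways only store references to nodes, so the coordinates have to be looked up separately, as the nodes dictionary does here.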