Programster's Blog

Tutorials focusing on Linux, programming, and open-source

PostGIS - Importing Spatial Data

Once you have PostgreSQL set up with the PostGIS extension, you are probably keen to perform some spatial queries against some real world data. You could start building tools to create the geometry for all the countries in the world to get the data into your database, but it's much easier to just import data that other people have already gone through the effort of creating. This tutorial will take you through importing country geometry (shapefile) data from naturalearthdata.com. You can use these steps for importing shapefile data from any source.

This tutorial is based upon Step 4 in How to Install and Configure PostGIS on Ubuntu 14.04.

Steps

Before we begin, we must install some tools that we will need to make use of. We will be downloading a zip file which we will need to unzip, and we will be using GDAL to convert/import the data. GDAL is the Geospatial Data Abstraction Library](http://www.gdal.org/).

sudo apt-get install gdal-bin unzip

Now lets download some spatial data for importing. In this case we will be fetching data that provides information about countries.

mkdir $HOME/nedata
cd $HOME/nedata
wget http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip
unzip ne_110m_admin_0_countries.zip
rm ne_110m_admin_0_countries.zip

We will have extracted the following files:

  • ne_110m_admin_0_countries.README.html
  • ne_110m_admin_0_countries.VERSION.txt
  • ne_110m_admin_0_countries.cpg
  • ne_110m_admin_0_countries.dbf
  • ne_110m_admin_0_countries.prj
  • ne_110m_admin_0_countries.shp
  • ne_110m_admin_0_countries.shx

The .dbf, .prj, and .shp files make up a ShapeFile. We can load this into our PostgreSQL database.

Convert the Shapefile that you got from Natural Earth into a PostGIS table using ogr2ogr, like this (making sure to fill in your database settings appropriately):

DB_NAME=postgis_test
DB_PORT=5432
DB_USER=programster
DB_PASSWORD="thisismypassword"
DB_HOST=postgresql.programster.org

ogr2ogr \
  -f PostgreSQL \
  PG:"dbname=$DB_NAME host=$DB_HOST port=$DB_PORT user=$DB_USER password=$DB_PASSWORD" \
  -progress \
  -nlt PROMOTE_TO_MULTI \
  $HOME/nedata/ne_110m_admin_0_countries.shp
  • -f PostgreSQL tells the command that the output file should be a PostgreSQL table.
  • -nlt PROMOTE_TO_MULTI is used to tell the tool to promote Polygons to MultiPolygons. This is because both cannot be put into the same field, so we need to have the polygons "promoted" to multipoloygons, so they are all the same type which can.

Make sure the database0...10...20...30...40...50...60...70...80...90...100 - done. has already been created, and is UTF8, not LATIN1 (refer to cheatsheet under "create database").

If all went well, you should see the following output as it progresses.

0...10...20...30...40...50...60...70...80...90...100 - done.

Testing

We can now log into the database to have a play with the data.

psql \
  --user $DB_USER \
  --password \
  -d $DB_NAME \
  --host $DB_HOST

List the tables with:

\dt 
                    List of relations
 Schema |           Name            | Type  |    Owner    
--------+---------------------------+-------+-------------
 public | ne_110m_admin_0_countries | table | programster
 public | spatial_ref_sys           | table | postgres
(2 rows)

If you don't see the spatial_ref_sys table, then you didn't create the postgis extension which you need. You will need to drop the database, recreate it and create the postgis extension. Then redo the import.

List the columns in the ne_110_admin_0_countries table:

\d ne_110m_admin_0_countries
                                         Table "public.ne_110m_admin_0_countries"
    Column    |          Type          | Collation | Nullable |                          Default                           
--------------+------------------------+-----------+----------+------------------------------------------------------------
 ogc_fid      | integer                |           | not null | nextval('ne_110m_admin_0_countries_ogc_fid_seq'::regclass)
 wkb_geometry | bytea                  |           |          | 
 featurecla   | character varying(15)  |           |          | 
 scalerank    | numeric(1,0)           |           |          | 
 labelrank    | numeric(1,0)           |           |          | 
 sovereignt   | character varying(32)  |           |          | 
 sov_a3       | character varying(3)   |           |          | 
 adm0_dif     | numeric(1,0)           |           |          | 
 level        | numeric(1,0)           |           |          | 
 type         | character varying(17)  |           |          | 
 admin        | character varying(35)  |           |          | 
...

The field we are really interested in is the wkb_geometry field, because it is the one that is of type geometry(MultiPolygon,4326), thus it is the one with all of our geometry data.

Now we can perform some spatial queries. The following query will find the center point of all of the countries before returning the Y coordinate of the point as the latitude. It will then order by this DESC and limit to the first 10, resulting in us getting the 10 most northerly countries in the world.

SELECT admin, ST_Y(ST_Centroid(wkb_geometry)) as latitude 
FROM ne_110m_admin_0_countries 
ORDER BY latitude DESC 
LIMIT 10;

This should be your result:

   admin   |     latitude     
-----------+------------------
 Greenland | 74.7704876939899
 Norway    | 69.1568563097536
 Iceland   |  65.074276335291
 Finland   | 64.5040940396365
 Sweden    | 62.8114849680803
 Russia    | 61.9616634949229
 Canada    | 61.4690761453489
 Estonia   | 58.6436954266309
 Latvia    | 56.8071751342792
 Denmark   | 56.0639344617945