PostGIS
PostGIS is the geospatial extension to PostgreSQL, the free, open-source database. The PostGIS extension needs to be installed as part of the PostgreSQL installation. Instructions on how to do this can be found on the OS website.
Open ‘pgAdmin’ from the Windows desktop and, using the menu options available, create a new database and, within it, a new schema to hold the OS Open Greenspace data. It is recommended that the ‘public’ schema not be used to hold the data itself.
In the example above, a database called ‘osopendata’ has been created along with a schema called ‘open_greenspace’ into which the data will be loaded.
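For users who prefer to script this step rather than use the menus, the following is a minimal sketch of the SQL statements that would set up an equivalent database and schema. It assumes a PostgreSQL version that supports CREATE EXTENSION; the function name setup_statements is hypothetical, and the statements would need to be run by the user in the SQL window or another client.

```python
def setup_statements(database: str = "osopendata",
                     schema: str = "open_greenspace") -> list[str]:
    """Return SQL statements that mirror the manual set-up described above."""
    for name in (database, schema):
        # Very simple guard: only plain identifier-like names are accepted.
        if not name.isidentifier():
            raise ValueError(f"unsuitable name: {name!r}")
    return [
        # Run while connected to an existing database (e.g. 'postgres').
        f"CREATE DATABASE {database};",
        # Run the remaining statements while connected to the new database.
        "CREATE EXTENSION IF NOT EXISTS postgis;",
        # Keeps the greenspace tables out of the 'public' schema.
        f"CREATE SCHEMA IF NOT EXISTS {schema};",
    ]


for statement in setup_statements():
    print(statement)
```

The first statement has to be run separately from the other two, because the extension and schema must be created inside the new database.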
As the data to be loaded is supplied in shapefile format, it can be loaded using the easy-to-use PostGIS shapefile loader plugin available within pgAdmin.
Select ‘plugins’ from the main menu, followed by ‘PostGIS Shapefile and DBF Loader’.
The next window allows the user firstly to view connection details and then to add files to the database. The first thing to do will be to test connection details. Click on the ‘view connection details’ button.
The resulting box should contain the username and password already entered, along with the host name. The database being used to hold the data should already be selected. Click ‘OK’.
If everything is working OK, ‘connection succeeded’ should appear in the Log Window. Click the ‘Add File’ button.
In the next window that appears, use the file tree in the ‘Places’ box on the left-hand side to navigate to the folder containing the OS Open Greenspace shapefiles. A list of the files will appear in the main window. The user can load one, several or all of the files into the database. In the example above, all the shapefiles have been selected. Then click ‘Open’. If loading files from multiple 100 x 100km grid tiles, it is better to place the original shapefiles into a single folder first.
Another window will open listing the selected shapefiles. The Schema and SRID will need to be changed. The schema will need to be changed to the schema in the database into which the data is being loaded (in this case ‘open_greenspace’). The SRID (the spatial reference identifier, which specifies the co-ordinate reference system of the data) will need to be changed to 27700, which is the EPSG code for British National Grid. This will need to be done for every shapefile being loaded. No other element will need to be changed. Once this has been done, click ‘Import’.
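The same settings can also be expressed on the command line using the shp2pgsql tool that is distributed with PostGIS. The following is a minimal sketch that builds the argument list for one shapefile; the function name shp2pgsql_args and the file name TF_GreenspaceSite.shp are hypothetical, and the table name is assumed to be the lower-cased file name without its extension.

```python
from pathlib import Path


def shp2pgsql_args(shapefile: str,
                   schema: str = "open_greenspace",
                   srid: int = 27700,
                   encoding: str = "UTF-8") -> list[str]:
    """Build a shp2pgsql argument list matching the loader settings above."""
    # Assumption: the table is named after the shapefile, in lower case.
    table = Path(shapefile).stem.lower()
    return [
        "shp2pgsql",
        "-s", str(srid),      # SRID of the data (27700 = British National Grid)
        "-W", encoding,       # character encoding of the DBF attributes
        "-I",                 # create a spatial index on the geometry column
        shapefile,
        f"{schema}.{table}",  # target schema and table
    ]


# Hypothetical file name, for illustration only.
print(" ".join(shp2pgsql_args("TF_GreenspaceSite.shp")))
```

shp2pgsql writes SQL to standard output, so in practice its output would be piped into psql or saved to a file and run against the target database.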
At the end of the procedure, the log window at the bottom of the PostGIS import/export manager box should indicate that all the shapefiles have loaded successfully. However, one or two of the shapefiles (depending on the area of the country being loaded) may fail to load because the text encoding needs to be changed from UTF-8 to LATIN1. If this is the case, the user will need to close the plugin and start again, selecting just the shapefiles which failed to load. The schema and SRID must be changed again and, this time, the character encoding must be changed as well. This can be done by clicking the ‘options’ button.
Change the DBF character encoding to LATIN1 and click ‘OK’.
Changing this should allow the import to complete successfully. For information, the shapefiles most likely to need this change are those covering Wales or Scotland. This is because text in these areas may contain accented characters; when these are stored in the DBF file using the LATIN1 encoding, the resulting bytes are not valid UTF-8 and cannot be read with the default setting.
Once the import has been completed, the user can check that the data has loaded properly by refreshing the schema in pgAdmin and opening the ‘table’ tree. If the data has loaded correctly, there should be the same number of OS Open Greenspace data tables in the schema as the number of shapefiles opened.
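The reason for the failure can be illustrated with a short sketch. It assumes a Welsh place name, ‘Ynys Môn’, chosen purely as an example; the function name guess_dbf_encoding is hypothetical and the check is only a heuristic (text that decodes as UTF-8 is reported as UTF-8, anything else as LATIN1).

```python
def guess_dbf_encoding(raw: bytes) -> str:
    """Suggest which loader encoding option suits the given attribute bytes."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so fall back to LATIN1.
        return "LATIN1"
    return "UTF-8"


name = "Ynys Môn"                     # example text containing an accent
latin1_bytes = name.encode("latin-1")  # as it might be stored in a DBF file
utf8_bytes = name.encode("utf-8")

print(guess_dbf_encoding(latin1_bytes))
print(guess_dbf_encoding(utf8_bytes))
```

The accented character itself is perfectly representable in UTF-8; the import fails only when bytes written in one encoding are read as if they were in the other.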
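This check can be sketched as a simple comparison between the shapefiles that were opened and the tables now present in the schema. The function name missing_tables and the file names are hypothetical, and it is assumed that each table is named after its shapefile in lower case; the list of tables would be copied from the ‘table’ tree.

```python
from pathlib import Path


def missing_tables(shapefiles: list[str], tables: list[str]) -> list[str]:
    """Return the expected table names that are not present in the schema."""
    # Assumption: table name = shapefile name without extension, lower-cased.
    expected = {Path(name).stem.lower() for name in shapefiles}
    present = {table.lower() for table in tables}
    return sorted(expected - present)


# Hypothetical file and table names, for illustration only.
opened = ["TF_GreenspaceSite.shp", "TF_AccessPoint.shp"]
loaded = ["tf_greenspacesite"]
print(missing_tables(opened, loaded))
```

Any name returned identifies a shapefile that should be re-imported, for example with the character encoding changed as described above.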
The data is now loaded into the PostGIS database and is now ready to be viewed in a GIS application. As QGIS, the open-source GIS, has been developed to work seamlessly with PostGIS, we will open and view the data using that application. However, any GI application which includes support for PostGIS can be used.
In QGIS, click on the ‘open PostGIS layer’ button on the left-hand side of the window.
If the OS Open Greenspace data has been placed into an existing database, as in the example below, the user will simply need to open the connection to that database within QGIS. The open_greenspace schema should appear in the list of available schemas within that database.
If the database in which the OS Open Greenspace data sits is new, create a new database connection to the database by clicking the ‘new’ button. The following window appears and the information relating to the new database will need to be entered within the appropriate boxes:
Once the connection has been made, click on the + sign next to the schema to expand the list of tables. Select all the OS Open Greenspace tables that need to be loaded into QGIS.
Once all have been selected, click ‘Add’.
The OS Open Greenspace data will load into QGIS. The data will need to be ordered and then styled appropriately using personalised style files or the style files available from GitHub published by Ordnance Survey. If using these published files, please consult the accompanying ‘Quick Start Guide’. It should be noted that there is no need to add a spatial index to the data from PostGIS as those indexes were added automatically during the loading of the data into PostgreSQL.
It is possible to load multiple 100 x 100km grid tiles of data into the same schema in PostgreSQL. As the shapefiles have the 100km grid letters as a prefix in the filename, these files will go into separate tables in the schema. It will then be possible to view data across tile edges using QGIS or other GI applications which support PostGIS.
The screenshot above shows the access points and greenspace sites for two tiles, TG and TF, loaded into QGIS from the greenspace schema. It should be noted that duplicate features will exist across the tile edges, as the data is supplied as ‘hairy tiles’, as previously indicated.
As stated above, if multiple tiles of data are loaded into PostGIS as described, features that cross a tile edge will be duplicated in the separate tables for each tile, e.g. in the TF and TG tables. If the data is being used for contextual purposes only, this should not be an issue for the user.
However, if the data is being used for any kind of analysis involving counts of features, these duplicates will need to be removed to avoid providing skewed results.
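The effect of these duplicates on a count can be shown with a small sketch. It assumes that each feature carries an identifier attribute, here called id, which is the same in both tiles for a feature that crosses the tile edge; the function name dedupe_by_id and the sample values are hypothetical.

```python
def dedupe_by_id(features: list[dict], key: str = "id") -> list[dict]:
    """Keep the first occurrence of each feature, judged by its identifier."""
    seen = set()
    unique = []
    for feature in features:
        if feature[key] not in seen:
            seen.add(feature[key])
            unique.append(feature)
    return unique


# Made-up features: 'b2' crosses the edge and so appears in both tiles.
tf = [{"id": "a1", "tile": "TF"}, {"id": "b2", "tile": "TF"}]
tg = [{"id": "b2", "tile": "TG"}, {"id": "c3", "tile": "TG"}]
merged = tf + tg

print(len(merged), len(dedupe_by_id(merged)))
```

A simple count of the merged data overstates the number of features by one for every feature that crosses the tile edge.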
It is possible to remove these features using SQL commands in PostgreSQL itself.
Firstly, create a merged file containing the area required, using the merge shapefile feature in QGIS documented earlier. In this example, we are going to use the merged shapefile for TF and TG that was made previously and then load it into PostgreSQL using the shapefile loader plugin.
In the example above, two additional tables, merged_greenspace_sites and merged_access_points, have been added to the open_greenspace schema in PostgreSQL. Open the SQL window in PostgreSQL and type in the following command:
The command returns the following result:
In this example, the number of features detected is 4,473. The following command should now be typed into the SQL window:
The above command creates a new table called greenspacesites_dissolved in the schema with all the duplicate features removed. This can be verified by typing in the following command:
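The three steps above (count the merged features, create a de-duplicated table, count again) might be sketched as follows. This is an illustration under stated assumptions rather than the exact commands used for the example: it assumes that duplicates share the same value in a column called id, it uses the PostgreSQL DISTINCT ON clause to keep one row per identifier, and the function name dedupe_statements is hypothetical.

```python
def dedupe_statements(schema: str, source: str, target: str,
                      key: str = "id") -> list[str]:
    """Return SQL for counting, de-duplicating and re-counting a table."""
    return [
        # Step 1: number of features before de-duplication.
        f"SELECT COUNT(*) FROM {schema}.{source};",
        # Step 2: new table keeping one row per identifier (assumed column).
        f"CREATE TABLE {schema}.{target} AS "
        f"SELECT DISTINCT ON ({key}) * FROM {schema}.{source} ORDER BY {key};",
        # Step 3: number of features after de-duplication.
        f"SELECT COUNT(*) FROM {schema}.{target};",
    ]


statements = dedupe_statements(
    "open_greenspace", "merged_greenspace_sites", "greenspacesites_dissolved"
)
for statement in statements:
    print(statement)
```

If the second count is lower than the first, duplicates have been removed. A table created with CREATE TABLE ... AS does not inherit the indexes of the source table, so a spatial index would need to be added to the new table separately.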
An alternative way to do what has been described above would be to merge the required shapefiles together and de-duplicate them using QGIS, as described earlier in this document. The user will then have a set of de-duplicated shapefiles which can be loaded into PostgreSQL/PostGIS and displayed in QGIS using the methods described previously.
It is possible to load the GML supply data into PostgreSQL using sets of SQL commands, as there is no GUI PostGIS loader for GML data. These SQL commands would create the tables and indexes and load the data. As the data is also supplied in shapefile format, which can be loaded using the PostGIS shapefile loader plugin, the SQL method of loading the GML data is not described in this guide.