PostGIS
Last updated
Last updated
PostGIS is the geospatial extension to the free open-source database application PostgreSQL. The PostGIS extension needs to be installed as part of the PostgreSQL install. This guide uses pgAdmin as a platform for managing and querying PostgreSQL databases.
Open ‘PG Admin’ and, using the menu options available, create a new database and a new schema within the database to hold the OS MasterMap Greenspace Layer data. It is recommended that you do not use the ‘public’ schema to hold the data itself.
In the example above, a database called ‘osdata’ has been created along with a schema called ‘greenspace’ into which the data will be loaded.
As the data to be loaded comes in shapefile format, there is an easy to use PostGIS plugin available within PostgreSQL to load shapefile data.
Select ‘plugins’ from the main menu followed by ‘PostGIS Shapefile and DBF Loader’
The next window allows the user firstly to view connection details and then to add files to the database. The first thing to do will be to test connection details. Click on the ‘View connection details’ button.
The resulting box should contain the username and password already entered along with the host name. The database being used to contain the data should already be selected. Click ‘OK’.
If everything is working OK, ‘Connection succeeded’ should appear in the Log Window. Click the ‘Add File’ button.
In the next window, which appears, use the file tree in the ‘Places’ box on the left-hand side to navigate to the folder in which the OS MasterMap Greenspace Layer shapefiles data sits. A list of the files will appear in the main window. It is possible to load one or all of the files into the database. In the example above, all of the shapefiles have been selected. Then, click ‘Open’.
Another window will open listing the selected shapefiles. The Schema and SRID will need to be changed. The schema will need to be changed to the schema in the database into which the data is being loaded (in this case ‘greenspace’). The SRID (or co-ordinate reference system) will need to be changed to 27700, which is the code for British National Grid. This will need to be done for all the shapefiles being loaded. No other element will need to be changed. Once this has been done click ‘Import’.
At the end of the procedure, the log window at the bottom of the PostGIS import/export manager box should indicate that all the shapefiles have loaded successfully. However, one or two of the shapefiles (depending upon the area of the country being loaded) may fail to load because the text encoding needs to be changed from UTF-8 to LATIN1. If this is the case, the user will need to close the plugin and start again selecting just the shapefiles which failed to load previously. The schema and SRID must be changed again and this time, the character encoding will need to be changed. This can be done by clicking the ‘Options’ button.
Change the DBF character encoding to LATIN1 and click ‘OK.
Changing this should allow the import to complete successfully. For information, the shapefiles which are most likely to need this change to be made are either in Wales or Scotland. This is because files in these areas may contain text which has accents which are not part of the UTF-8 character set.
Once the import has been completed, the user can check if the data is loaded properly by refreshing the schema in PGAdmin and opening the ‘Table’ tree. If the data has loaded correctly, there should be the same number of OS MasterMap Greenspace Layer data tables in the schema as the number of shapefiles opened.
The data is now loaded into the PostGIS database and is now ready to be viewed in a GIS application. As QGIS, the open-source GIS, has been developed to work seamlessly with PostGIS, we will open and view the data using that application. However, any GI application which includes support for PostGIS can be used.
In QGIS, click on the ‘Open PostGIS layer’ button on the left-hand side of the window.
If the OS MasterMap Greenspace Layer data has been placed into an existing database, as in this case, the user will simply need to open the connection to that database within QGIS. The greenspace schema should appear in the list of available schemas within that database.
If the database in which the OS MasterMap Greenspace Layer data sits is new, create a new database connection by clicking the ‘New’ button. The following window appears and the information relating to the new database will need to be entered within the appropriate boxes:
Click on the + sign next to the schema to expand the list of tables. Select all the tables within OS MasterMap Greenspace Layer that need to be loaded to QGIS.
Once all have been selected, click ‘Add’.
The OS MasterMap Greenspace Layer data will load into QGIS. The data will need to be ordered and then styled appropriately using personalised style files or the style files available from GitHub published by Ordnance Survey. . It should be noted that there is no need to add a spatial index to the data from PostGIS as those indexes were added automatically during loading data into PostgreSQL.
It’s possible to load multiple 25 km² grid tiles of data into the same schema in PostgreSQL. As the shapefiles have the 5km grid letters as a prefix in the filename, these files will go into separate tables in the schema.
Data can then be viewed across tile edges using QGIS or other GI applications which support PostGIS.
The screenshot above shows the two tiles, TR15NE and TR15NW loaded into QGIS from the greenspace schema. It should be noted that duplicate features will exist across the tile edges as the data is supplied as ‘hairy tiles’ as previously indicated.
As stated in the point above, if using multiple tiles of data in PostGIS, loading them as described, some features will be replicated across tile edges loaded in different tables of the same features, e.g. in the case of TR15NW and TR15NE. If the data is being used for contextual purposes only, this should not be an issue for the user. However, if the data is being used for any kind of analysis involving counts of features, these duplicates will need to be removed to avoid providing inaccurate results.
It is possible to remove these features using SQL commands in PostgreSQL itself.
Firstly, create a merged file containing the area required using the merge shapefile feature in QGIS documented earlier. In this example, we are going to use the merged shapefile for TR15NW and TR15NE that was made previously and then load it into PostgreSQL using the shapefile loader plugin.
In the example above, an additional table, greenspace area, has been added to the greenspace schema in PostgreSQL. Open the SQL window in PostgreSQL and type in the following command;
The command returns the following result:
This shows that the number of features detected is 34 031, in this example. The following command should now be typed into the SQL window:
The above command creates a new table in the schema with all the duplicate features removed. This can be verified by typing in the following command:
It can be seen from running this query that the number of features in the newly created table is less than in the original merged table. This indicates that the duplicate features along the tile edges have been removed. It will now be possible to load the dissolved table into QGIS and carry out the required analysis.
An alternative way to do what has been described above would be to merge the required shapefiles together and de-duplicate using QGIS as described earlier in this document. The user will then have a set of de-duplicated shapefiles which can then be loaded into PostgreSQL/PostGIS and displayed in QGIS using the methods described previously.
It is possible to load the GML supply data into PostgreSQL using sets of SQL commands, as there is no GUI PostGIS loader for GML data. These SQL commands would create the tables, indexes and load the data. As this data is supplied in shapefile format which can be loaded using the PostGIS shapefile loader plugin, the SQL method of loading the GML data will not be described in this guide.