Instructions to install/load into a database application

Pre-Requisites to using the Linked ID data

It is highly recommended that the Linked ID data be loaded into a database application for use. It is not possible to illustrate within this guide the loading of these datasets into every database application available. Therefore, users should familiarise themselves with the operations of both loading the data and then indexing it so that performance can be improved.

For the purposes of this guide, an example will be shown illustrating the loading of the Road_TOID_Street_USRN Linked ID CSV file into a PostgreSQL (Open Source) database.

Preparation

In this example version 11 of PostgreSQL is being used. It is expected that a later release (version 12) should also behave in a similar manner. Pg_Admin 4 or any other free database manager can be used to carry out the procedures. Pg_Admin 4 is the database manager front-end application installed with PostgreSQL, and PostGIS for tables with geometry, will be used to illustrate the procedures.

Create Schema

It is assumed that the end user has already created a database within Postgres to hold other data. The other data held by the user will be held in different schemas. It is recommended that a new schema be created to hold the Linked ID data, especially if the user intends to use most or all the Linked IDs available.

In Pg_Admin right-click on Schemas under the database listing
Schemas listed under the database listing in the left hand side menu
Select ‘Create Schema’.
Create schema window showing schema name and owner
Give the schema a name in the Name box.
Click ‘Save’

The new schema now appears in the listing of schemas within the database.

New schema shown in the database explorer menu

Create Table

The next procedure will show you how to create a table within the schema to hold the Linked ID data for the individual Linked ID that the user is seeking to load. The table will need to have columns created corresponding to the fields within the Liked ID .CSV file. The user will either need to use an advanced text editor (e.g. UltraEdit) to view the CSV data, or refer to the technical specification document to see what fields are contained within the Linked ID data, and the what are the data types within the fields. In this example, we will define all the fields within the Linked ID data as text fields.

Using UltraEdit Text editor, which can open very large CSV files, we can view the column headings within the CSV file. The example below shows the opening of the Road_Toid_street_USRN_10.CSV file, which is the example Linked ID file we will be using to illustrate the loading procedure in this guide.
CSV file showing column headings displayed in the text editor
We have identified the column headers and can now use a script like the one below which will carry out the table creation operation. This includes both dropping any pre-existing table of the same name, and then creating a table to hold the data.

Please note this code will need to be modified to account for the variables in the user's own environment, but the column names and data types will remain the same.

DROP TABLE IF EXISTS linkedids.road_toid_street;
CREATE TABLE linkedids.road_toid_street (
correlation_id text, 
identifier_1 text, 
version_number_1 text, 
version_date_1 text, 
identifier_2 text, 
version_number_2 text, 
version_date_2 text, 
confidence text
);

In Pg_Admin, under ‘Tools’, select ‘Query Tool’ to open a SQL query window

Query tool selection in the Tools menu for opening a new SQL query window

Copy and paste the code from a Text file (recommended) into the SQL window.

Create table SQL code shown pasted into the SQL window

When satisfied, click the ‘run’ button to execute the query.

The query should now run, creating the table, and loading the data into the created table. A success message should appear when the query has run.

Loading the data

In Pg_Admin, under the LinkedIDs schema a new table should now be visible.

road_toid_street table now visible under the tables heading

In this example the new table is called road_toid_street. We will now use the import data function within Pg_Admin_4 to load the data into the created table.

Right-Click on the road_toid_street table just created and select ‘import/export’.

In the window that appears, select the import/export to 'import'.
Select the filename of the CSV file which is being imported.
Set the encoding to UTF-8.
Set the header box to ‘Yes’ because we know that the CSV file contains header information.
Select the delimiter to comma.
Click ‘OK when finished. The data will now load into the created table.

Viewing/Indexing the data

Right-clicking on the road_toid_street table and viewing the top 100 rows shows a result similar to that shown below. This indicates that the Linked ID data for the Road_Toid_street_USRN_10.CSV has been successfully loaded into the database.

Data output shown for road_toid_street table

The data should now be indexed, which can be done by adding an index to one of the fields in the data or by adding OIDs as unique IDs to the data. This can be achieved by typing in a code string like the one shown below into the query window and then running the query.

Alter table query table shown in the query editor window of PostgreSQL 11

Adding this new field will enable any joining and querying to be achieved much more quickly. Looking at the data table now, we can see that PostgreSQL has added a unique OID field to the data.

Other database applications will have similar methods for the creation and loading of data tables. The data is now ready for use and table-joins can now be made with the end user's own data held either within a database or outside the database in a GI application

PreviousLinked identifiers NextOS Open Linked Identifiers Technical Specification

Last updated 10 months ago

Was this helpful?