AddressBase

AddressBase provides a current view of all Royal Mail Postcode Address File (PAF) addresses that have been matched to the National Land and Property Gazetteer (NLPG) and One Scotland Gazetteer (OSG). The product provides Royal Mail attribution and enhances the PAF data with X and Y coordinates on the British National Grid and in the ETRS89 coordinate reference system. It also classifies each address to a primary level.

This product will provide you with a single view of an address, allow you to locate this address on a map to give you a geographic view and carry out primary analysis on the function of the address to determine, for example, residential from commercial properties.

This product is updated every six weeks.

Please see the AddressBase Getting Started Guide for instructions on how to load and work with AddressBase data; this is a composite guide for AddressBase, AddressBase Plus and AddressBase Plus Islands.

Precision identification

Identify a property and locate it on a map with precision – using the X and Y coordinates we’ve assigned to the Royal Mail Postcode Address File (PAF) data.

Understand the nation with PAF

Every address record provides Royal Mail address information from PAF, the Unique Delivery Point Reference Number (UDPRN) and X and Y coordinates.

Track customer data

Using the basic classifications in AddressBase you can quickly filter between residential and commercial addresses, which is ideal for marketing purposes.

Reduce marketing costs

Our basic classifications – residential or commercial – will reduce the costs and increase the effectiveness of your direct marketing.

Confidence in data

In customer services, the need for accuracy is paramount. Have confidence in your front-line staff’s ability to look up addresses on a database of millions, quickly and efficiently.

Access: Download
Data theme: Address
Data structure: Vector – Points
Coverage: Great Britain
Scale: 1:1 250 to 1:10 000
Format: CSV, GML 3.2.1
Ordering area: All of Great Britain or customisable area (5km² tiles or user-defined polygon)
OS Data Hub plan: Public Sector Plan, Premium Plan, Energy & Infrastructure Plan

House to flat conversions

To find out details such as houses that have been converted to flats, you'll need AddressBase Plus. It includes current properties and addresses sourced from local authorities, Ordnance Survey and Royal Mail. These are all matched to the UPRN and structured in a flat-file model.

AddressBase Plus has more records than AddressBase as it includes objects without postal addresses such as places of worship and community centres – as well as sub-divided properties. It lets you locate an address or property on a map, through the assigned X and Y coordinates.

Crucially, the cross-referencing information with OS MasterMap products via Topographic Identifiers (TOIDs) means you can view address data within a wider context.

Paying Royal Mail royalties as a commercial customer

Royal Mail royalties are included in the licence fee. A separate Royal Mail royalty fee applies if you license the AddressBase data on External Transaction Solution (ETS) terms.

File size

If the file size of your order is smaller than 2Gb, you can get it from our FTP server. In addition, public sector customers can download 5km chunk orders via our download service.

Data from Royal Mail’s PAF

Royal Mail's PAF database is a vital component of the single address gazetteer database. It is included in each of the AddressBase products wherever a PAF address has been matched to the corresponding Local Land and Property Gazetteer (LLPG) address.

How to get this product

Access to this product is free for Public Sector Geospatial Agreement (PSGA) Members. Find out if you are a PSGA Member or download a sample of AddressBase data by accessing the AddressBase product page of the OS website, which has links to all of the relevant resources. Alternatively, you can try out the full product by applying for a Data Exploration licence.

What's next?

To access additional documentation and resources relating to this product, please refer to the following:

New users should start with the Fundamentals pages to gain high-level insight into AddressBase products. The Getting Started Guide will help you to begin using product data in different software systems. The Technical Specification contains detailed technical insights.

AddressBase Getting Started Guide

This getting started guide provides instructions for using three AddressBase products in different software applications. Users with limited technical knowledge will be able to follow this guide.

These instructions show you how to get started with AddressBase, AddressBase Plus and AddressBase Plus Islands.

AddressBase products are created by bringing together different address sources:

Local Authority Gazetteers across Great Britain, Northern Ireland, the Channel Islands and the Isle of Man
Royal Mail PAF data
References to Valuation Office Agency (VOA) data
Additional addresses and coordinates from Ordnance Survey

The data is supplied as comma-separated values (CSV) or Geography Markup Language (GML).

This getting started guide shows you how to obtain a data supply, load and work with AddressBase data. It includes the following sections:

Prerequisites
Data supply
Working with CSV data
Working with GML data
Working with COU data
Creating a single-line or multi-line address
Searching for addresses

Prerequisites

System requirements

AddressBase data is an addressing gazetteer that can be used within GIS and database systems. For details of Ordnance Survey’s licensed partners, who can incorporate the AddressBase products in their systems, please see the systems/software page on the Ordnance Survey website.

Ordnance Survey does not recommend either suppliers or software products as the most appropriate system depends on many factors, such as the amount of data being taken, resources available within the organisation, the existing and planned information technology infrastructure and the applications that AddressBase products can be used for.

However, as a minimum, the following elements will be required in any system:

A means of reading the data, either in its native format or by translating it into another file format or loading it into a database.
A means of storing and distributing the data, perhaps in a database or through a web-based service.
A way of visualising and querying the data, typically a GIS.

Backup provision of the product

You are advised to copy the supplied data to a backup medium.

Typical data volumes

For reading purposes, it is recommended to store the data on a single hard disc. This allows your computer to read the data more quickly. Unzipped file sizes for the full supply of each product are as follows:

Product | Unzipped CSV file size | Unzipped GML file size
AddressBase | 6Gb | 32Gb
AddressBase Plus | 16Gb | 78Gb
AddressBase Plus Islands | 450Mb | 2Gb
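Before unzipping a full supply, it can be worth checking that the target drive has enough free space. The following is a minimal sketch that uses the sizes from the table above. The function name, the dictionary keys and the decimal gigabyte conversion are illustrative assumptions, and the zipped files themselves need additional space.

```python
import shutil

# Unzipped full-supply sizes from the table above, in gigabytes.
# The keys are illustrative labels, not official product codes.
UNZIPPED_SIZE_GB = {
    ("AddressBase", "csv"): 6,
    ("AddressBase", "gml"): 32,
    ("AddressBase Plus", "csv"): 16,
    ("AddressBase Plus", "gml"): 78,
    ("AddressBase Plus Islands", "csv"): 0.45,
    ("AddressBase Plus Islands", "gml"): 2,
}


def has_room_for(product, fmt, folder="."):
    """Return True if the drive holding `folder` has at least the tabulated free space."""
    needed_bytes = UNZIPPED_SIZE_GB[(product, fmt)] * 1000 ** 3
    return shutil.disk_usage(folder).free >= needed_bytes
```

For example, has_room_for("AddressBase Plus", "gml", "C:/AddressBase_Plus_Data") would check for 78Gb of free space on the C: drive.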

Data supply

DVD Supply of area of interest

When you receive an order via hard media (DVD), the following files will be supplied for the contracted area of interest (AOI):

Data
Doc
Order_Details.txt

Within the Data directory, data files will be found in their compressed format.

Within the Doc folder, a text file called Label Information.txt will contain information that is printed on the DVD.

The Order_Details text file will provide information about the order, including the order date, currency date and file structure.

DVD supply of Managed Great Britain Sets

When you receive an order of a Managed Great Britain Set (MGBS) via hard media (DVD), the following files will be supplied:

Data
Doc
Resources
readme.txt

There are several items contained within your supply:

Data folder – This folder contains all of your data supply.
Doc folder – This folder contains the Medialis.txt file, which outlines the contents of the data you have been supplied.
Resources folder – This folder contains lookup tables for the local custodian code and AddressBase classification scheme as well as the Header files for the product.
The readme text file – This document provides guidance notes on matters such as the filename referencing used and the directory structure of the DVD.

Secure File Transfer Protocol

With a Secure File Transfer Protocol (SFTP) order, the same folder structure is supplied as in DVD Supply of area of interest. The filenames will be slightly different, reflecting the SFTP order number, and the Doc folder will be empty.

Download

Public Sector Geospatial Agreement (PSGA) customers can download their geographic chunk data for AddressBase and AddressBase Plus as well as a full supply of AddressBase Plus Islands via our download service.

Download instructions

When you click Download data, you will be required to enter a password to access the PSGA Member’s Area. On successful entry to the download service, you will be able to view all your orders in the Member’s Area and download your data.
If you have ordered your data from our online portal, you will be sent an email with a link to a download page.
Within the PSGA Member’s Area, you can order and download the data that you require by clicking on Order Data, which can be found under the Map Data heading.
Once you have selected Order Data, you will be presented with the Order page. From here, you can manage all your orders, including those for AddressBase products.
When you have placed an order for a product, the data will become available as a series of zipped data files. To unzip these files, please refer to Unzipping the data.

Chunked files

The data is supplied as chunked files that cover your selected area. These files are named according to the convention shown below.

When you open your data, you will see a series of zip folders:

Non-geographic chunks

Using AddressBase Plus and Islands as an example:

AddressBasePlus_FULL_2020-01-21_001_csv.zip (Full supply of GB CSV)
AddressBasePlus_ISL_FULL_2020-01-21_001_csv.zip (Full supply of Islands CSV) or
AddressBasePlus_COU_2020-01-21_001_gml.zip (COU supply of GB GML)
AddressBasePlus_ISL_COU_2020-01-21_001_gml.zip (COU supply of Islands GML)

Geographic chunks

Using AddressBase Plus as an example:

AddressBasePlus_FULL_2011-07-29_TQ2020_csv.zip (Full supply of CSV) or
AddressBasePlus_COU_2011-07-29_TQ2020_gml.zip (COU supply of GML)

The AddressBase Plus Islands product is not available in geographic chunks.
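If you need to sort or filter a large number of chunk files, the filenames can be parsed programmatically. The sketch below is inferred only from the example filenames above and is not an official specification of the naming convention; the pattern, the function name and the returned keys are assumptions.

```python
import re
from datetime import date

# Pattern inferred from the example filenames above (an assumption).
CHUNK_NAME = re.compile(
    r"^(?P<product>[A-Za-z]+)"          # e.g. AddressBasePlus
    r"(?P<islands>_ISL)?"               # present for Islands supplies
    r"_(?P<supply>FULL|COU)"            # full supply or change-only update
    r"_(?P<date>\d{4}-\d{2}-\d{2})"     # date in the filename
    r"_(?P<chunk>\d{3}|[A-Z]{2}\d{4})"  # 001 (non-geographic) or TQ2020 (geographic)
    r"_(?P<format>csv|gml)\.zip$"
)


def parse_chunk_name(filename):
    """Split a chunk filename into its parts, or raise ValueError if it does not match."""
    match = CHUNK_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised chunk filename: {filename}")
    chunk = match.group("chunk")
    return {
        "product": match.group("product"),
        "islands": match.group("islands") is not None,
        "supply": match.group("supply"),
        "date": date.fromisoformat(match.group("date")),
        "chunk": chunk,
        "geographic": not chunk.isdigit(),
        "format": match.group("format"),
    }
```

Calling parse_chunk_name("AddressBasePlus_FULL_2011-07-29_TQ2020_csv.zip") returns a dictionary in which "geographic" is True and "chunk" is "TQ2020".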

Unzipping the data

The GML and CSV data is supplied in a compressed form (ZIP). Some software can access these files directly, while other software will require the files to be unzipped.

To unzip the zipped data files (.zip extension), use an unzipping utility found on most PCs, for example, WinZip. Alternatively, open-source zipping/unzipping software can be downloaded from the Internet, for example, 7-Zip.

When you unzip the files, the data will be extracted as CSV files, which are ready to use. For example, unzipping AddressBase Plus will extract files similar to the chunks below:

Non-geographic chunks

AddressBasePlus_FULL_2020-01-21_001.csv
AddressBasePlus_ISL_FULL_2020-01-21_001.csv

Geographic chunks

AddressBasePlus_2011-07-29_NC4040.csv
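As an alternative to a desktop unzipping utility, the supplied archives can be extracted with a short script. This is a minimal sketch using Python's standard zipfile module; the function name and folder arguments are illustrative.

```python
import zipfile
from pathlib import Path


def unzip_all(zip_folder, out_folder):
    """Extract every .zip file in zip_folder into out_folder and list the extracted names."""
    out = Path(out_folder)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(zip_folder).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out)
            extracted.extend(zf.namelist())
    return extracted
```

For example, unzip_all("C:/Downloads", "C:/AddressBase_Plus_Data") would place all extracted files in a single folder, ready for the preparation steps.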

Working with CSV data

Preparing the CSV data

These instructions describe how to prepare the CSV format of AddressBase, AddressBase Plus and AddressBase Plus Islands data for processing.

Downloading header files

AddressBase and AddressBase Plus contain different attributes. This means that there is a separate header file for each product. Download the file that matches your product using the links below. You will use this file in the Appending a header file to the CSV section.

Merging multiple CSV files

Unzip all the CSV files into a single folder. Ensure there are no spaces in your chosen folder path, for example: C:\AddressBase_Data or C:\AddressBase_Plus_Data or C:\AddressBase_Plus_Islands_Data.
We recommend merging all the CSV files together to save time importing individual files. You can do this manually using a text editor such as Notepad or TextPad, but it is much faster to use a .bat batch file or an MS-DOS command as described below.
To use the .bat batch function, copy the following text and paste it into a new Notepad document:
copy *.csv mergedABdata.csv
In this example, mergedABdata.csv is the output name of the merged file which will be created, but this can be any user-defined filename with the extension .csv.
Save the Notepad document with the file extension .bat (for example, mergedABdata.bat) in the same directory as the CSV files unzipped previously (for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data).
Close the .bat file and navigate to the directory where you just saved it. Double-click on the .bat file (for example, mergedABdata.bat) and an MS-DOS window will run. Once the process is complete, the MS-DOS screen will close automatically.
If you look in the directory containing the AddressBase CSV files and batch file, you’ll see that there is now an additional single file called mergedABdata.csv (or the user-defined filename you picked when creating your batch file).
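If you prefer a script to a batch file, the merge can also be sketched in Python. This is an illustrative alternative rather than part of the documented procedure; the function name is hypothetical. Like the batch command, it merges every .csv file in the folder, so the folder should contain only the unzipped data files at this point.

```python
import shutil
from pathlib import Path


def merge_csv_files(folder, output_name="mergedABdata.csv"):
    """Concatenate all .csv files in folder into one file and return its path."""
    folder = Path(folder)
    output = folder / output_name
    # Sort for a repeatable order and skip the result of a previous merge.
    parts = sorted(p for p in folder.glob("*.csv") if p.name != output_name)
    with open(output, "wb") as merged:
        for part in parts:
            with open(part, "rb") as source:
                shutil.copyfileobj(source, merged)
                # Make sure the next file starts on a new line.
                if source.tell() > 0:
                    source.seek(-1, 2)
                    if source.read(1) != b"\n":
                        merged.write(b"\n")
    return output
```

For example, merge_csv_files("C:/AddressBase_Data") creates C:/AddressBase_Data/mergedABdata.csv.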

Appending a header file to the CSV

Download and save the appropriate product CSV header file into the same folder as the merged AddressBase CSV file (for example, mergedABdata.csv) created in Merging multiple CSV files.
For AddressBase data, copy the following text and paste it into a new Notepad document:
copy addressbase-header.csv+ mergedABdata.csv AB_Data.csv
For AddressBase Plus and AddressBase Plus Islands data, copy the relevant line below and paste it into a new Notepad document:
copy addressbase-plus-header.csv+ mergedAB_Plusdata.csv AB_Plus_Data.csv
copy addressbase-plus-header.csv+ mergedAB_Plus_Islands_data.csv AB_Plus_Islands_Data.csv
These examples use the names mergedABdata.csv, mergedAB_Plusdata.csv and mergedAB_Plus_Islands_data.csv for the file that contains the data merged into a single CSV file above. If you have named this file something else, amend the text accordingly. The order in which the files are listed is also important, as it determines which file is appended to the other. Here, the header CSV file comes first so that the column headers form the first line of the final file, with the merged data appended after them.
Save the above Notepad document with the file extension .bat (for example, append.bat) in the same directory as the column headers and the merged AddressBase data (for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data or C:\AddressBase_Plus_Islands_Data).
Close the .bat file and navigate to the directory where it was saved to (for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data). Double-click on the new .bat file (for example, append.bat) and an MS-DOS window will open. Once the process is complete, the MS-DOS screen will close automatically.
Navigate to the directory where the column headers and the merged AddressBase data are located. You will see that a new CSV file has been created, which contains the column headers followed by the merged AddressBase data (for example, AB_Data.csv or AB_Plus_Data.csv).
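The same header step can be sketched in Python as an alternative to the batch file. The function name is hypothetical and the file names passed to it are whatever you chose in the steps above.

```python
import shutil


def prepend_header(header_path, data_path, output_path):
    """Write the header file followed by the merged data to output_path."""
    with open(output_path, "wb") as out:
        with open(header_path, "rb") as header:
            header_bytes = header.read()
        out.write(header_bytes)
        # The header must end its own line before the data starts.
        if header_bytes and not header_bytes.endswith(b"\n"):
            out.write(b"\n")
        with open(data_path, "rb") as data:
            shutil.copyfileobj(data, out)
```

For example, prepend_header("addressbase-header.csv", "mergedABdata.csv", "AB_Data.csv") mirrors the AddressBase batch command, with the header listed first for the same reason.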

Loading CSV into GIS software

These instructions describe how to load the CSV format of AddressBase, AddressBase Plus and AddressBase Plus Islands data. In these examples, AddressBase Plus data will be used to describe the procedures in various GI systems.

It is assumed that the AddressBase, AddressBase Plus or AddressBase Plus Islands CSV data has been prepared as instructed in Preparing the CSV data before you attempt to load it. If this has not been done, the full set of data will not load, and any data that does load will not contain header information.

AddressBase, AddressBase Plus and AddressBase Plus Islands are also available from Ordnance Survey as a supply in GML format. Loading GML into most GIS applications requires the use of third-party translation software, which is not covered in this guide. If you need more information on loading the GML format, please contact Ordnance Survey.

Loading CSV into ArcGIS Pro

Note - These instructions are based on ArcGIS Pro version 2.3.3.

Note - When using CSV data in ArcGIS Pro, it is necessary to have column headings. Please ensure that headings have already been prepared as instructed in Preparing the CSV data.

Launch ArcGIS Pro and start a new blank project.
Select a folder to save the project to.
Name your project and click OK. The project will then be created. Note - ArcGIS Pro automatically creates a new File Geodatabase (.gdb) within the project folder created. This is different to the creation process in the older ESRI application ArcMap.
You can add a backdrop map for contextual purposes from the available backdrop maps supplied by ESRI or add one of your own from a different File Geodatabase. In this example, we have added a light grey backdrop map canvas supplied by ESRI.
Open the Catalog pane on the right-hand side of the window and expand the listing to see the File Geodatabase created with the project.
To import the AddressBase or AddressBase Plus data, right-click the File Geodatabase, then select Import and from that sub-menu, select Table. A new Geoprocessing window will display in the right-hand pane.
Click the folder icon on the right-hand side of the Input Rows field. A new dialog will open.
Navigate to the location with the merged AddressBase or AddressBase Plus CSV file with the appended headers that you created in Preparing the CSV data. Select the file and click OK.
Back in the Geoprocessing window, type a name in the Output Name field, then click Run at the bottom of the window.
Once the process has run, a green box will display at the bottom of the Geoprocessing window and the new AddressBase table will be listed in the left-hand panel.
The data has loaded as a non-geometry table.
To make the data visible against the mapping backdrop, the XY Coordinate fields need to be specified.
- In the Contents pane, right-click the AB_Plus table (or the output name you chose) and in the dropdown click Display XY data.
In the Geoprocessing window, the XY Table To Point parameters will be displayed.
- Using the dropdown options, change the X Field to X_COORDINATE or Longitude and the Y Field to Y_COORDINATE or Latitude.
- Then select Projected Co-ordinate Systems > National Grids > Europe > British National Grid. Note – If you selected X and Y as Longitude and Latitude in the step above, then you need to select ETRS89 [EPSG: 4258] instead.
Click Run.
Once the process has run, a green box will appear at the bottom of the Geoprocessing window and the output XYTableToPoint map layer should appear ticked on the left-hand Contents pane. In the Map window, the addresses will now be displayed as point features.
You have now successfully loaded the data in ArcGIS Pro.
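The pairing of coordinate columns and coordinate reference system used in the steps above can be captured in a small helper, which may be useful if you script the load. This is a sketch only: the function and dictionary names are hypothetical, and the column names and EPSG codes are those quoted in this guide.

```python
# Column names and EPSG codes as quoted in this guide.
COORDINATE_OPTIONS = {
    "bng": {"x": "X_COORDINATE", "y": "Y_COORDINATE", "epsg": 27700},
    "etrs89": {"x": "LONGITUDE", "y": "LATITUDE", "epsg": 4258},
}


def point_settings(header, prefer="bng"):
    """Return (x_column, y_column, epsg) for the preferred coordinate pair.

    `header` is the list of column names in the CSV; matching ignores case.
    """
    option = COORDINATE_OPTIONS[prefer]
    by_upper = {name.upper(): name for name in header}
    missing = [c for c in (option["x"], option["y"]) if c not in by_upper]
    if missing:
        raise KeyError(f"columns not found in header: {missing}")
    return by_upper[option["x"]], by_upper[option["y"]], option["epsg"]
```

Keeping the X/Y columns and the EPSG code together in one place avoids the common mistake of plotting Longitude and Latitude values against British National Grid.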

Loading CSV into ArcGIS Desktop

Note - These instructions are based on ArcGIS Desktop versions 9.3 and 10.

Note - When using CSV data in ArcGIS, it is necessary to have column headings. Please ensure that headings have already been prepared as instructed in Preparing the CSV data.

Launch ArcCatalog as a separate program, or within ArcMap if you are using version 10.
Connect to a folder where the AddressBase data you wish to use can be accessed, for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data. To do this:
- Click File, or select Folder Connections if you are using version 10.
- Click Connect Folder, or in version 10, right-click on Folder connections > Connect Folder and navigate to the relevant folder.
- From the main window, select the folder to connect to and click OK.
The folder should now appear in the navigation window to the left of the screen, or within your Catalog window if you have opened it within ArcMap.
Create a File Geodatabase to store the address data. Using the file tree, go to folder connections and navigate to the directory where you wish to create the File Geodatabase, for example: C:\AddressBase_Geodatabase\AddressBase_Plus. This may need to be set up as a new connection as per the above.
Right-click on the folder where the File Geodatabase should be stored, then select New and File Geodatabase.
A File Geodatabase will be created and named by default as New File Geodatabase. Rename the File Geodatabase to a name of your choice.
Right-click on your new File Geodatabase, and select Import > Table (single)…
- For Input Rows, navigate to the location of the CSV data file that contains the merged header and AddressBase or AddressBase Plus data file.
- The Output Location should automatically populate with the location of the File Geodatabase that is to be updated; this should be the File Geodatabase you created above.
- Insert a relevant name for the Output Table, for example: AddressBase_data. Ensure that there are no spaces in the table name. This name will appear under your geodatabase.
Click OK.
To create a map of the locations of the AddressBase records, they need to be geocoded.
- Right-click on the AddressBase table in the geodatabase that you have just created and select Create Feature Class.
- In the XY Table… window, you can use the dropdowns to change the X Field to either X_COORDINATE or Longitude, and the Y Field to Y_COORDINATE or Latitude.
- Click on the Input Coordinates icon and navigate to Projected Co-ordinate Systems > National Grids > Europe > British National Grid. Note – If you selected X and Y as Longitude and Latitude in the step above, then you need to select ETRS89 [EPSG: 4258] instead.
Double-click on the chosen Coordinate System, then click Apply and OK.
Click on the folder icon alongside the Output field and navigate to the File Geodatabase you just created above. If you cannot see the File Geodatabase, ensure that the Save as type box at the bottom of the dialog box is set to File and Personal Geodatabase feature classes.
Type in a name for it and click Save.
Leave the Configuration keyword dropdown menu as DEFAULTS. Click OK. Note – You may need to right-click on the Personal Geodatabase where it was saved and select Refresh in order to see your points. At this stage, if you have completed the steps above in ArcCatalog and not within ArcMap, please continue to follow the steps below. Otherwise, if you have been using version 10 with the catalog inside ArcMap, the data can now be loaded into ArcMap.
In ArcMap, select File > Add Data and navigate to the folder where the File Geodatabase was created above.
Double-click on the File Geodatabase to open it, then select all the files inside.
Click Add.
Once the data has been loaded into ArcMap, you may wish to display more than the ESRI-defined Object ID in the Info tool. To change this:
- Double-click on the spatial dataset.
- Select the Fields tab.
- Change the Primary Display Field to your desired field, for example, UPRN.

Loading CSV into MapInfo Pro

Note - These instructions are based on MapInfo Pro version 12.

Note – MapInfo has a size limit of 2Gb on each table. This equates to a maximum number of approximately 4 million AddressBase records.

Note - When using CSV data in MapInfo, it is not a critical requirement to have column headings. However, for ease of use we recommend using the headings supplied by Ordnance Survey. Instructions on how to merge the data and append the header files can be found in Preparing the CSV data.
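Given the table size limit noted above, a prepared CSV file that exceeds it needs to be split into several smaller tables before loading. The following is a minimal sketch that splits a file with a header row into parts of at most 4 million records, repeating the header in each part. The function name, the output naming and the UTF-8 encoding are assumptions.

```python
import csv
from pathlib import Path


def split_csv(source, out_folder, max_rows=4_000_000, encoding="utf-8"):
    """Split a CSV with a header row into parts of at most max_rows records each."""
    out = Path(out_folder)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(source).stem
    written = []
    handle = None
    writer = None
    try:
        with open(source, newline="", encoding=encoding) as src:
            reader = csv.reader(src)
            header = next(reader, None)
            if header is None:
                return written  # empty input, nothing to split
            for i, row in enumerate(reader):
                if i % max_rows == 0:
                    # Start a new part and repeat the header in it.
                    if handle is not None:
                        handle.close()
                    path = out / f"{stem}_part{i // max_rows + 1:03d}.csv"
                    handle = open(path, "w", newline="", encoding=encoding)
                    writer = csv.writer(handle)
                    writer.writerow(header)
                    written.append(path)
                writer.writerow(row)
    finally:
        if handle is not None:
            handle.close()
    return written
```

Each part can then be opened in MapInfo as a separate table by following the steps below.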

Launch MapInfo.
Cancel the Quick Start prompt.
Click File > Open and navigate to the folder that contains the AddressBase data.
In the Files of Type dropdown menu, select Comma delimited CSV (*.csv), then click on the AddressBase data to be loaded. Click Open.
In the next window, tick the Use First Line for Column Titles box and select the character set INSERT CHARACTER SET. Click OK. Note – When adding data this way, the field type classifications and field sizes of each column automatically try to fit the type of data that MapInfo believes is contained within the column and the largest value of that classification found within that column. This means that the classifications and field sizes of some attributes may not match the field types and sizes stated in the Technical Specification. The following instructions outline how to change these columns to match those values:
Select File > Save Copy As… and select the AddressBase table that was loaded. Select Save As… and name the table to be created, then click Save.
Open the table that was just created via File > Open. Navigate to and select the copy of the table you just named. Click Open.
Navigate to Table > Maintenance > Table Structure and select the table to be edited. Click OK.
Here you can change the Type and Width of each attribute to match the ones stated in the technical specification:
Type and Width should be changed for all attributes, apart from the following (due to software-specific dependencies):
- UPRN should be classified as Float.
- All attributes that have a Field Type of Date in the technical specification should be classified as Character with a length of 10.
After all changes have been made, click OK.
To create a map of the location of the AddressBase records, they need to be geocoded:
- Ensure the table of AddressBase records that you wish to geocode is open, then navigate to Table > Create Points.
- Select the table you wish to geocode from the Create Points for Table dropdown menu.
- Expand the Get X Coordinates from Column dropdown menu and select either X_Coordinate or Longitude.
- Expand the Get Y Coordinates from Column dropdown menu and select either Y_Coordinate or Latitude.
- Click on the Projection icon, then select the British Coordinate Systems option from the Category dropdown menu. Select the British National Grid [EPSG: 27700], or if you selected Longitude and Latitude in the steps above, select ETRS89 [EPSG: 4258].
- Click OK to close that window and OK again to close the next window.
- Finally, click Window > New Map Window to view the loaded geocoded points.

Loading CSV into QGIS

Note - These instructions are based on QGIS version 2.6.

Launch QGIS and click Settings > Options.
Select CRS from the left-hand menu and check that the Coordinate Reference System is set to British National Grid. Note - Check this is set for both Default CRS for new projects and the CRS for new layers sections. If these are not already set, click Select at the end of each section and type 27700 into the Filter Box to find and select British National Grid. Alternatively, if you intend to use Latitude and Longitude columns, select ETRS89 [EPSG: 4258].
Click OK.
Back in the QGIS UI, go to Layer and select Add Delimited Text Layer.
Click Browse next to the filename and locate the CSV file that was created in Preparing the CSV data, containing the merged header files and AddressBase data.
Select the CSV file and click Open.
Accept the default or create a new layer name for the dataset.
Ensure that the First record has field names box is ticked.
For Field Options, select Decimal separator is comma.
For Geometry Definition, select Point Coordinates.
You should now be able to select the X_Coordinate field for the X Field dropdown and the Y_Coordinate field for the Y Field dropdown if this was not done automatically. Alternatively, if you wish to use the Latitude and Longitude columns, select the Longitude column for the X Field and the Latitude column for the Y Field.
Click OK.

Loading CSV into a database

This section describes how to load AddressBase products into a few common databases.

Software dependencies:

ArcMap, ArcGIS Desktop and ArcGIS Server software do not support the BIGINT/NUMBER data type as an Object ID. Bear this in mind if you expect to use this data type directly with these ESRI products. An alternative that works with ESRI software is to store this data as a string and add a new Serial ID to act as the Object ID.

If you are loading AddressBase data directly into a database, you may need to increase the column length to accommodate language characters such as '^'. Some databases count such a character as more than one character, so if you define the column length exactly according to our specification, the load may fail. Such adjustments may be required depending on the database you use to load the data.
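To judge whether column lengths need increasing, it can help to measure the longest value in each column both in characters and in encoded bytes, since some databases size text columns in bytes. The sketch below assumes a prepared CSV file with a header row and UTF-8 encoding; the function name is hypothetical.

```python
import csv


def max_field_lengths(csv_path, encoding="utf-8"):
    """Return {column: (max_characters, max_bytes)} for a CSV with a header row."""
    with open(csv_path, newline="", encoding=encoding) as f:
        reader = csv.DictReader(f)
        lengths = {name: (0, 0) for name in reader.fieldnames or []}
        for row in reader:
            for name in lengths:
                value = row[name] or ""  # short rows yield None for missing fields
                chars, size = lengths[name]
                lengths[name] = (
                    max(chars, len(value)),
                    max(size, len(value.encode(encoding))),
                )
    return lengths
```

Where the two numbers differ for a column, that column contains characters that occupy more than one byte, and a length defined in bytes may need to be larger than the specified character length.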

UPRN deletions:

If a UPRN is deleted and then reinserted, this does not compromise the integrity of the UPRN and its use as a primary key. If a delete is issued for a UPRN, this does not mean it will not reappear in subsequent supplies.

These are the reasons why this may happen:

The record has changed location more than once: it moved out of your Area of Interest (AOI), which triggered the deletion, and later moved back into your AOI. This would also occur if you altered your AOI.
A record has failed data validation upon a change being made. This can result, dependent on the change being made, in the record being deleted and then reintroduced when the error is fixed by the data supplier.

If a UPRN is deleted, it will not be reallocated to a different property and it therefore remains the unique identifier for a property.
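The effect on a table keyed by UPRN can be illustrated with a small sketch. It assumes that each change record carries a change type of I (insert), U (update) or D (delete); these codes, the function name and the sample values are assumptions made for illustration and are not defined in this section.

```python
def apply_changes(table, changes):
    """Apply (change_type, uprn, record) tuples to a dict keyed by UPRN."""
    for change_type, uprn, record in changes:
        if change_type == "D":
            # A delete removes the row but does not retire the key.
            table.pop(uprn, None)
        elif change_type in ("I", "U"):
            # A later insert for the same UPRN simply restores the row.
            table[uprn] = record
        else:
            raise ValueError(f"unknown change type: {change_type}")
    return table
```

Because a deleted UPRN is never reallocated to a different property, a reinserted row always refers to the same property, so the UPRN remains safe to use as a primary key.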

Loading CSV into a PostgreSQL database

Note - These steps describe how to load AddressBase into a PostgreSQL database using the text files created by following the instructions in Preparing the CSV data to merge the CSV files.

Note - These instructions are based on PostgreSQL version 1.12.3 and assume that you have set up your database with the PostGIS spatial extension. It is recommended that you have a basic understanding of database terminology before following this guide.

Prepare the text files as described in Preparing the CSV data.
Check that there are no carriage returns (extra rows) at the end of the CSV output file as this will result in errors. To do this, open the CSV file and hit End on your keyboard. Your cursor should now be at the end of the last line, and not on any extra line below. If it is on the line below, hit Delete to remove the extra empty row.
Open the pgAdmin tool (this can be found in Windows Start Menu > PostgreSQL).
Either connect to an existing database or create a new database. It is recommended that the encoding is set to UTF-8.
Open the public schema (although in a production environment, it is advised to use a different schema) and create the tables using the following steps:
- Open the SQL query tool.
- Depending on the data to be loaded, download the SQL file from either the AddressBase or AddressBase_Plus_and_Island folder on: https://github.com/OrdnanceSurvey/AddressBase/tree/master/Loading_Scripts/PostgreSQL.
- This SQL file can be opened in a text editor, and the SQL scripts within it can be copied and pasted into the SQL query tool within PostgreSQL.
Once the tables have been created, the data can be loaded into each table using the SQL COPY command. Add the CSV HEADER option because the first line of each file contains a header record. The examples below are for AddressBase, AddressBase Plus and AddressBase Plus Islands, respectively. The path and filename may need to be changed to reflect your data set-up:
COPY addressbase FROM 'C:/Address/AddressBase.csv' DELIMITER ',' CSV HEADER;
COPY addressbase_plus FROM 'C:/Address/AddressBase_Plus.csv' DELIMITER ',' CSV HEADER;
COPY addressbase_plus_islands FROM 'C:/Address/AddressBase_Plus_Islands.csv' DELIMITER ',' CSV HEADER;
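For very large files, the check for extra rows at the end of the CSV file described in the steps above is easier to script than to do in a text editor. The following is a minimal sketch that removes any trailing carriage returns and line feeds in place, leaving the file ending at the last character of the last record. The function name is hypothetical; work on a copy of the file if in doubt.

```python
import os


def strip_trailing_blank_lines(path):
    """Remove trailing CR/LF bytes from a file in place and return how many were removed."""
    with open(path, "rb+") as f:
        size = f.seek(0, os.SEEK_END)
        end = size
        # Walk backwards while the last byte is a line terminator.
        while end > 0:
            f.seek(end - 1)
            if f.read(1) not in (b"\r", b"\n"):
                break
            end -= 1
        f.truncate(end)
    return size - end
```

A return value of 0 means the file already ended at the last record.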

Once loaded, you may want to add Primary Keys to the data. However, these can only be added on columns where the data values are unique. Where there are no unique data values, an index may be added which will aid searching. The UPRN provides the only unique value in AddressBase and AddressBase Plus. Primary Keys are added using the following steps:

Right-click on the table name and select Properties.
Select the Constraints tab.
Click the + to add a new primary key.
Enter a name to call the key under the general tab (for example, Key1).
Under the definition tab, select UPRN or any other unique value from the dropdown under columns.

Click Save.

You can also index the data by following these steps:

Right-click on the table name and select Create > Index.
Under general, enter a name (for example, Idx1).
Under the definition tab > Columns, click the +.
Select a column to index, for example the UPRN or any other unique value.
Click Save.

Converting coordinates to geometry

A PostGIS extension is required to create geometries. The AddressBase products contain both British National Grid (BNG) and ETRS89 coordinates. The SQL below shows how to create a column for BNG, but it can be altered to utilise the ETRS89 data.

Add a geometry column called geom to make the data usable in a GIS:
SELECT AddGeometryColumn ('public', 'addressbase_plus', 'geom', 27700, 'POINT', 2);
Load the data into your new geometry column:
UPDATE public.addressbase_plus SET geom = ST_GeomFromText('POINT(' || x_coordinate || ' ' || y_coordinate || ')', 27700);
This sets the geom column in the table to equal the values from the X_coordinate and Y_coordinate columns, with the spatial reference defined as 27700.
Create a spatial index on the data using:
CREATE INDEX idx_abp_geom ON public.addressbase_plus USING gist(geom);
This creates an index named idx_abp_geom on the geom column of the same table.

Loading CSV into an Oracle database

Note - These instructions assume a basic knowledge of Oracle databases and SQLLDR (the package used to load the CSV files into the database). Other options are available for loading data into Oracle databases.

With SQLLDR, it is not necessary to merge all the AddressBase files into a single file. The data can be loaded directly from the files provided, as long as they have been unzipped first.

The following steps describe one method for loading a full supply of the data. Sections in italics denote where changes will need to be made to accommodate local file naming.

Copy the data files from the disk to an appropriate location. It is worth noting that the files will need to be unzipped, and therefore you will need in the region of 43 GB of free space.
Once the data is copied, the next stage is to unzip the *.zip files to *.csv. This can be done using a package such as WinZip or 7-Zip. Please see the data supply page for more information.
With all the files unzipped, the later stages are easier if you create a list of all the CSV files to be loaded. This can be done using a batch file that writes all the file names out to a text file:
dir *.csv /b/s >filelisting.txt
pause
This file will form the basis for populating the control file in a later step.
Go to the OS GitHub repository: https://github.com/OrdnanceSurvey/AddressBase/tree/master/Loading_Scripts/Oracle.
Open the folder of your chosen product and you should see three files. Open the file ending createtable.sql in a text editor.
Within the provided SQL there are references to <TablespaceName>, which need to be changed to the tablespace that is being worked in. When these are changed, copy and paste the SQL into Oracle to create the tables.
Next, create a SQLLDR control file. An example of one of these files is Oracle AddressBase_Control.ctl, which is provided in the folder of the GitHub repository in Step 4 above. Open the SQLLDR control file for your chosen product in a text editor.
Within the file you will see lines referencing INFILE. Populate these INFILE lines with the file listing created in Step 3, with one INFILE command for each file. This tells the process to open each of the files and carry out the other tasks listed below it. Note – The last section of the control file creates the Geometry for the X and Y coordinates (British National Grid). If you want to create a Geometry for the Latitude and Longitude values, this will need to be created separately.
Once this file is created, it can be called from a .bat file to run it on the box that holds the database rather than a remote machine. If you wish to run it from a remote machine, contact your Oracle Administrator who will be able to advise on the best way to do this within your environment. The contents of the .bat file should be similar to the following:
@sqlldr <username>/<password>@<service name> control=<name of ctl file created previously>
pause
Once the load has completed, the relevant indexes need to be built. The SQL statements to create the indexes can be found in the same GitHub repository linked in Step 4 above. As before, you can copy and paste the SQL statements from a text editor into Oracle to create the indexes. The example table name provided may be different to yours, so check whether it needs to be changed before use.

Loading CSV into Microsoft SQL Server

Note - The following instructions assume that users have basic knowledge of Microsoft SQL Server and that the CSV data is already prepared as described in Preparing the CSV data.

Note – There are many ways to load AddressBase products into Microsoft SQL Server; this is just one suggested method for guidance.

Open the SQL Server Management Studio (SSMS).
Right-click on the database you are loading into and select Properties.
Select Options on the left-hand side.
Expand the dropdown box for Recovery Model and select Bulk-logged. This minimises the logfile size; otherwise, the default logging for Microsoft SQL Server can cause logfiles to grow to over 20 GB, which can cause issues with loading.
Open the SQL Server Management Studio (SSMS) and right-click your database from the left-hand panel.
Navigate to Tasks and click Import Data. This will open the SQL Server Import and Export Wizard.
Click Next.
On the next screen, change your Data Source to Flat File Source.
Use the Browse button to navigate to your CSV file and select it. If you cannot see your files, ensure that the bottom right dropdown box has CSV files (*.csv) selected.
Click Open.
Your CSV file should already have a header row, as prepared in Preparing the CSV data. Ensure that the ‘Column names in the first data row’ checkbox is ticked.
Check that the Text Qualifier is set to a double quote (“). This is to make sure that the quotations in the raw data supply are removed upon loading but that the data remains intact.
On the left-hand side of this screen, select Columns and check that the Column delimiter is set to Comma.
On the left-hand side of the screen, select Advanced.
For each column of data you are loading, you will need to specify a DataType. The Microsoft SQL Server loader defaults each column to a String. The correct Data Type for each column is given in the technical specification.
Once you have changed the Data Types for each column to match those given in the technical specification, click Next.
Check that your table is going to be imported into the correct database and click Next.
On this screen, you can edit the default table name that Microsoft SQL Server has chosen by clicking in the destination box. For example, for AddressBase Plus, you could rename the table to [dbo].[ADDRESSBASE_PLUS].
Select Edit Mappings in the bottom right-hand corner.
In the new window, you must remove the tick in the Nullable checkbox against the UPRN column, which needs to be the Primary Key of the table and therefore cannot allow nulls. Click OK once the Primary Key alterations have been completed.
Click Next. On this screen, you can check that the Source column and Destination columns are correct.
Click Next. A summary of your import will display. If you want to continue, click Finish.
A report will be generated as your data is imported. Success should appear at the top once complete.
You may need to right-click on your database and click Refresh to see your new table listed.

Setting Primary Keys

To create a Primary Key field, you can run an SQL statement, such as this example for AddressBase Plus below. Note - the columns you are creating these constraints on cannot be null or allowed to be null.

alter table dbo.ADDRESSBASE_PLUS add primary key ([UPRN]);

Creating the point geometry

You can also create point geometry using the X and Y coordinates or the Latitude and Longitude coordinate values. This is achieved by running the following SQL statement:

alter table dbo.ADDRESSBASE_PLUS
add geometry_column as geometry::Point([X_Coordinate],[Y_Coordinate], 27700);

Note – This is using British National Grid coordinates, with 27700 representing the spatial reference of the data. To use the Latitude and Longitude coordinate, the spatial reference should be set to 4258 for ETRS89.

Working with GML data

Loading GML

GML is an XML dialect which can be used to model geographic features. It was designed by the Open Geospatial Consortium (OGC) as a means for people to share information regardless of the applications or technology that they use.

In the first instance, GML was used to overcome the differences between different GIS applications by providing a neutral file format as an alternative to proprietary formats. Because it is independent of applications, it can also be moved between databases or other types of application, which allows a wider application than just GIS data transfer.

GML data can be viewed and loaded into a database using software such as Safe FME: https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/gml/gml.htm

Working with COU data

All the AddressBase products are available as a full supply or as a Change-Only Update (COU). A COU means you will only be supplied with the features which have changed since your last supply. The following sections provide guidance on how you could potentially manage a COU supply of AddressBase and AddressBase Plus data.

If you receive a tile supply, you will receive Change Chunks. This means if a record within your tile has changed, all of the records in that tile will be provided to you as inserts, and no updates or deletes will be issued.

Tiles are only available for GB supplies, so this does not apply to AddressBase Plus Islands.

Types of change

At a high-level, there are three types of change found within a COU:

Deletes (CHANGE_TYPE ‘D’) are objects that have ceased to exist in your AOI since the last product refresh.
Inserts (CHANGE_TYPE ‘I’) are objects that have been newly inserted into your AOI since the last product refresh.
Updates (CHANGE_TYPE ‘U’) are objects that have been updated in your AOI since the last product refresh.

High-level COU implementation model

The diagram below shows how to implement an AddressBase, AddressBase Plus and AddressBase Plus Islands COU within a database.

High-level COU implementation model – with archiving

Before a COU is applied, there may be a business requirement to archive existing address records. The table below shows how to implement archiving with an AddressBase COU within a database.

Applying COU to tables

Within AddressBase and AddressBase Plus there will be no records with the same UPRN. This can be tested by checking the number of records that have the same UPRN. The following SQL code would notify you of any duplicates:

SELECT uprn, COUNT(uprn) AS NumOccurrences FROM addressbase_plus
GROUP BY uprn
HAVING ( COUNT(uprn) > 1 );

This query should return 0 rows, and this confirms that there are no duplicates. As there are no duplicate records, we can use the UPRN to apply the COU.

Once confirmed, the following steps can be taken to apply the COU (without archiving):

Initially delete the existing records that will be updated and deleted:

DELETE FROM addressbaseplus WHERE uprn IN (SELECT uprn FROM addressbaseplus_cou WHERE change_type != 'I');

Insert the new updated records and the new inserted records:

INSERT INTO addressbaseplus SELECT * FROM addressbaseplus_cou WHERE change_type != 'D';

Where there is a business requirement to keep the records that are being Updated and Deleted in a separate archive table, the following SQL will create an Archive Table and populate it with the records that are about to be Updated or Deleted from the live AddressBase or AddressBase Plus table.

If this table already exists, you can simply use INSERT INTO rather than CREATE TABLE.

CREATE TABLE addressbaseplus_archive AS SELECT * FROM addressbaseplus
WHERE uprn IN (SELECT uprn FROM addressbaseplus_cou WHERE change_type != 'I');

The following command then deletes the records from the existing table, which are either updates or deletions:

DELETE FROM addressbaseplus
WHERE uprn IN (SELECT uprn FROM addressbaseplus_cou WHERE change_type != 'I');

The following command then inserts the new insert records and the new updated records into the live table:

INSERT INTO addressbaseplus SELECT * FROM addressbaseplus_cou WHERE change_type != 'D';
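
The archive, delete and insert sequence above can be tried out safely before it is run against a live table. The following is a minimal Python sketch that runs the same three statements against an in-memory SQLite database. The three-column tables and the sample rows are hypothetical stand-ins for the real schema, and SQLite is used only because it needs no server; the guide itself targets PostgreSQL, Oracle and Microsoft SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Cut-down, hypothetical tables: the real product tables have many more columns.
cur.execute("CREATE TABLE addressbaseplus (uprn INTEGER, change_type TEXT, postcode TEXT)")
cur.execute("CREATE TABLE addressbaseplus_cou (uprn INTEGER, change_type TEXT, postcode TEXT)")
cur.executemany("INSERT INTO addressbaseplus VALUES (?, ?, ?)",
                [(1, "I", "SE99 1AA"), (2, "I", "SE99 1AB"), (3, "I", "SE99 1AC")])
cur.executemany("INSERT INTO addressbaseplus_cou VALUES (?, ?, ?)",
                [(2, "U", "SE99 2ZZ"), (3, "D", "SE99 1AC"), (4, "I", "SE99 1AD")])

# UPRNs in the COU that are updates or deletes (everything except inserts).
changed = "SELECT uprn FROM addressbaseplus_cou WHERE change_type != 'I'"

# 1. Archive the live records that are about to be updated or deleted.
cur.execute("CREATE TABLE addressbaseplus_archive AS "
            f"SELECT * FROM addressbaseplus WHERE uprn IN ({changed})")
# 2. Delete those records from the live table.
cur.execute(f"DELETE FROM addressbaseplus WHERE uprn IN ({changed})")
# 3. Insert the new and updated records from the COU.
cur.execute("INSERT INTO addressbaseplus "
            "SELECT * FROM addressbaseplus_cou WHERE change_type != 'D'")
conn.commit()

live = cur.execute("SELECT uprn, postcode FROM addressbaseplus ORDER BY uprn").fetchall()
archived = cur.execute("SELECT uprn FROM addressbaseplus_archive ORDER BY uprn").fetchall()
# Re-run the duplicate check after the COU has been applied.
duplicates = cur.execute(
    "SELECT uprn FROM addressbaseplus GROUP BY uprn HAVING COUNT(uprn) > 1").fetchall()
```

In this sketch, UPRN 2 is replaced by its updated record, UPRN 3 is removed, UPRN 4 is added, and the old versions of UPRNs 2 and 3 end up in the archive table. On a production database, it would be sensible to run the three statements inside a single transaction so that a failure part-way through does not leave the live table incomplete.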

Creating a single-line or multi-line address

The AddressBase products provide a variety of data fields, allowing you to construct different forms of an address for a given addressable object, dependent on how the address is to be used.

AddressBase contains the Delivery Point Address which is sourced from Royal Mail’s Postcode Address File (PAF) – a non-geocoded list of addresses. These addresses are used primarily as a ‘mailing list’ for postal purposes.

There are two types of address contained in the AddressBase Plus products:

Delivery Point Address
Geographic Address

These two address types come from different sources and are matched together by GeoPlace.

As noted above, the Delivery Point Address is sourced from Royal Mail’s PAF data. Geographic Addresses are maintained by contributing Local Authorities. The structure of a Geographic Address is based on the British Standard BS7666. These addresses are used to provide an accurate geographic locator for an object to support, for example, service delivery, asset management, or command and control operations. They also represent the legal form of addresses as created under street naming and numbering legislation.

Each UPRN in AddressBase Plus provides the Geographic Address and, where matched, the Delivery Point Address in a one-to-one relationship. If there is no match, then the following fields will be left empty:

DEPARTMENT_NAME

Background

A common requirement for customers using the AddressBase products is to build a single address label from core address elements.

There are two types of address label. The simplest is a full address on a single line with different elements separated by commas and spaces. This type of label is suited for displaying a full address within a tabular display, such as within an on-screen data grid or spreadsheet, or where a single-line printed address is most appropriate (such as within the text, header or footer of a letter):

ROSE COTTAGE, 5 MAIN STREET, ADDRESSVILLE, LONDON, SE99 9EX

The other type of formatted address is a multi-line address label. These are most often used on envelopes or at the tops of letters, where different parts of an address are separated onto different lines:

The rules in this guide are suggestions only and can be used for visual display of full addresses. It is strongly recommended that address components are stored in the format in which they are provided in order to allow maximum flexibility of use and derived value.

Delivery Point Address (PAF Address)

A Delivery Point Address contains information sourced from Royal Mail (PAF). Stringent rules are used to match these addresses to the Geographic Address and assign a common UPRN to link addresses from the two addressing sources together in the data model.

To construct a single address label based purely on the Royal Mail PAF address fields, the following attributes can be used to build a Delivery Point Address label.

The table below provides details of the Delivery Point Address Components.

These address components are listed in the correct order in which they should appear on an address label. There may be a business need to replace the thoroughfare, locality and post_town attributes with the Welsh equivalent. The following examples use the English version of these attributes only.

It should be noted that most of the PAF fields are optional and may contain null values (or zero, in the cases of ‘BUILDING NUMBER’ and ‘PO BOX NUMBER’). In these cases, those fields should be omitted.

The following (entirely fictional) example shows all of the PAF fields filled in (apart from the PO Box number) and indicates how they should be ordered in a single address label.

In cases where a PO BOX NUMBER is present, it will only be described in the data as an integer. In order to properly format these addresses when generating an address label, these integers should be prefixed with the text ‘PO BOX’, as shown in the following example:

Where null or empty string values exist (for character fields) or zeros or nulls (for integer fields), those fields should be entirely omitted from the output. However, the order in which the fields should be concatenated always remains the same.

Building a single-line Delivery Point Address

Building a single-line, formatted address for a Delivery Point is relatively straightforward. All the fields should be checked in the order shown previously in Table 1, and those that have values should be concatenated together into a single line. Generally, address components should be separated by a comma followed by a single space (‘, ’), although sometimes only a space is used between a building number and a thoroughfare name. You can use your preference.

The SQL operator for concatenating text is a double pipe (‘||’).
CASE blocks have been used to test each of the fields for null values before concatenating its contents (along with a suitable separator – either ‘, ’ or ‘ ’).
The field names and table names used are illustrative and may vary between databases.
Depending on the database schema and data loading method used, it may be necessary to test some fields for empty strings (‘’) or zero values (for integer fields) instead of, or as well as, testing for NULLs.
If you are using PostgreSQL (PostGIS), it might be beneficial to replace the ‘IS NOT NULL’ test with != ‘’, so that empty strings are skipped as well as NULLs. This should improve the overall appearance of the output.
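
The same test-then-concatenate logic can also be applied outside the database. The following is a minimal Python sketch, not a definitive implementation. The function names and the lower-case field names are illustrative. The field order in DPA_FIELDS is an assumption, because the component table is not reproduced in this section, so adjust it to match that table. A space is used between the building number and the following component, which is one of the two separator styles described above.

```python
from typing import Mapping

# Assumed label order; adjust to match the Delivery Point Address component table.
DPA_FIELDS = (
    "organisation_name", "department_name", "sub_building_name", "building_name",
    "building_number", "po_box_number", "dependent_thoroughfare", "thoroughfare",
    "double_dependent_locality", "dependent_locality", "post_town", "postcode",
)


def is_blank(value) -> bool:
    """True for NULL, empty or whitespace-only strings, and zero (integer fields)."""
    return value is None or value == 0 or str(value).strip() == ""


def single_line_dpa(record: Mapping[str, object], number_sep: str = " ") -> str:
    """Concatenate the populated fields, in order, into a single-line label."""
    pieces = []
    for field in DPA_FIELDS:
        value = record.get(field)
        if is_blank(value):
            continue  # omit empty fields, but keep the order of the rest
        text = str(value).strip()
        if field == "po_box_number":
            text = "PO BOX " + text  # stored as an integer only
        sep = number_sep if field == "building_number" else ", "
        pieces.append((text, sep))
    if not pieces:
        return ""
    # Every piece except the last is followed by its separator.
    return "".join(text + sep for text, sep in pieces[:-1]) + pieces[-1][0]


example = {
    "building_name": "ROSE COTTAGE", "building_number": 5, "po_box_number": 0,
    "thoroughfare": "MAIN STREET", "dependent_locality": "ADDRESSVILLE",
    "post_town": "LONDON", "postcode": "SE99 9EX",
}
print(single_line_dpa(example))
# prints: ROSE COTTAGE, 5 MAIN STREET, ADDRESSVILLE, LONDON, SE99 9EX
```

Passing number_sep=", " gives the alternative style in which the building number is followed by a comma.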

Building a multi-line Delivery Point Address

Splitting a Delivery Point Address into multiple lines is more complicated. There are several rules to consider in order to avoid having very short lines (for example, just a building number) or very long lines within the formatted address. A summary of these rules is as follows:

Generally, if there is a building number, it should appear on the same line as the thoroughfare (or dependent thoroughfare) name. If there is no thoroughfare name information, it should appear on the same line as the first locality name.
In cases where building numbers have been placed in the building name field due to the presence of a letter suffix (for example, ‘11A’) or a number range separator (for example, ‘3–5’), these should be detected and placed on the same line as the thoroughfare name in the same way as a building number (or on the first locality line if no thoroughfare name is present).
In most other cases, the building name, if present, should appear on a separate line above the thoroughfare (or dependent thoroughfare) name. If there is no thoroughfare name present, it should appear on the same line as the first locality name.
Similar tests should be applied to the SUB_BUILDING_NAME field: if this field contains a number, a number with a suffix, or a numeric range, it should precede the building name on the same line. In most other cases, it should appear on a separate line above the building name.
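
The rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a complete implementation. The function names, the lower-case field names and the NUMBER_LIKE pattern (a number, an optional letter suffix and an optional range) are all hypothetical. Organisation, department and PO Box fields are left out for brevity. As a simplification of the third rule, a non-numeric building name is always kept on its own line, and a number-like sub-building name with no building name line to join is placed on the street line.

```python
import re
from typing import List, Mapping

# Matches values such as 11, 11A, 3-5 or 3–5 (number, optional suffix, optional range).
NUMBER_LIKE = re.compile(r"^\d+[A-Z]?(\s*[-–]\s*\d+[A-Z]?)?$", re.IGNORECASE)

# Lines below the premises elements, in label order.
TAIL_FIELDS = ("dependent_thoroughfare", "thoroughfare", "double_dependent_locality",
               "dependent_locality", "post_town", "postcode")


def clean(value) -> str:
    """Return a trimmed string, treating NULL and zero as empty."""
    return "" if value is None or value == 0 else str(value).strip()


def multi_line_dpa(rec: Mapping[str, object]) -> List[str]:
    sub = clean(rec.get("sub_building_name"))
    name = clean(rec.get("building_name"))
    number = clean(rec.get("building_number"))
    tail = [t for t in (clean(rec.get(f)) for f in TAIL_FIELDS) if t]

    prefix = [number] if number else []  # tokens that share the street line
    head: List[str] = []                 # lines above the street line

    if name and NUMBER_LIKE.match(name):
        prefix.insert(0, name)           # for example '11A' or '3–5'
    elif name:
        head.append(name)

    if sub and NUMBER_LIKE.match(sub):
        if head:
            head[0] = f"{sub} {head[0]}"  # precede the building name on its line
        else:
            prefix.insert(0, sub)
    elif sub:
        head.insert(0, sub)              # separate line above the building name

    if prefix:
        # Join the number(s) to the thoroughfare, or to the first locality if none.
        if tail:
            tail[0] = " ".join(prefix + [tail[0]])
        else:
            tail = [" ".join(prefix)]
    return head + tail
```

For example, a record with a sub-building name of FLAT 1, a building name of ROSE COTTAGE, a building number of 5 and a thoroughfare of MAIN STREET produces the lines FLAT 1, ROSE COTTAGE and 5 MAIN STREET, followed by the locality, post town and postcode lines.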

Geographic Address (Local Authority Address)

The structure of a Geographic Address is based on the British Standard BS7666 and is split into several components. This means that in order to construct a complete address label (for example, on an envelope, database form or GIS display), the components need to be constructed according to a set of rules.

Within the AddressBase Plus products, the core property level address information is stored within the Primary Addressable Object (PAO) and Secondary Addressable Object (SAO) fields. The additional attributes required to build a full address label are the la_organisation, street_description, locality, town_name, administrative_area and postcode_locator.

For a full description of PAOs and SAOs, and the complete set of AddressBase Plus fields, please refer to the Technical Specification for your respective product.

Constructing a single address label from the Geographic Address fields

To construct a single address label based purely on the BS7666 address fields, the following attributes should be used to build a Geographic Address label.

*ADMINISTRATIVE_AREA is optional because it is common for this field to be the same as the TOWN_NAME. Sometimes, however, this field will help users construct a more complete address.

These address components are listed in the correct order in which they should appear on an address label. There may be a business need to use alternate language fields for the SAO_TEXT, PAO_TEXT and STREET_DESCRIPTION, which are also listed in the correct order above.

Rendering SAOs and PAOs

When building a single address label, it may be necessary to concatenate the various SAO fields and PAO fields together respectively. These fields contain any property names, numbers, number ranges or suffixes that apply to an address.

A PAO number/range string should be constructed from the PAO_START_NUMBER, PAO_START_SUFFIX, PAO_END_NUMBER and PAO_END_SUFFIX fields, as illustrated in the following table.

Similarly, a SAO number/range string should be constructed from the SAO_START_NUMBER, SAO_START_SUFFIX, SAO_END_NUMBER and SAO_END_SUFFIX fields, as illustrated in the following table.

In addition to the numeric range fields described above, there are also PAO_text and SAO_text fields. These fields may be populated instead of, or as well as, the numeric range fields. In both cases, if both text and a numeric range string are present, the text should appear before the numeric range in any formatted address.

For PAOs, there will always be either a text entry or a numeric/range entry, or both. This is not the case for SAOs, which may be entirely absent for a given address.
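
The construction of a number/range string can be sketched in Python as follows. The function name number_range is hypothetical, and the hyphen used to join the start and end of a range is an assumption, since the tables are not reproduced in this section. The same function serves both the PAO and the SAO fields.

```python
def number_range(start_number, start_suffix, end_number, end_suffix, sep="-") -> str:
    """Build a string such as '11', '11A', '3-5' or '11A-15B' from the four fields."""

    def part(number, suffix) -> str:
        # NULL, zero and empty values mean that the number is absent.
        if number in (None, 0, ""):
            return ""
        return f"{number}{suffix or ''}"

    start = part(start_number, start_suffix)
    end = part(end_number, end_suffix)
    if start and end:
        return f"{start}{sep}{end}"
    return start or end


print(number_range(11, "A", None, None))  # prints: 11A
print(number_range(3, None, 5, None))     # prints: 3-5
print(number_range(11, "A", 15, "B"))     # prints: 11A-15B
```

An empty string is returned when none of the four fields is populated, which is always possible for SAOs and is possible for PAOs that have a text entry only.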

Street description, town, locality and administrative area names

The street description and administrative area names are always present, while the locality name and town name may be empty.

The ADMINISTRATIVE_AREA field always contains a value, but this value only enhances an address in some cases. In particular, check that it is not the same as the value in the TOWN_NAME field, as this is often the case.

In other cases, the administrative area name will simply contain the local authority name, which would not traditionally form part of a single or multi-line address but can be included to add additional information to an address label. Its inclusion is largely down to business requirements or personal preference; however, it may also be useful to 'de-duplicate' some Geographic Addresses.

The following (entirely fictional) example shows all of the BS7666 Geographic Address fields filled in and how they should be ordered in a single address label.

*The number/range strings are built from the relevant PAO/SAO START_NUMBER, START_SUFFIX, END_NUMBER and END_SUFFIX fields, as described above, and formatted as character strings.

Where an administrative area matches the town name, it should always be omitted.

Where null or empty string values exist (for character fields) or zeros or nulls (for integer fields), those fields should be entirely omitted from the output; however, the order in which the fields should be concatenated always remains the same.

Building a single-line Geographic Address

The SQL operator for concatenating text is a double pipe (‘||’).
CASE blocks have been used to test each of the fields for null values before concatenating its contents (along with a suitable separator – either ‘, ’ or ‘ ’).
The field names and table names used are illustrative and may vary between databases.
Depending on the database schema and data loading method used, it may be necessary to test some fields for empty strings (‘’) or zero values (for integer fields) instead of or as well as testing for NULLs.
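
The same approach can be expressed outside the database. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the lower-case field names follow those used in this guide, a hyphen joins the two ends of a number range, and a comma separates SAO text from an SAO number/range string. The administrative area is omitted when it matches the town name.

```python
from typing import Mapping


def number_range(start_number, start_suffix, end_number, end_suffix) -> str:
    """Build '11A', '3-5' and similar strings; the hyphen is an assumed separator."""
    def part(number, suffix):
        return "" if number in (None, 0, "") else f"{number}{suffix or ''}"
    start, end = part(start_number, start_suffix), part(end_number, end_suffix)
    return f"{start}-{end}" if start and end else start or end


def single_line_geographic(rec: Mapping[str, object]) -> str:
    def txt(key: str) -> str:
        return str(rec.get(key) or "").strip()

    sao_range = number_range(rec.get("sao_start_number"), rec.get("sao_start_suffix"),
                             rec.get("sao_end_number"), rec.get("sao_end_suffix"))
    pao_range = number_range(rec.get("pao_start_number"), rec.get("pao_start_suffix"),
                             rec.get("pao_end_number"), rec.get("pao_end_suffix"))
    # The PAO number/range shares a component with the street, separated by a space.
    street = " ".join(p for p in (pao_range, txt("street_description")) if p)
    # Omit the administrative area when it duplicates the town name.
    admin = txt("administrative_area")
    if admin.upper() == txt("town_name").upper():
        admin = ""
    # Text entries come before number/range strings.
    parts = [txt("la_organisation"), txt("sao_text"), sao_range, txt("pao_text"),
             street, txt("locality"), txt("town_name"), admin, txt("postcode_locator")]
    return ", ".join(p for p in parts if p)


example = {
    "sao_text": "FLAT 2", "pao_text": "ROSE COTTAGE",
    "pao_start_number": 11, "pao_start_suffix": "A",
    "street_description": "MAIN STREET", "locality": "",
    "town_name": "ADDRESSVILLE", "administrative_area": "ADDRESSVILLE",
    "postcode_locator": "SE99 9EX",
}
print(single_line_geographic(example))
# prints: FLAT 2, ROSE COTTAGE, 11A MAIN STREET, ADDRESSVILLE, SE99 9EX
```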

Building a multi-line Geographic Address

Splitting a Geographic Address into multiple lines is more complex. As with Delivery Point Addresses, there are several rules to consider in order to avoid having very short lines (for example, just a building number) or very long lines within the formatted address.

A summary of these rules is as follows:

Generally, if there is a PAO number/range string, it should appear on the same line as the Street Description. For example: 11A MAIN STREET
If there is a PAO_text value, it should always appear on the line above the Street Name (or on the line above the <PAO number string> + <Street Name> where there is a PAO number/range).
If there is a SAO_text value, it should appear on a separate line above the PAO_text line (or the PAO number/range + Street Name where there is no PAO_text value).
If there is a SAO number/range value, it should be inserted either on the same line as the PAO_text (if there is a PAO_text value), or on the same line as the PAO number/range + Street Name (if there is only a PAO number/range value and no PAO_text value). If there are both PAO_text and a PAO number/range, then the SAO number/range should appear on the same line as the PAO_text, and the PAO number/range should appear on the street line.
If there is a SAO_text value, it should always appear on its own line.
If there is an Organisation Name, it should always appear alone as the top line of the address.
The Locality (if present) should appear on a separate line beneath the Street Description, followed by the Town Name on the line below it. If there is no Locality, the Town Name should appear alone on the line beneath the Street Description.
If the Administrative Area name is required and it is not a duplicate of the Town Name, it can optionally be included on a separate line beneath the Town Name.
Finally, the Postcode Locator should be inserted on the final line of the address.
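
The rules above can be sketched in Python as follows. This is an illustration only. The function name is hypothetical, the lower-case field names follow those used in this guide, and the PAO and SAO number/range strings are assumed to have been built already and are passed in as plain strings.

```python
from typing import List, Mapping


def multi_line_geographic(rec: Mapping[str, object], pao_range: str = "",
                          sao_range: str = "",
                          include_admin_area: bool = False) -> List[str]:
    def txt(key: str) -> str:
        return str(rec.get(key) or "").strip()

    # Organisation name alone on the top line, then SAO text on its own line.
    lines = [txt("la_organisation"), txt("sao_text")]

    street_tokens = [pao_range, txt("street_description")]
    if txt("pao_text"):
        # The SAO number/range shares the PAO text line.
        lines.append(" ".join(p for p in (sao_range, txt("pao_text")) if p))
    else:
        # No PAO text: the SAO number/range joins the PAO number and street line.
        street_tokens.insert(0, sao_range)
    lines.append(" ".join(t for t in street_tokens if t))

    # Locality (if present), then town name, each on a separate line.
    lines += [txt("locality"), txt("town_name")]
    admin = txt("administrative_area")
    if include_admin_area and admin.upper() != txt("town_name").upper():
        lines.append(admin)
    # The postcode locator always goes on the final line.
    lines.append(txt("postcode_locator"))
    return [line for line in lines if line]
```

For example, with an organisation of EXAMPLE LTD, SAO text of FIRST FLOOR, an SAO number string of 2, PAO text of ROSE COTTAGE, a PAO number string of 11A and a street of MAIN STREET, the first four lines are EXAMPLE LTD, FIRST FLOOR, 2 ROSE COTTAGE and 11A MAIN STREET.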

Creating mailing lists

Given that AddressBase Plus contains two different types of address, a decision needs to be made as to whether to use the Geographic or Delivery Point Addresses, or a mixture.

The following two options should be considered:

Use Delivery Point Addresses whenever they are available, and when they are not, use a Geographic Address.
Use Geographic Addresses in all cases.

Depending on business requirements, in some user interfaces it may be worth considering displaying both forms of an address where possible, since this will provide the maximum information available about a given UPRN.

‘Mixing and matching’ components from the two different forms of address into a single address label is not recommended as this is likely to cause confusion in some instances.
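
The two options above can be sketched in Python as follows. The function name choose_label is hypothetical, and both labels are assumed to have been formatted already, with an empty string or None meaning that no Delivery Point Address was matched for the UPRN. A complete label of one type is always returned, so components are never mixed between the two address forms.

```python
from typing import Optional


def choose_label(delivery_point_label: Optional[str], geographic_label: str,
                 prefer_delivery_point: bool = True) -> str:
    """Return one complete label; never mix components from the two address types."""
    if prefer_delivery_point and delivery_point_label:
        return delivery_point_label  # option 1: use the PAF address when matched
    return geographic_label          # fallback, or option 2: always Geographic
```

Setting prefer_delivery_point to False corresponds to the second option, in which Geographic Addresses are used in all cases.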

Other filters

AddressBase Plus offers other attributes that could be used in conjunction with address labels. For example, classification can be used to target certain types of property, or OS MasterMap Topography TOID cross references can be used to link address labels to Topographic objects and viewed in a GIS.

TOID cross references are not available in AddressBase Plus Islands.

Searching for addresses

A common requirement for customers using the AddressBase products is to search for properties using full or partial addresses. Address searches may return a large number of addresses, a short list of possibilities, a single match or no results, depending on the search criteria.

There are many methods of implementing an address search, from free text queries through to structured address component searches. This guide will step through two such approaches that may be used when working with AddressBase and/or AddressBase Plus.

These methods are not intended as recommendations; they are merely examples of how to get maximum value out of the product when implementing an address search function.

Free text search

One type of search implementation involves a single ‘search engine’ style text box, into which a user can type all or some of an address. For example:

Find address

Results

In this scenario, the user can choose to type anything in Find address, which may be just one component of an address (for example, a postcode, street name or building name), several parts of an address (for example, street name + town name, house name + postcode, etc.) or even (rarely) a complete address.

There may or may not be commas between search items, or address components can be entered with or without capitalised letters, etc. In short, with this search method, there is no structure to the user input and the search methodology must be designed with this in mind.

Structured component search

The other common type of implementation for address searches involves entering search criteria in a structured way (for example, with a different text box for each major address component).

This method guides the user to enter known components of an address and creates a predictable user input structure around which to build a search function. While generally simpler to use and implement, it can be less user-friendly, particularly in cases where it is not obvious which box to type an address component into, for example, is Richmond Terrace a building name or a street?

The search operation

This guide suggests how to implement the two search methods described above. Both should be used alongside the instructions on formatting single address labels.

The methods described here may be adapted to work with AddressBase Plus, AddressBase Plus Islands and AddressBase; however, in the case of AddressBase, only Delivery Point Addresses are searchable, so the geographic guidance will not apply to this product.

An address search operation typically requires two stages of interaction from a user and several processing steps from the underlying IT system. These steps can be summarised in the following diagram:

The second user interaction can be omitted if there is only one result returned from the query. In almost all cases, there should be an option to ‘search again’ at the second and third stages in case no results are returned, or if none of the options shown is the required address.

Of course, different applications require different approaches; however, the general principles of the above process apply in all cases where an address is searched for based on user-entered criteria.

Generating a search query from structured user input

Within an interface that accepts structured user input for an address search, it is necessary to ‘map’ the fields presented to the user with those found within AddressBase or AddressBase Plus. In particular, any query will need to test multiple fields for a given input and will need to combine result sets from the two different address formats of AddressBase Plus (or the single address format of AddressBase) in order to produce the most complete result set.

Generally, a search form will describe a simplified view of an address in order to keep the user interface tidy and intuitive. Users may be given a set of text boxes to fill in, generally including building name, building number, street name, locality name, town name and postcode. The relationships between some common search fields and the fields found in AddressBase Plus are as follows:

The above mapping is an example only, and it is possible to break down the search fields differently, in which case a different mapping would be required. The important thing is to consider all possibilities for how data might be recorded. For example, a business name can sometimes appear as an organisation name or a building/PAO name depending on circumstances, so both must be checked when creating a search query.

Numbers need to be handled very carefully due to the presence of suffixes and ranges. There are two options for structuring the search input in these cases:

A single ‘number’ box can be used (as shown above in Flat/Subdivision Number and Building Number), which will then require some string manipulation to split the input into the appropriate numeric range and suffix components in order to search the geographic addresses; or
Four boxes can be provided for each number (start number, start suffix, end number and end suffix), which would then need to be combined into an appropriate string to search the Delivery Point Addresses.
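The string manipulation needed for the single ‘number’ box option can be sketched as follows. This is a minimal illustration only: the accepted input styles ("12", "12A", "12-14", "12A-14B") and the names split_number, join_number and NumberRange are assumptions made for this example, not part of the AddressBase products.

```python
import re
from typing import NamedTuple, Optional


class NumberRange(NamedTuple):
    start_number: int
    start_suffix: str
    end_number: Optional[int]
    end_suffix: str


# Assumed input styles: "12", "12A", "12-14" and "12A-14B".
_NUMBER_PATTERN = re.compile(
    r"^\s*(\d+)\s*([A-Za-z]?)\s*(?:-\s*(\d+)\s*([A-Za-z]?))?\s*$"
)


def split_number(text: str) -> Optional[NumberRange]:
    """Split a single 'number' box into range and suffix components."""
    match = _NUMBER_PATTERN.match(text)
    if match is None:
        return None  # not a recognisable number; the caller decides what to do
    start, start_suffix, end, end_suffix = match.groups()
    return NumberRange(
        int(start),
        start_suffix.upper(),
        int(end) if end else None,
        (end_suffix or "").upper(),
    )


def join_number(parts: NumberRange) -> str:
    """Combine the four components into a single string such as '12A-14B'."""
    text = f"{parts.start_number}{parts.start_suffix}"
    if parts.end_number is not None:
        text += f"-{parts.end_number}{parts.end_suffix}"
    return text
```

split_number covers the first option (one box, split for the geographic address search) and join_number covers the second (four boxes, combined for the Delivery Point Address search).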

Structuring the query for a structured address search

The basic rules to adhere to when generating a search query from structured input are as follows:

Ignore any search boxes that are not filled in with values.
Where a value is entered, assume that a match on at least one of the mapped fields is essential.

In SQL query terms, this means that each search term should generate a sub-query that searches each of the mapped fields (using OR), and that these sub-queries should then be combined together (using AND) into a single search query. The following SQL code illustrates this (for the Delivery Point Address search only) for an example where a street, locality and town name have been entered by the user:

SELECT dp.UPRN, GetFormattedAddress(dp.*) FROM abp dp

WHERE (dp.thoroughfare = streetsearchtext OR dp.dependent_thoroughfare = streetsearchtext) AND (dp.dependent_locality = localitysearchtext OR dp.double_dependent_locality = localitysearchtext) AND (dp.dependent_locality = townsearchtext OR dp.post_town = townsearchtext)
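The two rules above lend themselves to generating the WHERE clause programmatically. The following is a minimal sketch that assumes the field mapping used in the example query; FIELD_MAP and build_where are hypothetical names, and the "?" placeholder style is that of Python's sqlite3 module (other database drivers may use a different placeholder).

```python
from typing import Dict, List, Tuple

# Search box -> mapped Delivery Point Address fields (taken from the example query).
FIELD_MAP: Dict[str, List[str]] = {
    "street": ["thoroughfare", "dependent_thoroughfare"],
    "locality": ["dependent_locality", "double_dependent_locality"],
    "town": ["dependent_locality", "post_town"],
}


def build_where(inputs: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return a WHERE clause body and its parameters for the filled-in boxes."""
    groups: List[str] = []
    params: List[str] = []
    for box, columns in FIELD_MAP.items():
        value = inputs.get(box, "").strip()
        if not value:
            continue  # rule 1: ignore search boxes that are not filled in
        # rule 2: at least one of the mapped fields must match (OR)
        groups.append("(" + " OR ".join(f"{col} = ?" for col in columns) + ")")
        params.extend([value] * len(columns))
    # the per-box sub-queries are combined with AND
    return " AND ".join(groups), params


where, params = build_where(
    {"street": "High Street", "locality": "", "town": "Westville"}
)
# If every box is empty, `where` is an empty string and no query should be run.
sql = f"SELECT uprn FROM abp WHERE {where}"
```

Passing the search text as parameters rather than concatenating it into the SQL string avoids quoting problems and SQL injection.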

On top of this, for a complete query, the two different types of addresses should be queried separately (Geographic and Delivery Point Addresses), and the two result sets should be amalgamated into a single set using a UNION. The following example builds upon the previous example to include Geographic Addresses as well as Delivery Point Addresses.

SELECT UPRN, GetFormattedAddress(*) FROM abp

WHERE (thoroughfare = streetsearchtext OR dependent_thoroughfare = streetsearchtext) AND (dependent_locality = localitysearchtext OR double_dependent_locality = localitysearchtext) AND (dependent_locality = townsearchtext OR post_town = townsearchtext)

UNION

SELECT geo.UPRN, GetFormattedAddress(geo.*) FROM abp geo

WHERE (geo.street_name = streetsearchtext OR geo.pao_text = streetsearchtext) AND (geo.locality = localitysearchtext OR geo.town = localitysearchtext OR geo.street_name = localitysearchtext) AND (geo.town = townsearchtext OR geo.locality = townsearchtext)

The SQL UNION operator will combine the two result sets, discarding any exact duplicates. (Retaining the exact duplicates requires the use of UNION ALL, but that is not desirable in this example.)

The resulting output from this query will be a set of search results as formatted addresses along with their UPRN. Exact duplicates will be omitted, but all ‘variations’ of the same address will be output (one row for each variation, with the same UPRN repeated more than once potentially). It may be wise to return the Postal Address Flag values against each to enable further filtering, for example, to restrict the results to postal addresses only. Note that the Postal Address Flag is only available in AddressBase Plus. All records in AddressBase are deemed postal as they are from Royal Mail’s PAF data.
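The behaviour of UNION described above can be checked with a small in-memory example. This sketch uses Python's sqlite3 module and two made-up tables (dp and geo) standing in for the Delivery Point and Geographic forms of an address; the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dp (uprn INTEGER, address TEXT);
    CREATE TABLE geo (uprn INTEGER, address TEXT);
    INSERT INTO dp VALUES (1, '4 HIGH STREET WESTVILLE');
    INSERT INTO geo VALUES (1, '4 HIGH STREET WESTVILLE');
    INSERT INTO geo VALUES (1, 'ROSE COTTAGE 4 HIGH STREET WESTVILLE');
    """
)

# UNION drops the exact duplicate but keeps the variation with the same UPRN.
union_rows = conn.execute(
    "SELECT uprn, address FROM dp UNION SELECT uprn, address FROM geo"
).fetchall()

# UNION ALL keeps every row, including the exact duplicate.
union_all_rows = conn.execute(
    "SELECT uprn, address FROM dp UNION ALL SELECT uprn, address FROM geo"
).fetchall()
```

Here union_rows holds two rows for UPRN 1 (one per address variation), while union_all_rows holds three.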

Supporting case-insensitive queries and partial matches

A flaw in the above examples is the use of equality operators. In practice, because people do not tend to be consistent with capitalisation of letters, the SQL ‘LIKE’ operator might work better, and depending on the nature of the application, a ‘%’ wildcard could be appended to the end of each search term to allow only the first few letters of an address component to be entered. For example:

post_town LIKE townsearchtext -- Case-insensitive search in some databases
post_town LIKE (townsearchtext || '%') -- Matches post towns that start with the search text
post_town LIKE ('%' || townsearchtext || '%') -- Matches post towns that contain the search text

Alternatively, if exact matches are required but case sensitivity is not, then the UPPER() or LOWER() SQL functions can be used on each side of the equals sign in comparisons (a solution that should work in all databases):

UPPER(post_town) = UPPER(townsearchtext) -- Case insensitive equality;

Finally, to combine all of the approaches, the following would work for maximum flexibility:

UPPER(post_town) LIKE ('%' || UPPER(townsearchtext) || '%')
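As a rough illustration of the combined approach, the following sketch runs a case-insensitive ‘contains’ match against an in-memory SQLite table. The table contents are made up and find_by_town is a hypothetical helper; only the post_town column name and the abp table name come from the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abp (uprn INTEGER, post_town TEXT)")
conn.executemany(
    "INSERT INTO abp VALUES (?, ?)",
    [(1, "WESTVILLE"), (2, "Eastville"), (3, "NORTHTOWN")],  # made-up rows
)


def find_by_town(search_text: str) -> list:
    """Return UPRNs whose post town contains the search text, ignoring case."""
    sql = (
        "SELECT uprn FROM abp "
        "WHERE UPPER(post_town) LIKE ('%' || UPPER(?) || '%') "
        "ORDER BY uprn"
    )
    return [row[0] for row in conn.execute(sql, (search_text,))]
```

Note that any ‘%’ or ‘_’ characters typed by the user are still treated as wildcards in this sketch; an application may want to escape them before building the pattern.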

Generating a search query from unstructured user input

When offering a ‘search engine’ style search feature with just a single text box to enter search terms, a wholly different approach is required. No assumptions can be made about the order, format or style of the user input, and the data will need to be ‘indexed’ in a way that facilitates searches of this type.

Creating a search index for addresses

Search engine style searches are likely to require the creation of an additional index/lookup table for addresses. Such a table is likely to consist of just two main columns: a key value (UPRN) and a formatted address string. Additional columns may be required to allow filtering of results (such as the AddressBase Postal flag values from AddressBase Plus, which would allow the results to be filtered by different address statuses).

The following table shows a possible address index table structure:

Note how the addresses have been formatted as a single text string with a single space between each word (although leaving commas in would do no harm). All forms of each address (both PAF and geographic) have been added to the index, so there can be several rows with the same UPRN. To speed up complex searching, an appropriate index could be added to the Address Text field, such as a full text search index.
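One possible way to populate such an index is sketched below using Python and an in-memory SQLite database. The table and column names (AddressSearchIndex, UPRN, AddressText) follow the query example in this guide; the to_index_text helper and the sample rows are assumptions made for illustration.

```python
import re
import sqlite3
from typing import Iterable, Optional


def to_index_text(components: Iterable[Optional[str]]) -> str:
    """Join address components into one upper-case string with single spaces."""
    joined = " ".join(c for c in components if c)  # skip empty components
    joined = joined.replace(",", " ")
    return re.sub(r"\s+", " ", joined).strip().upper()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AddressSearchIndex (UPRN INTEGER, AddressText TEXT)")

# Made-up example: two forms of the same address share one UPRN.
rows = [
    (1234, ["4", "High Street", None, "Westville", "WV17 1AA"]),
    (1234, ["Rose Cottage", "4 High  Street", "Westville", "WV17 1AA"]),
]
conn.executemany(
    "INSERT INTO AddressSearchIndex VALUES (?, ?)",
    [(uprn, to_index_text(parts)) for uprn, parts in rows],
)
```

A full text search index, where the database supports one, could then be added to the AddressText column.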

Structuring the query for an unstructured address search

Once a suitable search index is in place, the query itself can be put together. The basic idea is to split the user input into search terms by removing commas, double spaces, and other unnecessary whitespace and then splitting it at each single space, as follows:

User input: 4, High Street, westville, wv17

Capitalised, with commas and double-spaces removed:

4 HIGH STREET WESTVILLE WV17

Split into separate search terms:

4
HIGH
STREET
WESTVILLE
WV17

Once the user input has been pre-processed into separate search terms, a query can be generated. The key assumption in this example will be that ALL search terms must be matched against the index table to be considered as a result. This implies a query where each value is matched using an ‘AND’ operator. In order to search the whole index, the ‘LIKE’ operator will need to be used along with a ‘%’ wildcard on either side of the search text. A suitable search query for the above example would be as follows:

SELECT UPRN, AddressText FROM AddressSearchIndex 
WHERE
AddressText LIKE '%4%' AND
AddressText LIKE '%HIGH%' AND
AddressText LIKE '%STREET%' AND
AddressText LIKE '%WESTVILLE%' AND
AddressText LIKE '%WV17%';

This query would return all rows from the index table that contain all of the search terms, along with the appropriate UPRNs. The following table shows how the index table would be used in the above example to return relevant results:

This result set can then be presented to the user, who can select the most appropriate record, which can then be retrieved in full using the UPRN.

Of course, in a practical implementation, the above query would need to be dynamically generated, with a separate condition added for each search term. This example is quite a strict search query that requires all search terms to be present. Many layers of complexity could be added to allow partial and ‘fuzzy’ matches, and to return confidence scores, for example, but such enhancements are beyond the scope of this guide.
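A minimal sketch of the dynamic query generation is shown below. It follows the strict ‘all terms must match’ rule from the example; split_terms and build_search are hypothetical helper names, and the "?" placeholders assume a driver such as Python's sqlite3.

```python
from typing import List, Tuple


def split_terms(user_input: str) -> List[str]:
    """Upper-case the input, drop commas and split on any run of whitespace."""
    return user_input.replace(",", " ").upper().split()


def build_search(user_input: str) -> Tuple[str, List[str]]:
    """Build a query with one LIKE condition per search term, joined by AND."""
    terms = split_terms(user_input)
    if not terms:
        raise ValueError("no search terms supplied")
    conditions = " AND ".join("AddressText LIKE ?" for _ in terms)
    sql = f"SELECT UPRN, AddressText FROM AddressSearchIndex WHERE {conditions}"
    # Each term is wrapped in wildcards so it can match anywhere in the string.
    return sql, [f"%{term}%" for term in terms]
```

For the input "4, High Street, westville, wv17", this produces five LIKE conditions whose parameters correspond to the five search terms listed earlier.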

Summary

This guide is intended as an introduction to implementing address search functionality using AddressBase, AddressBase Plus and AddressBase Plus Islands. The following list is a summary of the main points:

A user front-end for an address search may contain a single, search engine style text box or multiple text boxes representing different parts of an address.
A typical address search function takes place in three stages:
- A user enters search text.
- A query is run, returning a set of possible matches.
- The user selects the address of interest and the full record is then returned.
With a structured search interface, the addresses can be queried directly by mapping the various address fields to the text boxes supplied.
For an unstructured (single text box) interface, it is necessary to create an index table with fully formatted address strings against each UPRN. Queries can then be run against this index table by splitting the user input into individual search terms and requiring them all to be present.
It is possible to filter results by status in AddressBase Plus (for example, postal or non-postal).
Any search function should search all forms of an address (both Geographic and Delivery Point Addresses).
Careful consideration should be given to the use of ‘fuzzy’ search algorithms (such as using wildcard or sound-alike searches).

AddressBase Technical Specification

This technical specification provides detailed technical information about AddressBase. It is targeted at technical users and software developers.

AddressBase provides an address product containing both residential and commercial addresses where a Local Authority address has been matched to a Royal Mail PAF address. This allows users to link additional information about a property to a single address. The product also provides enhancements to the Royal Mail PAF data by assigning an X and Y coordinate on British National Grid and an ETRS89 projection, as well as a primary level classification, and a representative point code describing the positional quality.

This technical specification includes the following sections:

All AddressBase products include the and are based on same .

Please see the for additional information that applies across all AddressBase products.

Data formats

The AddressBase product will be distributed as a comma-separated values (CSV) file or Geography Markup Language (GML) version 3.2. Both of these formats can either be supplied as a full supply or a change-only update (COU) supply.

CSV

The CSV supply of AddressBase means:

There will be one record per line in each file.
Fields will be separated by commas.
String fields will be delimited by double quotes.
No comma will be placed at the end of each row in the file.
Records will be terminated by Carriage Return / Line Feed.
Double quotes inside strings will be escaped by doubling.

Where a field has no value in a record, two commas will be placed together in the record (one for the end of the previous field and one for the end of the null field). Where the null field is a text field double quotes will be included between the two commas, for example - , “”,
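These conventions match the defaults of many CSV parsers. As an illustration, the sketch below reads two made-up records with Python's csv module; the column layout is invented and does not reflect the real AddressBase header.

```python
import csv
import io

# Two made-up records following the rules above: quoted strings, doubled
# quotes inside strings, an empty text field ("") and CRLF terminators.
sample = '1,"I","THE ""OLD"" FORGE","",27700\r\n2,"U","HIGH STREET",,27700\r\n'


def read_records(text: str) -> list:
    """Parse CSV text, mapping fields with no value to None."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [[field if field != "" else None for field in row] for row in reader]


records = read_records(sample)
```

When reading a real supply file, the same reader can be used on a file opened with encoding="utf-8" and newline="".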

AddressBase CSV data will be transferred using Unicode encoded in UTF-8. Unicode includes all the characters in ISO-8859-14 (Welsh characters). Some accented characters are encoded differently.

The transfer will normally be in a single file, but the data can be split into multiple files using volume numbers. Most files will only be split where there are more than one million records.

The header row for the CSV is supplied separately and can be downloaded from the product support pages.

GML

The GML Encoding standard is an Extensible Markup Language (XML) grammar for expressing geographical features. XML schemas are used to define and validate the format and content of GML. The XML specifications that GML is based on are available from the World Wide Web Consortium (W3C) website. More information can be found in the Open Geospatial Consortium (OGC) document, Geography Markup Language v3.2.1. The GML 3.2.1 specification provides a set of schemas that define the GML feature constructs and geometric types. These are designed to be used as a basis for building application-specific schemas, which define the data content.

A GML document is described using a GML Schema. The AddressBase schema document (addressbase.xsd), defines the features in AddressBase GML.

The application schema uses the following XML namespaces, for which definitions are available as given here:

Features

Each feature within the AddressBaseSupplySet:FeatureCollection is encapsulated in the following member element according to its feature type:

The UPRN of the feature is provided in the gml:id XML attribute:

<abpl:addressMember>
<abpl:Address gml:id="uk.geoplace.uprn.1000011535314">
………………..
</abpl:Address>
</abpl:addressMember>
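Extracting the UPRN from a gml:id value is a simple string operation. The sketch below assumes that every identifier uses the "uk.geoplace.uprn." prefix shown in the example; uprn_from_gml_id is a hypothetical helper name.

```python
from typing import Optional

GML_ID_PREFIX = "uk.geoplace.uprn."  # prefix as shown in the example above


def uprn_from_gml_id(gml_id: str) -> Optional[int]:
    """Return the UPRN encoded in a gml:id, or None if the format is unexpected."""
    if not gml_id.startswith(GML_ID_PREFIX):
        return None
    digits = gml_id[len(GML_ID_PREFIX):]
    return int(digits) if digits.isdecimal() else None
```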

Envelope

In the GML supply you can determine the extent of your supply by the <gml:Envelope>. For example:

<gml:boundedBy>
<gml:Envelope srsName="urn:ogc:def:crs:EPSG::27700">
<gml:lowerCorner>82643.6 5333.6</gml:lowerCorner>
<gml:upperCorner>655989 657599.5</gml:upperCorner>
</gml:Envelope>
</gml:boundedBy>
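The envelope can be read with any namespace-aware XML parser. The following sketch uses Python's xml.etree.ElementTree on a standalone copy of the fragment above; the namespace declaration has been added to the fragment so that it parses on its own, and read_envelope is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

GML_URI = "http://www.opengis.net/gml/3.2"  # GML 3.2 namespace
NS = {"gml": GML_URI}

fragment = """
<gml:boundedBy xmlns:gml="http://www.opengis.net/gml/3.2">
  <gml:Envelope srsName="urn:ogc:def:crs:EPSG::27700">
    <gml:lowerCorner>82643.6 5333.6</gml:lowerCorner>
    <gml:upperCorner>655989 657599.5</gml:upperCorner>
  </gml:Envelope>
</gml:boundedBy>
"""


def _corner(envelope: ET.Element, name: str) -> tuple:
    """Read a corner element as a tuple of floats."""
    return tuple(float(v) for v in envelope.find(f"gml:{name}", NS).text.split())


def read_envelope(xml_text: str) -> tuple:
    """Return (srsName, lower corner, upper corner) of the first gml:Envelope."""
    root = ET.fromstring(xml_text.strip())
    envelope = next(root.iter(f"{{{GML_URI}}}Envelope"))
    return (
        envelope.get("srsName"),
        _corner(envelope, "lowerCorner"),
        _corner(envelope, "upperCorner"),
    )


srs, lower, upper = read_envelope(fragment)
```

For a full supply file, a streaming parser would be more appropriate than loading the whole document into memory.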

Supply and update

The primary supply mechanism of AddressBase data is referred to as non-geographic chunks. This is a way of dividing up the data into chunks that are supplied in separate volumes, which have a fixed maximum number of records. The supply is not supplied with any reference to the geographic position of records.

Public Sector Geospatial Agreement (PSGA) customers can order Geographic chunks (5km tiles) as well as non-geographic chunks, although geographic chunks are not considered the main form of supply.

All customers are also able to take a complete supply (referred to as a Managed Great Britain Set: MGBS) or an Area of Interest (AOI) as a full supply or a COU supply.

Non-geographic chunks (unzipped)

If you receive your data as non-geographic chunks, the filename will be constructed as follows:

productName_supply_ccyy-mm-dd_vvv.format

Where:

productName is AddressBase.
supply is defined as FULL or COU.
ccyy-mm-dd is the date the file was generated.
vvv is the volume number of the file.
format is the format of the files received, for example, csv or gml.

For example:

AddressBase_FULL_2013-05-28_001.gml (GML full supply)
AddressBase_COU_2013-05-28_001.csv (CSV COU supply)

Non-geographic chunks (zipped)

If the data has been provided in a ZIP file, the filename will be constructed as follows:

productName_supply_ccyy-mm-dd_vvv_format.zip

For example:

AddressBase_FULL_2013-05-28_001_gml.zip (GML full supply zipped)

Geographic chunks (unzipped)

If you receive your data as geographic chunks (PSGA customers only), the filename will be constructed as follows:

productName_supply_ccyy-mm-dd_ngxxyy.format

Where:

productName is AddressBase.
supply is defined as FULL or COU.
ccyy-mm-dd is the date the file was generated.
ngxxyy is the four-digit grid reference belonging to the 1km south-west corner of the 5km chunk.
format is the format of the files received, for example, csv or gml.

For example:

AddressBase_FULL_2013-05-28_NC4040.gml (GML full supply)
AddressBase_COU_2013-05-28_NC4040.csv (CSV COU supply)

Geographic chunks (zipped)

If the data has been provided in a ZIP file, the filename will be constructed as follows:

productName_supply_ccyy-mm-dd_ngxxyy_format.zip

For example:

AddressBase_COU_2013-05-28_NC4040_csv.zip (CSV COU supply zipped)
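The four naming conventions above can be recognised with a single regular expression. The sketch below is an illustration only: it assumes volume numbers are three or more digits and that geographic chunk references are two letters followed by four digits, as in the examples; parse_filename is a hypothetical helper name.

```python
import re
from datetime import date

_FILENAME_PATTERN = re.compile(
    r"^(?P<product>[A-Za-z]+)_(?P<supply>FULL|COU)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<chunk>\d{3,}|[A-Z]{2}\d{4})"  # volume number or grid reference
    r"(?:\.(?P<format>csv|gml)|_(?P<zipformat>csv|gml)\.zip)$"
)


def parse_filename(name: str) -> dict:
    """Split a supply filename into its components."""
    match = _FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised filename: {name}")
    chunk = match["chunk"]
    return {
        "product": match["product"],
        "supply": match["supply"],
        "date": date.fromisoformat(match["date"]),
        "chunk": chunk,
        "geographic": not chunk.isdigit(),
        "format": match["format"] or match["zipformat"],
        "zipped": match["zipformat"] is not None,
    }
```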

COU Supply

AddressBase is available as a full or COU supply.

A COU supply of data contains records or files that have changed between product refresh cycles. The primary benefit of supplying data in this way is that data volumes are smaller, which reduces the amount of data that requires processing when compared to a full supply.

COU data enables a user to identify three types of change:

Deletes (CHANGE_TYPE ‘D’) are objects that have ceased to exist in your AOI since the last product refresh.
Inserts (CHANGE_TYPE ‘I’) are objects that have been newly inserted into your AOI since the last product refresh.
Updates (CHANGE_TYPE ‘U’) are objects that have been updated in your AOI since the last product refresh.

Non-geographic chunked COU

A COU file for non-geographic chunked data can be identified by its naming convention. Any change record will be provided as a full record with the appropriate change type, as listed above.
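Applying a non-geographic COU to an existing holding can be sketched as follows. This is a simplified in-memory illustration that assumes records are dictionaries keyed by column name with UPRN and CHANGE_TYPE columns; apply_cou is a hypothetical helper, and a real implementation would normally work against a database.

```python
from typing import Dict, Iterable, List, Mapping


def apply_cou(
    holding: Dict[str, dict], changes: Iterable[Mapping[str, str]]
) -> List[dict]:
    """Apply change records to a holding keyed by UPRN.

    Returns the records removed by deletes so the caller can archive them.
    """
    removed: List[dict] = []
    for record in changes:
        uprn = record["UPRN"]
        change_type = record["CHANGE_TYPE"]
        if change_type == "D":
            old = holding.pop(uprn, None)
            if old is not None:
                removed.append(old)
        elif change_type in ("I", "U"):
            # Change records are full records, so they replace any old copy.
            holding[uprn] = dict(record)
        else:
            raise ValueError(f"unknown CHANGE_TYPE: {change_type!r}")
    return removed
```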

Geographic chunked COU (tile-based)

A geographic chunked COU is not supplied as per the non-geographic chunked COU outlined above. Its file naming convention can be found above. If a single record has changed within a specified 5km tile, the entire 5km tile containing all features will be supplied. This means the user will need to remove all features that previously existed in the provided tile(s) and insert the entire new tile(s) in its place.
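The tile replacement described above can be sketched as follows. The example assumes the holding is organised in memory as a mapping from tile reference to the records in that tile; replace_tiles and the Holding alias are hypothetical names used for illustration.

```python
from typing import Dict, List

# Maps a tile reference (for example "NC4040") to {UPRN: record}.
Holding = Dict[str, Dict[str, dict]]


def replace_tiles(holding: Holding, supplied: Dict[str, List[dict]]) -> None:
    """Swap every supplied tile into the holding, discarding its old features."""
    for tile_ref, records in supplied.items():
        holding[tile_ref] = {record["UPRN"]: record for record in records}
```

Tiles that are not present in the COU supply are left untouched.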

Archiving

When users are deleting, inserting or updating features, it is up to the user to consider their archiving requirements. If deleted records are important to your business requirements, you must take appropriate action to archive previous records.

AddressBase structure

AddressBase is structured as a flat file. The data structure in this document is described by means of Unified Modeling Language (UML) class diagrams.

The AddressBase product is constructed as per the following UML diagrams.

Model overview CSV

AddressBase CSV

Definition: This address record follows the lifecycle of a Postcode Address File (PAF) record matched to a Local Authority record. As a matched record is inserted, deleted and updated within PAF, these changes are incorporated into the AddressBase product. Similarly, if the matched Local Authority address record updates an attribute contained within the AddressBase product, this change will be reflected.

The UML model of AddressBase in CSV format can be seen in the UML diagram below; classes from the Ordnance Survey product specification are coloured orange; all code lists are coloured blue, while enumerations are coloured green.

UML model of AddressBase in CSV format

Model overview GML

AddressBase GML

The UML model of AddressBase in GML format can be seen in the diagram below. In the UML diagram, classes from the Ordnance Survey product specification are orange, all code lists are coloured blue and enumerations are green.

UML model of AddressBase in GML format

Working with CSV data

Preparing the CSV data

These instructions describe how to prepare the CSV format of AddressBase, AddressBase Plus and AddressBase Plus Islands data for processing.

Downloading header files

Merging multiple CSV files

Unzip all the CSV files into a single folder. Ensure there are no spaces in your chosen folder path, for example: C:\AddressBase_Data or C:\AddressBase_Plus_Data or C:\AddressBase_Plus_Islands_Data.
We recommend merging all the CSV files together to save time importing individual files. You can do this manually using a text editor such as Notepad or TextPad, but it is much faster to use a .bat batch file or an MS-DOS command as described below.
To use the .bat batch function, copy the following text and paste it into a new Notepad document:

copy *.csv mergedABdata.csv

In this example, mergedABdata.csv is the output name of the merged file which will be created, but this can be any user-defined filename with the extension .csv.
Save the Notepad document with the file extension .bat (for example, mergedABdata.bat) in the same directory as the CSV files unzipped previously (for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data).
Close the .bat file and navigate to the directory where you just saved it. Double-click on the .bat file (for example, mergedABdata.bat) and an MS-DOS window will run. Once the process is complete, the MS-DOS screen will close automatically.
If you look in the directory containing the AddressBase CSV files and batch file, you’ll see that there is now an additional single file called mergedABdata.csv (or the user-defined filename you picked when creating your batch file).

Appending a header file to the CSV

Download and save the appropriate product CSV header file into the same folder as the merged CSV file (for example, mergedABdata.csv) created in Merging multiple CSV files.
For AddressBase data, copy the relevant text below and paste it into a new Notepad document:

copy addressbase-header.csv+ mergedABdata.csv AB_Data.csv

For AddressBase Plus and AddressBase Plus Islands data, copy the relevant text below and paste it into a new Notepad document:

copy addressbase-plus-header.csv+ mergedAB_Plusdata.csv AB_Plus_Data.csv
copy addressbase-plus-header.csv+ mergedAB_Plus_Islands_data.csv AB_Plus_Islands_Data.csv

These examples use the name mergedABdata.csv or mergedAB_Plusdata.csv as the file that contains the AddressBase data merged into a single CSV file created above. If you have named this something else, amend the text above accordingly. The order in which the files are referred to in the above text is also important, as it states which file is appended to the other. In this instance, the header CSV file comes first so that the column headers are the first line of the final AddressBase file and the merged data is appended to the column headers.
Save the above Notepad document with the file extension .bat (for example, append.bat) in the same directory as the column headers and the merged AddressBase data (for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data or C:\AddressBase_Plus_Islands_Data).
Close the .bat file and navigate to the directory where it was saved to (for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data). Double-click on the new .bat file (for example, append.bat) and an MS-DOS window will open. Once the process is complete, the MS-DOS screen will close automatically.
Navigate to the directory where the column headers and the merged AddressBase data are located. You will see that a new CSV file has been created, which is the merged column headers and AddressBase data (for example, AB_Data.csv or AB_Plus_Data.csv).
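Where a batch file is not convenient (for example, on a non-Windows system), the merge and header steps can be combined in a short script. The following Python sketch is one possible approach, not part of the product tooling; merge_with_header is a hypothetical helper and the file names passed to it are up to the user.

```python
import shutil
from pathlib import Path
from typing import Iterable


def _needs_newline(path: Path) -> bool:
    """Return True if a non-empty file does not end with a line feed."""
    if path.stat().st_size == 0:
        return False
    with path.open("rb") as handle:
        handle.seek(-1, 2)  # move to the last byte of the file
        return handle.read(1) != b"\n"


def merge_with_header(header: Path, chunks: Iterable[Path], output: Path) -> None:
    """Write the header file followed by each chunk, in order, to output."""
    with output.open("wb") as out:
        for source in [header, *chunks]:
            with source.open("rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large files are fine
            if _needs_newline(source):
                out.write(b"\r\n")  # keep the next file's first record separate
```

The output file should not be one of the chunk files; if the chunks are found with a wildcard such as *.csv, write the output to a different folder or exclude it from the list.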

Loading CSV into GIS software

Loading CSV into ArcGIS Pro

Note - These instructions are based on ArcGIS Pro version 2.3.3.

Note - When using CSV data in ArcGIS Pro, it is necessary to have column headings. Please ensure that headings have already been prepared as instructed in Preparing the CSV data.

Launch ArcGIS Pro and start a new blank project.
Select a folder to save the project to.
Name your project and click OK. The project will then be created. Note - ArcGIS Pro automatically creates a new File Geodatabase (.gdb) within the project folder created. This is different to the creation process in the older ESRI application ArcMap.
You can add a backdrop map for contextual purposes from the available backdrop maps supplied by ESRI or add one of your own from a different File Geodatabase. In this example, we have added a light grey backdrop map canvas supplied by ESRI.
Open the Catalog pane on the right-hand side of the window and expand the listing to see the File Geodatabase created with the project.
To import the AddressBase or AddressBase Plus data, right-click the File Geodatabase, then select Import and from that sub-menu, select Table. A new Geoprocessing window will display in the right-hand pane.
Click the folder icon on the right-hand side of the Input Rows field. A new dialog will open.
Navigate to the location with the merged AddressBase or AddressBase Plus CSV file with the appended headers that you created in Preparing the CSV data. Select the file and click OK.
Back in the Geoprocessing window, type a name in the Output Name field, then click Run at the bottom of the window.
Once the process has run, a green box will display at the bottom of the Geoprocessing window and the new AddressBase table will be listed in the left-hand panel.
The data has loaded as a non-geometry table.
To make the data visible against the mapping backdrop, the XY Coordinate fields need to be specified.
- In the Contents pane, right-click the AB_Plus table (or the output name you chose) and in the dropdown click Display XY data.
In the Geoprocessing window, the XY Table To Point parameters will be displayed.
- Using the dropdown options, change the X Field to X_COORDINATE or Longitude and the Y Field to Y_COORDINATE or Latitude.
- Change the Coordinate System to British_National_Grid by clicking the globe icon.
- Then select Projected Co-ordinate Systems > National Grids > Europe > British National Grid. Note – If you selected X and Y as Longitude and Latitude in the step above, then you need to select ETRS89 [EPSG: 4258] instead.
Click Run.
Once the process has run, a green box will appear at the bottom of the Geoprocessing window and the output XYTableToPoint map layer should appear ticked on the left-hand Contents pane. In the Map window, the addresses will now be displayed as point features.
You have now successfully loaded the data in ArcGIS Pro.

Loading CSV into ArcGIS Desktop

Note - These instructions are based on ArcGIS Desktop versions 9.3 and 10.

Note - When using CSV data in ArcGIS, it is necessary to have column headings. Please ensure that headings have already been prepared as instructed in Preparing the CSV data.

Launch ArcCatalog as a separate program, or within ArcMap if you are using version 10.
Connect to a folder where the AddressBase data you wish to use can be accessed, for example, C:\AddressBase_Data or C:\AddressBase_Plus_Data. To do this:
- Click File, or select Folder Connections if you are using version 10.
- Click Connect Folder, or in version 10, right-click on Folder connections > Connect Folder and navigate to the relevant folder.
- From the main window, select the folder to connect to and click OK.
The folder should now appear in the navigation window to the left of the screen, or within your Catalog window if you have opened it within ArcMap.
Create a File Geodatabase to store the address data. Using the file tree, go to folder connections and navigate to the directory where you wish to create the File Geodatabase, for example: C:\AddressBase_Geodatabase\AddressBase_Plus. This may need to be set up as a new connection as per the above.
Right-click on the folder where the File Geodatabase should be contained, then select New and File Geodatabase.
A File Geodatabase will be created and named by default as New File Geodatabase. Rename the File Geodatabase to a name of your choice.
Right-click on your new File Geodatabase, and select Import > Table (single)…
- For Input Rows, navigate to the location of the CSV data file that contains the merged header and AddressBase or AddressBase Plus data file.
- The Output Location should automatically populate with the location of the File Geodatabase that is to be updated; this should be the File Geodatabase you created above.
- Insert a relevant name for the Output Table, for example: AddressBase_data. Ensure that there are no spaces in the table name. This name will appear under your geodatabase.
Click OK.
To create a map of the locations of the AddressBase records, they need to be geocoded.
- Right-click on the AddressBase table in the geodatabase that you have just created and select Create Feature Class.
- In the XY Table… window, you can use the dropdowns to change the X Field to either X_COORDINATE or Longitude, and the Y Field to Y_COORDINATE or Latitude.
- Click on the Input Coordinates icon and navigate to Projected Co-ordinate Systems > National Grids > Europe > British National Grid. Note – If you selected X and Y as Longitude and Latitude in the step above, then you need to select ETRS89 [EPSG: 4258] instead.
Double-click on the chosen Coordinate System, then click Apply and OK.
Click on the folder icon alongside the Output field and navigate to the File Geodatabase you just created above. If you cannot see the File Geodatabase, ensure that the Save as type box at the bottom of the dialog box is set to File and Personal Geodatabase feature classes.
Type in a name for it and click Save.
Leave the Configuration keyword dropdown menu as DEFAULTS. Click OK. Note – You may need to right-click on the Personal Geodatabase where it was saved and select Refresh in order to see your points. At this stage, if you have completed the steps above in ArcCatalog and not within ArcMap, please continue to follow the steps below. Otherwise, if you have been using version 10 with the catalog inside ArcMap, the data can now be loaded into ArcMap.
In ArcMap, select File > Add Data and navigate to the folder where the File Geodatabase was created above.
Double-click on the File Geodatabase to open it, then select all the files inside.
Click Add.
Once the data has been loaded into ArcMap, you may wish to display more than the ESRI-defined Object ID in the Info tool. To change this:
- Double-click on the spatial dataset.
- Select the Fields tab.
- Change the Primary Display Field to your desired field, for example, UPRN.

Loading CSV into MapInfo Pro

Note - These instructions are based on MapInfo Pro version 12.

Note – MapInfo has a size limit of 2GB on each table. This equates to a maximum of approximately 4 million AddressBase records.

Launch MapInfo.
Cancel the Quick Start prompt.
Click File > Open and navigate to the folder that contains the AddressBase data.
In the Files of Type dropdown menu, select Comma delimited CSV (*.csv), then click on the AddressBase data to be loaded. Click Open.
In the next window, tick the Use First Line for Column Titles box and select the character set INSERT CHARACTER SET. Click OK. Note – When adding data this way, the field type classifications and field sizes of each column automatically try to fit the type of data that MapInfo believes is contained within the column and the largest value of that classification found within that column. This means that the classifications and field sizes of some attributes may not match the field types and sizes stated in the Technical Specification. The following instructions outline how to change these columns to match those values:
Select File > Save Copy As… and select the AddressBase table that was loaded. Select Save As… and name the table to be created, then click Save.
Open the table that was just created via File > Open. Navigate to and select the copy of the table you just named. Click Open.
Navigate to Table > Maintenance > Table Structure and select the table to be edited. Click OK.
Here you can change the Type and Width of each attribute to match the ones stated in the technical specification:
Type and Width should be changed for all attributes, apart from the following (due to software-specific dependencies):
- UPRN should be classified as Float.
- All attributes that have a Field Type of Date in the technical specification should be classified as Character with a length of 10.
After all changes have been made, click OK.
To create a map of the location of the AddressBase records, they need to be geocoded:
- Ensure the table of AddressBase records that you wish to geocode is open, then navigate to Table > Create Points.
- Select the table you wish to geocode from the Create Points for Table dropdown menu.
- Expand the Get X Coordinates from Column dropdown menu and select either X_Coordinate or Longitude.
- Expand the Get Y Coordinates from Column dropdown menu and select either Y_Coordinate or Latitude.
- Click on the Projection icon, then select the British Coordinate Systems option from the Category dropdown menu. Select the British National Grid [EPSG: 27700], or if you selected Longitude and Latitude in the steps above, select ETRS89 [EPSG: 4258].
- Click OK to close that window and OK again to close the next window.
- Finally, click Window > New Map Window to view the loaded geocoded points.

Loading CSV into QGIS

Note - These instructions are based on QGIS version 2.6.

Launch QGIS and click Settings > Options.
Select CRS from the left-hand menu and check that the Coordinate Reference System is set to British National Grid. Note - Check this is set for both Default CRS for new projects and the CRS for new layers sections. If these are not already set, click Select at the end of each section and type 27700 into the Filter Box to find and select British National Grid. Alternatively, if you intend to use Latitude and Longitude columns, select ETRS89 [EPSG: 4258].
Click OK.
Back in the QGIS UI, go to Layer and select Add Delimited Text Layer.
Click Browse next to the filename and locate the CSV file that was created in Preparing the CSV data, containing the merged header files and AddressBase data.
Select the CSV file and click Open.
Accept the default or create a new layer name for the dataset.
Ensure that the First record has field names box is ticked.
For Field Options, select Decimal separator is comma.
For Geometry Definition, select Point Coordinates.
You should now be able to select the X_Coordinate field for the X Field dropdown and the Y_Coordinate field for the Y Field dropdown if this was not done automatically. Alternatively, if you wish to use the Latitude and Longitude columns, select the Longitude column for the X Field and the Latitude column for the Y Field.
Click OK.
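Before loading, it can help to confirm that the columns chosen for the X and Y fields really hold British National Grid values rather than latitude and longitude. The following is a minimal Python sketch of such a check, not part of the QGIS workflow itself. The function name, the sample rows and the extent limits (0 to 700,000 m east, 0 to 1,300,000 m north) are assumptions for illustration; adjust the column names to match your header row.

```python
import csv
import io

# Approximate extent of the British National Grid in metres. This is an
# assumption used only as a sanity check, not an official validation rule.
BNG_MAX_EASTING = 700_000
BNG_MAX_NORTHING = 1_300_000


def rows_outside_bng(csv_text, x_field="X_COORDINATE", y_field="Y_COORDINATE"):
    """Return the 1-based data row numbers whose coordinates are missing,
    non-numeric, or outside the approximate grid extent."""
    bad_rows = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        try:
            x = float(row[x_field])
            y = float(row[y_field])
        except (KeyError, TypeError, ValueError):
            bad_rows.append(row_number)
            continue
        if not (0 <= x <= BNG_MAX_EASTING and 0 <= y <= BNG_MAX_NORTHING):
            bad_rows.append(row_number)
    return bad_rows


# Hypothetical sample: the second data row holds longitude/latitude by mistake.
sample = (
    "UPRN,X_COORDINATE,Y_COORDINATE\n"
    "1,437293.00,115542.00\n"
    "2,-1.40,50.93\n"
)
print(rows_outside_bng(sample))
```

If rows are reported, check whether the Latitude and Longitude columns were selected while the layer CRS is still set to British National Grid.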

Loading CSV into a database

This section describes how to load AddressBase products into a few common databases.

Software dependencies:

UPRN deletions:

A UPRN may be deleted and later reintroduced for the following reasons:

The record has changed location more than once: it moved out of your Area of Interest (AOI), which caused the deletion, and later moved back into your AOI. This can also occur if you alter your AOI.
A record has failed data validation when a change was made. Depending on the change, this can result in the record being deleted and then reintroduced once the data supplier has fixed the error.

If a UPRN is deleted, it will not be reallocated to a different property and it therefore remains the unique identifier for a property.

Loading CSV into a PostgreSQL database

Note - These steps describe how to load AddressBase into a PostgreSQL database using the text files created by following the instructions in Preparing the CSV data to merge the CSV files.

Prepare the text files as described in Preparing the CSV data.
Check that there are no carriage returns (extra rows) at the end of the CSV output file as this will result in errors. To do this, open the CSV file and hit End on your keyboard. Your cursor should now be at the end of the last line, and not on any extra line below. If it is on the line below, hit Delete to remove the extra empty row.
Open the pgAdmin tool (this can be found in Windows Start Menu > PostgreSQL).
Either connect to an existing database or create a new database. It is recommended that the encoding is set to UTF-8.
Open the public schema (although in a production environment, it is advised to use a different schema) and create the tables using the following steps:
- Open the SQL query tool.
- Depending on the data to be loaded, download the SQL file from either the AddressBase or AddressBase_Plus_and_Island folder on: https://github.com/OrdnanceSurvey/AddressBase/tree/master/Loading_Scripts/PostgreSQL.
- This SQL file can be opened in a text editor, and the SQL scripts within it can be copied and pasted into the SQL query tool within PostgreSQL.
Once the tables have been created, the data can be loaded into each table using the SQL COPY command. Add the CSV HEADER option, as the first line of each file contains a header record. The examples below are for AddressBase, AddressBase Plus and AddressBase Plus Islands, respectively. The path and filename may need to be changed to reflect your data set-up:
COPY addressbase FROM 'C:/Address/AddressBase.csv' DELIMITER ',' CSV HEADER;
COPY addressbase_plus FROM 'C:/Address/AddressBase_Plus.csv' DELIMITER ',' CSV HEADER;
COPY addressbase_plus_islands FROM 'C:/Address/AddressBase_Plus_Islands.csv' DELIMITER ',' CSV HEADER;
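The manual check for extra rows at the end of the CSV file, described in the steps above, could also be scripted. The following is a minimal Python sketch with hypothetical function names and sample data. It treats any line break after the last record as an extra row, which is one reading of the "cursor at the end of the last line" test.

```python
def has_trailing_blank_line(text):
    """True if the text ends with a line break, meaning the cursor would
    sit on an extra line below the last record."""
    return text.endswith(("\n", "\r"))


def strip_trailing_blank_lines(text):
    """Remove every line break after the last record; earlier line endings
    are left untouched."""
    return text.rstrip("\r\n")


# Hypothetical sample with Windows line endings and one empty row at the end.
sample = "UPRN,POSTCODE\r\n1,AB1 2CD\r\n\r\n"
cleaned = strip_trailing_blank_lines(sample)
print(has_trailing_blank_line(sample), has_trailing_blank_line(cleaned))
```

For a real file, read it with open(path, newline="") so that the line endings are not translated, and write the cleaned text to a copy rather than overwriting the supplied data.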

Once loaded, you may want to add Primary Keys to the data. However, these can only be added on columns where the data values are unique. Where there are no unique data values, an index may be added which will aid searching. The UPRN provides the only unique value in AddressBase and AddressBase Plus. Primary Keys are added using the following steps:

Right-click on the table name and select Properties.
Select the Constraints tab.
Click the + to add a new primary key.
Click the edit icon.
Enter a name to call the key under the general tab (for example, Key1).
Under the definition tab, select UPRN or any other unique value from the dropdown under columns.

Click Save.
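Because a primary key can only be added to a column whose values are unique, it may be worth checking the CSV for duplicates before loading. The following is a minimal Python sketch; the function name and sample rows are hypothetical, and the column name UPRN is assumed to match your header row.

```python
import csv
import io
from collections import Counter


def duplicate_values(csv_text, column="UPRN"):
    """Return the sorted values that appear more than once in the column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample in which UPRN 100 appears twice.
sample = "UPRN,POSTCODE\n100,AB1 2CD\n200,AB1 2CE\n100,AB1 2CF\n"
print(duplicate_values(sample))
```

An empty list suggests the column is suitable for a primary key. Any values returned would cause the primary key constraint to fail.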

You can also index the data by following these steps:

Right-click on the table name and select Create > Index.
Under general, enter a name (for example, Idx1).
Under the definition tab > Columns, click the +.
Select the column to index, for example UPRN.
Click Save.

Converting coordinates to geometry

Add a geometry column called geom to make the data usable in a GIS:
SELECT AddGeometryColumn('public', 'addressbase_plus', 'geom', 27700, 'POINT', 2);
Load the data into your new geometry column:
UPDATE public.addressbase_plus SET geom = ST_GeomFromText('POINT(' || x_coordinate || ' ' || y_coordinate || ')', 27700);
This sets the geom column in the table to equal the values from the X_coordinate and Y_coordinate columns, with the spatial reference defined as 27700.
Create a spatial index on the data using:
CREATE INDEX idx_abp_geom ON public.addressbase_plus USING gist(geom);
This creates an index named idx_abp_geom on the geom column of the same table.
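The UPDATE statement above builds a well-known text (WKT) string for each row by concatenating the two coordinate columns. The following minimal Python sketch shows the text that this concatenation produces. The function name and the sample coordinates are hypothetical.

```python
def point_wkt(x_coordinate, y_coordinate):
    """Build the WKT point text in the same way as the SQL concatenation:
    'POINT(' + x + ' ' + y + ')'."""
    return f"POINT({x_coordinate} {y_coordinate})"


# Hypothetical British National Grid coordinates, as they appear in the CSV.
print(point_wkt("437293.00", "115542.00"))
```

The X (easting) value comes first and the Y (northing) value second, separated by a space rather than a comma.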

Loading CSV into an Oracle database

With SQLLDR, it is not necessary to merge all the AddressBase files into a single file; the data can be loaded directly from the files provided, as long as they have been unzipped first.

The following steps describe one method for loading a full supply of the data. Sections in italics denote where changes will need to be made to accommodate local file naming.

Copy the data files from the disk to an appropriate location. The files will need to be unzipped, so you will need in the region of 43 GB of free space.
Once the data is copied, the next stage is to unzip the *.zip files to *.csv. This can be done using a package such as WinZip or 7-Zip. Please see the data supply page for more information.
With all the files unzipped, the later stages are easier if you create a list of all the CSV files to be loaded. This can be done using a batch file that writes all the file names out to a text file:
dir *.csv /b/s >filelisting.txt
pause
This file will form the basis for populating the control file in a later step.
Go to the OS GitHub repository: https://github.com/OrdnanceSurvey/AddressBase/tree/master/Loading_Scripts/Oracle.
Open the folder of your chosen product and you should see three files. Open the file ending createtable.sql in a text editor.
Within the provided SQL there are references to <TablespaceName>, which need to be changed to the tablespace that is being worked in. When these are changed, copy and paste the SQL into Oracle to create the tables.
Next, create a SQLLDR control file. An example of one of these files is Oracle AddressBase_Control.ctl, which is provided in the folder of the GitHub repository in Step 4 above. Open the SQLLDR control file for your chosen product in a text editor.
Within the file you will see lines referencing INFILE. Populate these INFILE lines with the file listing created in Step 3, with one INFILE command for each file. This tells the process to open each of the files and carry out the other tasks listed below it. Note - The last section of the control file creates the Geometry for the X and Y coordinates (British National Grid). If you want to create a Geometry for the Latitude and Longitude values, this will need to be created separately.
Once this file is created, it can be called from a .bat file to run it on the box that holds the database rather than a remote machine. If you wish to run it from a remote machine, contact your Oracle Administrator who will be able to advise on the best way to do this within your environment. The contents of the .bat file should be similar to the following:
@sqlldr <username>/<password>@<service name> control=<name of ctl file created previously>
pause
Once the load has completed, the relevant indexes need to be built. The SQL statements to create the indexes can be found in the same GitHub repository linked in Step 4 above. As before, you can copy and paste the SQL statements from a text editor into Oracle to create the indexes. The example table name provided may be different to yours, so check if this needs to be changed before use.
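Writing one INFILE line per file by hand is tedious for a full supply, so the file listing from the earlier step can be converted automatically. The following is a minimal Python sketch. The function name and sample paths are hypothetical, and whether the paths need further quoting or escaping depends on your environment, so compare the output with the control file provided in the repository before using it.

```python
def infile_lines(file_listing_text):
    """Turn a file listing (one path per line) into one INFILE line per file,
    skipping any empty lines."""
    paths = [line.strip() for line in file_listing_text.splitlines() if line.strip()]
    return [f"INFILE '{path}'" for path in paths]


# Hypothetical contents of filelisting.txt as written by the batch file.
listing = "C:\\Address\\AddressBase_001.csv\nC:\\Address\\AddressBase_002.csv\n"
for line in infile_lines(listing):
    print(line)
```

The printed lines can then be pasted over the INFILE lines in the control file.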

Loading CSV into Microsoft SQL Server

Note - The following instructions assume that users have basic knowledge of Microsoft SQL Server and that the CSV data is already prepared as described in Preparing the CSV data.

Note – There are many ways to load AddressBase products into Microsoft SQL Server; this is just one suggested method for guidance.

Open the SQL Server Management Studio (SSMS).
Right-click on the database you are loading into and select Properties.
Select Options on the left-hand side.
Expand the dropdown box for Recovery Model and select Bulk-logged. This minimises the logfile size; with the default logging, Microsoft SQL Server logfiles can grow to over 20 GB, which can cause issues with loading.
Back in SQL Server Management Studio (SSMS), right-click your database in the left-hand panel.
Navigate to Tasks and click Import Data. This will open the SQL Server Import and Export Wizard.
Click Next.
On the next screen, change your Data Source to Flat File Source.
Use the Browse button to navigate to your CSV file and select it. If you cannot see your files, ensure that the bottom right dropdown box has CSV files (*.csv) selected.
Click Open.
Your CSV file should already have a header row, prepared as described in Preparing the CSV data. Ensure that the Column names in the first data row box is ticked.
Check that the Text Qualifier is set to a double quote ("). This is to make sure that the quotations in the raw data supply are removed upon loading but that the data remains intact.
On the left-hand side of this screen, select Columns and check that the Column delimiter is set to Comma.
On the left-hand side of the screen, select Advanced.
For each column of data you are loading, you will need to specify a Data Type. The Microsoft SQL Server loader defaults each column to a String. The correct Data Type for each column is given in the technical specification.
Once you have changed the Data Types for each column to match those given in the technical specification, click Next.
Check that your table is going to be imported into the correct database and click Next.
On this screen, you can edit the default table name that Microsoft SQL Server has chosen by clicking in the destination box. For example, for AddressBase Plus, rename it to [dbo].[ADDRESSBASE_PLUS].
Select Edit Mappings in the bottom right-hand corner.
In the new window, you must remove the tick in the Nullable checkbox against the UPRN column, because UPRN needs to be the Primary Key of the table and a Primary Key column cannot contain null values. Click OK once the Primary Key alterations have been completed.
Click Next. On this screen, you can check that the Source column and Destination columns are correct.
Click Next. A summary of your import will display. If you want to continue, click Finish.
A report will be generated as your data is imported. Success should appear at the top once complete.
You may need to right-click on your database and click Refresh to see your new table listed.
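The effect of the Text Qualifier and Column delimiter settings chosen in the steps above can be illustrated outside SQL Server. The following minimal Python sketch uses the standard csv module with a double quote as the qualifier. The sample column names and values are hypothetical.

```python
import csv
import io

# Hypothetical raw supply: every value is wrapped in double quotes and one
# value contains a comma.
raw = '"UPRN","ORGANISATION"\n"100","Smith, Jones and Co"\n'

# quotechar plays the role of the Text Qualifier; delimiter is the Column delimiter.
rows = list(csv.reader(io.StringIO(raw), delimiter=",", quotechar='"'))
print(rows)
```

The quotes are removed from each value, while the comma inside the quoted organisation name is kept as data rather than being treated as a column break.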

Setting Primary Keys

alter table dbo.ADDRESSBASE_PLUS add primary key ([UPRN]);

Creating the point geometry

You can also create point geometry using the X and Y coordinates or the Latitude and Longitude coordinate values. This is achieved by running the following SQL statement:

alter table dbo.ADDRESSBASE_PLUS
add geometry_column as geometry::Point([X_Coordinate],[Y_Coordinate], 27700);