AddressBase data is an addressing gazetteer that can be used within GIS and database systems. For details of Ordnance Survey’s licensed partners, who can incorporate the AddressBase products in their systems, please see the systems/software page on the Ordnance Survey website.
Ordnance Survey does not recommend either suppliers or software products as the most appropriate system depends on many factors, such as the amount of data being taken, resources available within the organisation, the existing and planned information technology infrastructure and the applications that AddressBase products can be used for.
However, as a minimum, the following elements will be required in any system:
A means of reading the data, either in its native format or by translating it into another file format or loading it into a database.
A means of storing and distributing the data, perhaps in a database or through a web-based service.
A way of visualising and querying the data, typically a GIS.
You are advised to copy the supplied data to a backup medium.
For faster read performance, it is recommended to store the data on a single hard disk. Unzipped file sizes for the full supply of each product are as follows:
Product | Unzipped CSV file size | Unzipped GML file size |
---|---|---|
AddressBase | 6 GB | 32 GB |
AddressBase Plus | 16 GB | 78 GB |
AddressBase Plus Islands | 450 MB | 2 GB |
GML is an XML dialect which can be used to model geographic features. It was designed by the Open Geospatial Consortium (OGC) as a means for people to share information regardless of the applications or technology that they use.
In the first instance, GML was used to overcome the differences between different GIS applications by providing a neutral file format as an alternative to proprietary formats. Because it is independent of applications, it can also be moved between databases or other types of application, which allows a wider application than just GIS data transfer.
GML data can be viewed and loaded into a database using software such as Safe FME: https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/gml/gml.htm
These instructions describe how to prepare the CSV format of AddressBase, AddressBase Plus and AddressBase Plus Islands data for processing.
These instructions describe how to load the CSV format of AddressBase, AddressBase Plus and AddressBase Plus Islands data. In these examples, AddressBase Plus data will be used to describe the procedures in various GI systems.
It is assumed that the preparation of the AddressBase, AddressBase Plus or AddressBase Plus Islands CSV data has been carried out as instructed in Preparing the CSV data before attempting to load the data. If it has not been done, the full set of data will not load, and data loaded will not contain header information.
AddressBase, AddressBase Plus and AddressBase Plus Islands are also available from Ordnance Survey as a supply in GML format. Loading GML into most GIS applications requires the use of third-party translation software, which is not covered in this guide. If you need more information on loading the GML format, please contact Ordnance Survey.
This section describes how to load AddressBase products into a few common databases.
ArcMap, ArcGIS Desktop and ArcGIS Server software do not support the BIGINT/NUMBER data type as an Object ID. Bear this in mind if the expectation is to use this data type directly with these ESRI products. An alternative method to facilitate using ESRI software is to store this data as a string and add a new Serial ID to act as the Object ID.
If you are loading AddressBase data directly into a database, you may need to increase the column length to accommodate language characters such as '^'. Some databases treat this as an additional character and therefore, if you define the column length according to our specification, there is a chance that the load may fail. Please bear in mind such adjustments may be required depending on the database you use to load the data.
If a UPRN is deleted and then reinserted, this does not compromise the integrity of the UPRN and its use as a primary key. If a delete is issued for a UPRN, this does not mean it will not reappear in subsequent supplies.
These are the reasons why this may happen:
The record has changed location more than once: it first moved out of your Area of Interest (AOI), which caused the deletion, and later moved back into your AOI. This would also occur if you altered your AOI.
A record has failed data validation upon a change being made. This can result, dependent on the change being made, in the record being deleted and then reintroduced when the error is fixed by the data supplier.
If a UPRN is deleted, it will not be reallocated to a different property and it therefore remains the unique identifier for a property.
When you receive an order via hard media (DVD), the following files will be supplied for the contracted area of interest (AOI):
Data
Doc
Order_Details.txt
Within the Data directory, data files will be found in their compressed format.
Within the Doc folder, a text file called Label Information.txt will contain information that is printed on the DVD.
The Order_Details text file will provide information about the order, including the order date, currency date and file structure.
When you receive an order of a Managed Great Britain Set (MGBS) via hard media (DVD), the following files will be supplied:
Data
Doc
Resources
readme.txt
There are several items contained within your supply:
Data folder – This folder contains all of your data supply.
Doc folder – This folder contains the Medialis.txt file, which outlines the contents of the data you have been supplied.
Resources folder – This folder contains lookup tables for the local custodian code and AddressBase classification scheme as well as the Header files for the product.
The readme text file – This document provides guidance notes on matters such as the filename referencing used and the directory structure of the DVD.
Public Sector Geospatial Agreement (PSGA) customers can download their geographic chunk data for AddressBase and AddressBase Plus as well as a full supply of AddressBase Plus Islands via our download service.
The data is supplied as chunked files that cover your selected area. These files are named according to the convention shown below.
When you open your data, you will see a series of zip folders:
Using AddressBase Plus and Islands as an example:
AddressBasePlus_FULL_2020-01-21_001_csv.zip
(Full supply of GB CSV)
AddressBasePlus_ISL_FULL_2020-01-21_001_csv.zip
(Full supply of Islands CSV) or
AddressBasePlus_COU_2020-01-21_001_gml.zip
(COU supply of GB GML)
AddressBasePlus_ISL_COU_2020-01-21_001_gml.zip
(COU supply of Islands GML)
Using AddressBase Plus as an example:
AddressBasePlus_FULL_2011-07-29_TQ2020_csv.zip
(Full supply of CSV) or
AddressBasePlus_COU_2011-07-29_TQ2020_gml.zip
(COU supply of GML)
The AddressBase Plus Islands product is not available in geographic chunks.
The GML and CSV data is supplied in a compressed form (ZIP). Some software can access these files directly, while other software will require the files to be unzipped.
To unzip the zipped data files (.zip extension), use an unzipping utility found on most PCs, for example, WinZip. Alternatively, open-source zipping/unzipping software can be downloaded from the Internet, for example, 7-Zip.
When you unzip the files, the data will be extracted as CSV files, which are ready to use. For example, unzipping AddressBase Plus will extract files similar to the chunks below:
AddressBasePlus_FULL_2020-01-21_001.csv
AddressBasePlus_ISL_FULL_2020-01-21_001.csv
AddressBasePlus_2011-07-29_NC4040.csv
This getting started guide provides instructions for using AddressBase in different software applications. Users with limited technical knowledge will be able to follow this guide.
These instructions show you how to get started with AddressBase, AddressBase Plus and AddressBase Plus Islands.
AddressBase products are created by bringing together different address sources:
Local Authority Gazetteers across Great Britain, Northern Ireland, the Channel Islands and the Isle of Man
Royal Mail PAF data
References to Valuation Office Agency (VOA) data
Additional addresses and coordinates from Ordnance Survey
The data is supplied as comma-separated values (CSV) or Geography Markup Language (GML).
This getting started guide shows you how to obtain a data supply, load and work with AddressBase data. It includes the following sections:
All the AddressBase products are available as a full supply or a COU. A COU means you will only be supplied with the features which have changed since your last supply. The following sections provide guidance on how you could potentially manage a COU supply of AddressBase and AddressBase Plus data.
If you receive a tile supply, you will receive Change Chunks. This means if a record within your tile has changed, all of the records in that tile will be provided to you as inserts, and no updates or deletes will be issued.
Tiles are only available for GB supplies, so this does not apply to AddressBase Plus Islands.
At a high level, there are three types of change found within a COU:
Deletes (CHANGE_TYPE ‘D’) are objects that have ceased to exist in your AOI since the last product refresh.
Inserts (CHANGE_TYPE ‘I’) are objects that have been newly inserted into your AOI since the last product refresh.
Updates (CHANGE_TYPE ‘U’) are objects that have been updated in your AOI since the last product refresh.
The diagram below shows how to implement an AddressBase, AddressBase Plus and AddressBase Plus Islands COU within a database.
Before a COU is applied, there may be a business requirement to archive existing address records. The table below shows how to implement archiving with an AddressBase COU within a database.
Within AddressBase and AddressBase Plus there will be no records with the same UPRN. This can be tested by checking the number of records that have the same UPRN. The following SQL code would notify you of any duplicates:
This query should return 0 rows, and this confirms that there are no duplicates. As there are no duplicate records, we can use the UPRN to apply the COU.
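A minimal sketch of such a duplicate check is shown below. It uses Python's built-in sqlite3 module purely so the SQL can be run without a database server; the table name `addressbase`, the reduced column set and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table holding a tiny AddressBase-style sample.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addressbase (uprn INTEGER, postcode TEXT)")
conn.executemany(
    "INSERT INTO addressbase VALUES (?, ?)",
    [(1, "SE99 9EX"), (2, "SE99 9EY"), (3, "SE99 9EZ")],
)

# Group by UPRN and keep only the groups that contain more than one row.
DUPLICATE_CHECK = """
    SELECT uprn, COUNT(*) AS occurrences
    FROM addressbase
    GROUP BY uprn
    HAVING COUNT(*) > 1
"""
duplicates = conn.execute(DUPLICATE_CHECK).fetchall()
print(duplicates)  # an empty list means no UPRN appears more than once
```

The `GROUP BY ... HAVING COUNT(*) > 1` pattern is standard SQL, so the same statement should carry over to most databases with only the table name changed.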
Once confirmed, the following steps can be taken to apply the COU (without archiving):
Initially delete the existing records that will be updated and deleted:
Insert the new updated records and the new inserted records:
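The two steps above can be sketched as follows. This is an illustrative example only: the live table `addressbase`, the staging table `addressbase_cou` holding the loaded COU, and the three-column layout are assumptions, and sqlite3 is used simply to make the SQL runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical live table and COU staging table; real tables carry many more columns.
for table in ("addressbase", "addressbase_cou"):
    conn.execute(
        f"CREATE TABLE {table} (uprn INTEGER PRIMARY KEY, change_type TEXT, postcode TEXT)"
    )
conn.executemany(
    "INSERT INTO addressbase VALUES (?, ?, ?)",
    [(1, "I", "AA1 1AA"), (2, "I", "AA1 1AB"), (3, "I", "AA1 1AC")],
)
conn.executemany(
    "INSERT INTO addressbase_cou VALUES (?, ?, ?)",
    [(2, "U", "AA1 9ZZ"), (3, "D", "AA1 1AC"), (4, "I", "AA1 1AD")],
)

with conn:  # run both steps in a single transaction
    # Step 1: remove live records that the COU updates or deletes.
    conn.execute("""
        DELETE FROM addressbase
        WHERE uprn IN (SELECT uprn FROM addressbase_cou
                       WHERE change_type IN ('U', 'D'))
    """)
    # Step 2: insert the updated records and the newly inserted records.
    conn.execute("""
        INSERT INTO addressbase
        SELECT * FROM addressbase_cou WHERE change_type IN ('U', 'I')
    """)

live = conn.execute(
    "SELECT uprn, change_type, postcode FROM addressbase ORDER BY uprn"
).fetchall()
print(live)
```

Running both statements inside one transaction means a failure part-way through should leave the live table unchanged.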
Where there is a business requirement to keep the records that are being Updated and Deleted in a separate archive table, the following SQL will create an Archive Table. It will populate with records that are being Updated and Deleted from the live AddressBase or AddressBase Plus table.
The following command creates an archive table of the records that are being updated and deleted from the existing table.
If this table already exists, you can simply use INSERT INTO rather than CREATE TABLE.
The following command then deletes the records from the existing table, which are either updates or deletions:
The following command then inserts the new insert records and the new updated records into the live table:
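The archive, delete and insert sequence described above might look like the sketch below. The table names `addressbase`, `addressbase_cou` and `addressbase_archive`, the reduced column set and the sample data are all assumptions for illustration; sqlite3 is used only to make the SQL runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical live table and COU staging table with a reduced column set.
for table in ("addressbase", "addressbase_cou"):
    conn.execute(
        f"CREATE TABLE {table} (uprn INTEGER PRIMARY KEY, change_type TEXT, postcode TEXT)"
    )
conn.executemany(
    "INSERT INTO addressbase VALUES (?, ?, ?)",
    [(1, "I", "AA1 1AA"), (2, "I", "AA1 1AB"), (3, "I", "AA1 1AC")],
)
conn.executemany(
    "INSERT INTO addressbase_cou VALUES (?, ?, ?)",
    [(2, "U", "AA1 9ZZ"), (3, "D", "AA1 1AC"), (4, "I", "AA1 1AD")],
)

# UPRNs that the COU updates or deletes.
CHANGED = "SELECT uprn FROM addressbase_cou WHERE change_type IN ('U', 'D')"

with conn:
    # 1. Copy the records about to be replaced or removed into an archive table.
    #    If the archive already exists, use INSERT INTO ... SELECT instead.
    conn.execute(
        f"CREATE TABLE addressbase_archive AS "
        f"SELECT * FROM addressbase WHERE uprn IN ({CHANGED})"
    )
    # 2. Delete those records from the live table.
    conn.execute(f"DELETE FROM addressbase WHERE uprn IN ({CHANGED})")
    # 3. Insert the new and updated records from the COU.
    conn.execute(
        "INSERT INTO addressbase "
        "SELECT * FROM addressbase_cou WHERE change_type IN ('U', 'I')"
    )

archived = conn.execute("SELECT * FROM addressbase_archive ORDER BY uprn").fetchall()
live = conn.execute("SELECT * FROM addressbase ORDER BY uprn").fetchall()
print(archived)
print(live)
```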
The AddressBase products provide a variety of data fields, allowing you to construct different forms of an address for a given addressable object, dependent on how the address is to be used.
AddressBase contains the Delivery Point Address which is sourced from Royal Mail’s Postcode Address File (PAF) – a non-geocoded list of addresses. These addresses are used primarily as a ‘mailing list’ for postal purposes.
There are two types of address contained in the AddressBase Plus products:
Delivery Point Address
Geographic Address
These two address types come from different sources and are matched together by GeoPlace.
As noted above, the Delivery Point Address is sourced from Royal Mail’s PAF data. Geographic Addresses are maintained by contributing Local Authorities. The structure of a Geographic Address is based on the British Standard BS7666. These addresses are used to provide an accurate geographic locator for an object to support, for example, service delivery, asset management, or command and control operations. They also represent the legal form of addresses as created under street naming and numbering legislation.
Each UPRN in AddressBase Plus provides the Geographic Address and, where matched, the Delivery Point Address in a one-to-one relationship. If there is no match, then the following fields will be left empty:
DEPARTMENT_NAME |
---|
A common requirement for customers using the AddressBase products is to build a single address label from core address elements.
There are two types of address label. The simplest is a full address on a single line with different elements separated by commas and spaces. This type of label is suited for displaying a full address within a tabular display, such as within an on-screen data grid or spreadsheet, or where a single-line printed address is most appropriate (such as within the text, header or footer of a letter):
ROSE COTTAGE, 5 MAIN STREET, ADDRESSVILLE, LONDON, SE99 9EX
The other type of formatted address is a multi-line address label. These are most often used on envelopes or at the tops of letters, where different parts of an address are separated onto different lines:
The rules in this guide are suggestions only and can be used for visual display of full addresses. It is strongly recommended that address components are stored in the format in which they are provided in order to allow maximum flexibility of use and derived value.
A Delivery Point Address contains information sourced from Royal Mail (PAF). Stringent rules are used to match these addresses to the Geographic Address and assign a common UPRN to link addresses from the two addressing sources together in the data model.
To construct a single address label based purely on the Royal Mail PAF address fields, the following attributes can be used to build a Delivery Point Address label.
The table below provides details of the Delivery Point Address Components.
These address components are listed in the correct order in which they should appear on an address label. There may be a business need to replace the thoroughfare, locality and post_town attributes with the Welsh equivalent. The following examples use the English version of these attributes only.
It should be noted that most of the PAF fields are optional and may contain null values (or zero, in the cases of ‘BUILDING NUMBER’ and ‘PO BOX NUMBER’). In these cases, those fields should be omitted.
The following (entirely fictional) example shows all of the PAF fields filled in (apart from the PO Box number) and indicates how they should be ordered in a single address label.
In cases where a PO BOX NUMBER is present, it will only be described in the data as an integer. In order to properly format these addresses when generating an address label, these integers should be prefixed with the text ‘PO BOX’, as shown in the following example:
Where null or empty string values exist (for character fields) or zeros or nulls (for integer fields), those fields should be entirely omitted from the output. However, the order in which the fields should be concatenated always remains the same.
Building a single-line, formatted address for a Delivery Point is relatively straightforward. All the fields should be checked in the order shown previously in Table 1, and those that have values should be concatenated together into a single line. Generally, address components should be separated by a comma followed by a single space (‘, ’), although sometimes only a space is used between a building number and a thoroughfare name. You can use your preference.
The SQL operator for concatenating text is a double pipe (‘||’).
CASE blocks have been used to test each of the fields for null values before concatenating its contents (along with a suitable separator – either ‘, ’ or ‘ ’).
The field names and table names used are illustrative and may vary between databases.
Depending on the database schema and data loading method used, it may be necessary to test some fields for empty strings (‘’) or zero values (for integer fields) instead of, or as well as, testing for NULLs.
If you are using PostgreSQL (PostGIS), it might be beneficial to substitute ‘IS NOT NULL’ with != ‘’. This should improve the overall appearance of the output.
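As an illustration of the CASE and ‘||’ approach described in these notes, the sketch below generates one CASE block per field and concatenates them into a single-line label. The table name `addressbase_plus`, the column names, the component order and the assumption that the postcode is always populated are illustrative and should be checked against the Technical Specification; sqlite3 is used only so the SQL can be run.

```python
import sqlite3

# Assumed label order: (column, separator that follows it, is an integer field).
COMPONENTS = [
    ("organisation_name", ", ", False),
    ("department_name", ", ", False),
    ("po_box_number", ", ", True),
    ("sub_building_name", ", ", False),
    ("building_name", ", ", False),
    ("building_number", " ", True),
    ("dependent_thoroughfare", ", ", False),
    ("thoroughfare", ", ", False),
    ("double_dependent_locality", ", ", False),
    ("dependent_locality", ", ", False),
    ("post_town", ", ", False),
]

def case_block(column, separator, is_integer):
    """Return a CASE expression emitting value + separator, or '' when empty."""
    empty = "0" if is_integer else "''"
    prefix = "'PO BOX ' || " if column == "po_box_number" else ""
    return (
        f"CASE WHEN {column} IS NOT NULL AND {column} != {empty} "
        f"THEN {prefix}{column} || '{separator}' ELSE '' END"
    )

LABEL_SQL = (
    "SELECT uprn, "
    + " || ".join(case_block(*component) for component in COMPONENTS)
    + " || postcode AS address_label FROM addressbase_plus ORDER BY uprn"
)

conn = sqlite3.connect(":memory:")
columns = ", ".join(
    f"{name} {'INTEGER' if is_int else 'TEXT'}" for name, _, is_int in COMPONENTS
)
conn.execute(
    f"CREATE TABLE addressbase_plus (uprn INTEGER PRIMARY KEY, {columns}, postcode TEXT)"
)
# Entirely fictional sample records.
conn.execute(
    "INSERT INTO addressbase_plus (uprn, building_name, building_number, thoroughfare, "
    "dependent_locality, post_town, postcode) "
    "VALUES (1, 'ROSE COTTAGE', 5, 'MAIN STREET', 'ADDRESSVILLE', 'LONDON', 'SE99 9EX')"
)
conn.execute(
    "INSERT INTO addressbase_plus (uprn, organisation_name, po_box_number, "
    "building_number, post_town, postcode) "
    "VALUES (2, 'EXAMPLE LTD', 61, 0, 'LONDON', 'SE99 9EX')"
)

labels = conn.execute(LABEL_SQL).fetchall()
for uprn, label in labels:
    print(uprn, label)
```

Each CASE block here tests for both NULL and the empty value (an empty string or zero), which covers the loading variations mentioned above at the cost of a slightly longer query.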
Splitting a Delivery Point Address into multiple lines is more complicated. There are several rules to consider in order to avoid having very short lines (for example, just a building number) or very long lines within the formatted address. A summary of these rules is as follows:
Generally, if there is a building number, it should appear on the same line as the thoroughfare (or dependent thoroughfare) name. If there is no thoroughfare name information, it should appear on the same line as the first locality name.
In cases where building numbers have been placed in the building name field due to the presence of a letter suffix (for example, ‘11A’) or a number range separator (for example, ‘3–5’), these should be detected and placed on the same line as the thoroughfare name in the same way as a building number (or on the first locality line if no thoroughfare name is present).
In most other cases, the building name, if present, should appear on a separate line above the thoroughfare (or dependent thoroughfare) name. If there is no thoroughfare name present, it should appear on the same line as the first locality name.
Similar tests should be applied to the SUB_BUILDING_NAME field: if this field contains a number, a number with a suffix, or a numeric range, it should precede the building name on the same line. In most other cases, it should appear on a separate line above the building name.
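The detection of number-like building names and the line placement rules can be sketched in Python as below. This is a simplified illustration, not a complete implementation: the function names are hypothetical, SUB_BUILDING_NAME and the organisation fields are left out, and when there is no thoroughfare the number is attached to the first populated locality or town line.

```python
import re

# Matches "11", "11A", "3-5", "3–5" or "11A-15B":
# a number, an optional letter suffix and an optional range end.
NUMBER_LIKE = re.compile(r"^\d+[A-Z]?(\s*[-–]\s*\d+[A-Z]?)?$", re.IGNORECASE)

def is_number_like(value):
    """True when a name field really holds a number, suffixed number or range."""
    return bool(value) and NUMBER_LIKE.match(value.strip()) is not None

def delivery_point_lines(building_name, building_number, thoroughfare,
                         locality, post_town, postcode):
    """Return the address as a list of lines (simplified rule set)."""
    lines = []
    number_parts = []
    if is_number_like(building_name):
        number_parts.append(building_name.strip())   # treat like a building number
    elif building_name:
        lines.append(building_name)                   # genuine name: its own line
    if building_number:                               # None and 0 both mean "absent"
        number_parts.append(str(building_number))
    remaining = [value for value in (thoroughfare, locality, post_town) if value]
    if number_parts and remaining:
        # Put the number on the thoroughfare line, or the next populated line.
        remaining[0] = " ".join(number_parts + [remaining[0]])
    elif number_parts:
        lines.append(" ".join(number_parts))
    lines.extend(remaining)
    if postcode:
        lines.append(postcode)
    return lines

label = delivery_point_lines("ROSE COTTAGE", 5, "MAIN STREET",
                             "ADDRESSVILLE", "LONDON", "SE99 9EX")
print("\n".join(label))
```

The same `is_number_like` test could be applied to SUB_BUILDING_NAME to decide whether it shares a line with the building name.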
The structure of a Geographic Address is based on the British Standard BS7666 and is split into several components. This means that in order to construct a complete address label (for example, on an envelope, database form or GIS display), the components need to be constructed according to a set of rules.
Within the AddressBase Plus products, the core property level address information is stored within the Primary Addressable Object (PAO) and Secondary Addressable Object (SAO) fields. The additional attributes required to build a full address label are la_organisation, street_description, locality, town_name, administrative_area and postcode_locator.
For a full description of PAOs and SAOs, and the complete set of AddressBase Plus fields, please refer to the Technical Specification on your respective product:
To construct a single address label based purely on the BS7666 address fields, the following attributes should be used to build a Geographic Address label.
*ADMINISTRATIVE_AREA is optional because it is common for this field to be the same as the TOWN_NAME. Sometimes, however, this field will help users construct a more complete address.
These address components are listed in the correct order in which they should appear on an address label. There may be a business need to use alternate language fields for the SAO_TEXT, PAO_TEXT and STREET_DESCRIPTION, which are also listed in the correct order above.
When building a single address label, it may be necessary to concatenate the various SAO fields and PAO fields together respectively. These fields contain any property names, numbers, number ranges or suffixes that apply to an address.
A PAO number/range string should be constructed from the PAO_START_NUMBER, PAO_START_SUFFIX, PAO_END_NUMBER and PAO_END_SUFFIX fields, as illustrated in the following table.
Similarly, a SAO number/range string should be constructed from the SAO_START_NUMBER, SAO_START_SUFFIX, SAO_END_NUMBER and SAO_END_SUFFIX fields, as illustrated in the following table.
In addition to the numeric range fields described above, there are also PAO_text and SAO_text fields. These fields may be populated instead of, or as well as, the numeric range fields. In both cases, if both text and a numeric range string are present, the text should appear before the numeric range in any formatted address.
For PAOs, there will always be either a text entry or a numeric/range entry, or both. This is not the case for SAOs, which may be entirely absent for a given address.
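A small helper for building a PAO or SAO number/range string from its four fields might look like the sketch below. The function name is hypothetical, and the use of a plain hyphen as the range separator is an assumption; use whichever separator your organisation prefers.

```python
def number_range(start_number=None, start_suffix=None,
                 end_number=None, end_suffix=None):
    """Build a string such as '11', '11A', '3-5' or '1A-3B' from the four fields."""
    # Nulls and zeros are both treated as "not populated".
    start = f"{start_number or ''}{start_suffix or ''}"
    end = f"{end_number or ''}{end_suffix or ''}"
    if start and end:
        return f"{start}-{end}"
    return start or end

print(number_range(11, "A"))          # start number with suffix
print(number_range(3, None, 5))       # simple range
print(number_range(1, "A", 3, "B"))   # range with suffixes on both ends
```

The same function serves both PAO and SAO fields, since the two sets of columns have the same shape.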
The street description and administrative area names are always present, while the locality name and town name may be empty.
The ADMINISTRATIVE_AREA field always contains a value, but that value does not always enhance an address. In particular, check that it is not the same as the value in the TOWN_NAME field, as this is often the case.
In other cases, the administrative area name will simply contain the local authority name, which would not traditionally form part of a single or multi-line address but can be included to add additional information to an address label. Its inclusion is largely down to business requirements or personal preference; however, it may also be useful to 'de-duplicate' some Geographic Addresses.
The following (entirely fictional) example shows all of the BS7666 Geographic Address fields filled in and how they should be ordered in a single address label.
*The number/range strings are built from the relevant PAO/SAO START_NUMBER, START_SUFFIX, END_NUMBER and END_SUFFIX fields, as described above, and formatted as character strings.
Where an administrative area matches the town name, it should always be omitted.
Where null or empty string values exist (for character fields) or zeros or nulls (for integer fields), those fields should be entirely omitted from the output; however, the order in which the fields should be concatenated always remains the same.
The SQL operator for concatenating text is a double pipe (‘||’).
CASE blocks have been used to test each of the fields for null values before concatenating its contents (along with a suitable separator – either ‘, ’ or ‘ ’).
The field names and table names used are illustrative and may vary between databases.
Depending on the database schema and data loading method used, it may be necessary to test some fields for empty strings (‘’) or zero values (for integer fields) instead of or as well as testing for NULLs.
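The following sketch shows one way the CASE and ‘||’ technique could be applied to a Geographic Address. The table name `addressbase_plus`, the separators, the hyphen used in ranges and the assumption that postcode_locator is always populated are illustrative; only NULLs (not zeros) are tested in the number fields. sqlite3 is used only so the SQL can be run.

```python
import sqlite3

def text_part(column, separator=", "):
    """CASE block for a character field: value + separator, or '' when empty."""
    return (f"CASE WHEN {column} IS NOT NULL AND {column} != '' "
            f"THEN {column} || '{separator}' ELSE '' END")

def range_part(prefix, separator=" "):
    """CASE block building e.g. '11A-15B' from the four <prefix>_* columns."""
    return (
        f"CASE WHEN {prefix}_start_number IS NOT NULL THEN "
        f"{prefix}_start_number || COALESCE({prefix}_start_suffix, '') "
        f"|| CASE WHEN {prefix}_end_number IS NOT NULL "
        f"THEN '-' || {prefix}_end_number || COALESCE({prefix}_end_suffix, '') "
        f"ELSE '' END || '{separator}' ELSE '' END"
    )

LABEL_SQL = "SELECT uprn, " + " || ".join([
    text_part("la_organisation"),
    text_part("sao_text"),
    range_part("sao"),
    text_part("pao_text"),
    range_part("pao"),
    text_part("street_description"),
    text_part("locality"),
    text_part("town_name"),
    # Skip the administrative area when it merely repeats the town name.
    "CASE WHEN administrative_area IS NOT NULL AND administrative_area != '' "
    "AND administrative_area != COALESCE(town_name, '') "
    "THEN administrative_area || ', ' ELSE '' END",
    "postcode_locator",
]) + " AS address_label FROM addressbase_plus ORDER BY uprn"

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE addressbase_plus (
        uprn INTEGER PRIMARY KEY, la_organisation TEXT,
        sao_text TEXT, sao_start_number INTEGER, sao_start_suffix TEXT,
        sao_end_number INTEGER, sao_end_suffix TEXT,
        pao_text TEXT, pao_start_number INTEGER, pao_start_suffix TEXT,
        pao_end_number INTEGER, pao_end_suffix TEXT,
        street_description TEXT, locality TEXT, town_name TEXT,
        administrative_area TEXT, postcode_locator TEXT
    )
""")
# Entirely fictional sample records.
conn.execute(
    "INSERT INTO addressbase_plus (uprn, pao_text, pao_start_number, street_description, "
    "locality, town_name, administrative_area, postcode_locator) VALUES "
    "(1, 'ROSE COTTAGE', 5, 'MAIN STREET', 'ADDRESSVILLE', 'LONDON', 'LONDON', 'SE99 9EX')"
)
conn.execute(
    "INSERT INTO addressbase_plus (uprn, sao_text, pao_start_number, pao_start_suffix, "
    "pao_end_number, pao_end_suffix, street_description, town_name, administrative_area, "
    "postcode_locator) VALUES "
    "(2, 'FLAT 2', 11, 'A', 15, 'B', 'HIGH STREET', 'EXAMPLETON', 'EXAMPLESHIRE', 'ZZ99 9ZZ')"
)

labels = conn.execute(LABEL_SQL).fetchall()
for uprn, label in labels:
    print(uprn, label)
```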
Splitting a Geographic Address into multiple lines is more complex. As with Delivery Point Addresses, there are several rules to consider in order to avoid having very short lines (for example, just a building number) or very long lines within the formatted address.
A summary of these rules is as follows:
Generally, if there is a PAO number/range string, it should appear on the same line as the Street Description. For example: 11A MAIN STREET
If there is a PAO_text value, it should always appear on the line above the Street Name (or on the line above the <PAO number string> + <Street Name> where there is a PAO number/range).
If there is a SAO_text value, it should appear on a separate line above the PAO_text line (or the PAO number/range + Street Name where there is no PAO_text value).
If there is a SAO number/range value, it should be inserted either on the same line as the PAO_text (if there is a PAO_text value), or on the same line as the PAO number/range + Street Name (if there is only a PAO number/range value and no PAO_text value). If there are both PAO_text and a PAO number/range, then the SAO number/range should appear on the same line as the PAO_text, and the PAO number/range should appear on the street line.
If there is a SAO_text value, it should always appear on its own line.
If there is an Organisation Name, it should always appear alone as the top line of the address.
The Locality (if present) should appear on a separate line beneath the Street Description, followed by the Town Name on the line below it. If there is no Locality, the Town Name should appear alone on the line beneath the Street Description.
If the Administrative Area name is required and it is not a duplicate of the Town Name, it can optionally be included on a separate line beneath the Town Name.
Finally, the Postcode Locator should be inserted on the final line of the address.
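The rules above can be sketched as a small Python function. This is one possible reading of the rules rather than a definitive implementation; the function and parameter names are hypothetical, and the PAO and SAO number/range values are assumed to have been built into strings already.

```python
def geographic_lines(organisation=None, sao_text=None, sao_range=None,
                     pao_text=None, pao_range=None, street=None,
                     locality=None, town=None, admin_area=None, postcode=None):
    """Return a Geographic Address as a list of label lines."""
    def join(*parts):
        # Join the populated parts of one line with single spaces.
        return " ".join(part for part in parts if part)

    lines = [organisation, sao_text]          # organisation first, SAO text on its own line
    if pao_text:
        lines.append(join(sao_range, pao_text))   # SAO number shares the PAO text line
        lines.append(join(pao_range, street))     # PAO number shares the street line
    else:
        lines.append(join(sao_range, pao_range, street))
    lines += [locality, town]
    if admin_area and admin_area != town:     # optional, and never a duplicate of the town
        lines.append(admin_area)
    lines.append(postcode)
    return [line for line in lines if line]   # drop anything unpopulated

label = geographic_lines(
    organisation="EXAMPLE LTD", sao_text="ANNEXE", sao_range="2",
    pao_text="ROSE COTTAGE", pao_range="5", street="MAIN STREET",
    locality="ADDRESSVILLE", town="LONDON", admin_area="LONDON",
    postcode="SE99 9EX",
)
print("\n".join(label))
```

Passing `admin_area=None` is a simple way to leave the Administrative Area out altogether where business requirements do not call for it.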
Given that AddressBase Plus contains two different types of address, a decision needs to be made as to whether to use the Geographic or Delivery Point Addresses, or a mixture.
The following two options should be considered:
Use Delivery Point Addresses whenever they are available, and when they are not, use a Geographic Address.
Use Geographic Addresses in all cases.
Depending on business requirements, in some user interfaces it may be worth considering displaying both forms of an address where possible, since this will provide the maximum information available about a given UPRN.
‘Mixing and matching’ components from the two different forms of address into a single address label is not recommended as this is likely to cause confusion in some instances.
AddressBase Plus offers other attributes that could be used in conjunction with address labels. For example, classification can be used to target certain types of property, or OS MasterMap Topography TOID cross references can be used to link address labels to Topographic objects and viewed in a GIS.
TOID cross references are not available in AddressBase Plus Islands.
A common requirement for customers using the AddressBase products is to search for properties using full or partial addresses. Address searches may return a large number of addresses, a short list of possibilities, a single match or no results, depending on the search criteria.
There are many methods of implementing an address search, from free text queries through to structured address component searches. This guide will step through two such approaches that may be used when working with AddressBase and/or AddressBase Plus.
These methods are not intended as recommendations; they are merely examples of how to get maximum value out of the product when implementing an address search function.
One type of search implementation involves a single ‘search engine’ style text box, into which a user can type all or some of an address. For example:
Find address | Results |
---|
In this scenario, the user can choose to type anything in Find address, which may be just one component of an address (for example, a postcode, street name or building name), several parts of an address (for example, street name + town name, house name + postcode, etc.) or even (rarely) a complete address.
There may or may not be commas between search items, or address components can be entered with or without capitalised letters, etc. In short, with this search method, there is no structure to the user input and the search methodology must be designed with this in mind.
The other common type of implementation for address searches involves entering search criteria in a structured way (for example, with a different text box for each major address component).
This method guides the user to enter known components of an address and creates a predictable user input structure around which to build a search function. While generally simpler to use and implement, it can be less user-friendly, particularly in cases where it is not obvious which box to type an address component into, for example, is Richmond Terrace a building name or a street?
This guide suggests how to implement the two search methods described above. Both should be used alongside the instructions on formatting single address labels.
The methods described here may be adapted to work with AddressBase Plus, AddressBase Plus Islands and AddressBase; however, in the case of AddressBase, only Delivery Point Addresses are searchable, so the geographic guidance will not apply to this product.
An address search operation typically requires two stages of interaction from a user and several processing steps from the underlying IT system. These steps can be summarised in the following diagram:
The second user interaction can be omitted if there is only one result returned from the query. In almost all cases, there should be an option to ‘search again’ at the second and third stages in case no results are returned, or if none of the options shown is the required address.
Of course, different applications require different approaches; however, the general principles of the above process apply in all cases where an address is searched for based on user-entered criteria.
Within an interface that accepts structured user input for an address search, it is necessary to ‘map’ the fields presented to the user with those found within AddressBase or AddressBase Plus. In particular, any query will need to test multiple fields for a given input and will need to combine result sets from the two different address formats of AddressBase Plus (or the single address format of AddressBase) in order to produce the most complete result set.
Generally, a search form will describe a simplified view of an address in order to keep the user interface tidy and intuitive. Users may be given a set of text boxes to fill in, generally including building name, building number, street name, locality name, town name and postcode. The relationships between some common search fields and the fields found in AddressBase Plus are as follows:
The above mapping is an example only, and it is possible to break down the search fields differently, in which case a different mapping would be required. The important thing is to consider all possibilities for how data might be recorded. For example, a business name can sometimes appear as an organisation name or a building/PAO name depending on circumstances, so both must be checked when creating a search query.
Numbers need to be handled very carefully due to the presence of suffixes and ranges. There are two options for structuring the search input in these cases:
A single ‘number’ box can be used (as shown above in Flat/Subdivision Number and Building Number), which will then require some string manipulation to split the input into the appropriate numeric range and suffix components in order to search the geographic addresses; or
Four boxes can be provided for each number (start number, start suffix, end number and end suffix), which would then need to be combined into an appropriate string to search the Delivery Point Addresses.
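For the single ‘number’ box option, the string manipulation could be sketched as below. The function name and the accepted input pattern (a number, an optional single-letter suffix and an optional range end after a hyphen or en dash) are assumptions for illustration.

```python
import re

# number, optional letter suffix, then optionally "-" or "–" and a second number/suffix
NUMBER_INPUT = re.compile(
    r"^\s*(\d+)\s*([A-Za-z]?)\s*(?:[-–]\s*(\d+)\s*([A-Za-z]?))?\s*$"
)

def split_number(text):
    """Split input such as '11A-15B' into the four range components."""
    match = NUMBER_INPUT.match(text)
    if match is None:
        return None  # not a number, suffixed number or range
    start_number, start_suffix, end_number, end_suffix = match.groups()
    return {
        "start_number": int(start_number),
        "start_suffix": start_suffix.upper() or None,
        "end_number": int(end_number) if end_number else None,
        "end_suffix": end_suffix.upper() if end_suffix else None,
    }

print(split_number("11a - 15b"))
print(split_number("3-5"))
print(split_number("ROSE COTTAGE"))
```

The resulting components can then be compared against the PAO or SAO start and end fields when searching the Geographic Addresses.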
The basic rules to adhere to when generating a search query from structured input are as follows:
Ignore any search boxes that are not filled in with values.
Where a value is entered, assume that a match on at least one of the mapped fields is essential.
In SQL query terms, this means that each search term should generate a sub-query that searches each of the mapped fields (using OR), and that these sub-queries should then be combined together (using AND) into a single search query. The following SQL code illustrates this (for the Delivery Point Address search only) for an example where a street, locality and town name have been entered by the user:
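A minimal sketch of this OR-within-AND pattern is shown below for the Delivery Point Address fields. The mapping from search boxes to columns, the table name `addressbase_plus` and the sample rows are assumptions for illustration; sqlite3 is used only so the generated SQL can be run.

```python
import sqlite3

# Assumed mapping from search boxes to Delivery Point Address columns.
FIELD_MAP = {
    "street": ["thoroughfare", "dependent_thoroughfare"],
    "locality": ["dependent_locality", "double_dependent_locality"],
    "town": ["post_town"],
}

def build_query(search):
    """Return (sql, params): one OR group per filled-in box, combined with AND."""
    clauses, params = [], []
    for box, value in search.items():
        if not value:            # ignore search boxes that are not filled in
            continue
        columns = FIELD_MAP[box]
        clauses.append("(" + " OR ".join(f"{column} = ?" for column in columns) + ")")
        params.extend([value] * len(columns))
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT uprn FROM addressbase_plus WHERE {where} ORDER BY uprn", params

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE addressbase_plus (
        uprn INTEGER PRIMARY KEY, thoroughfare TEXT, dependent_thoroughfare TEXT,
        dependent_locality TEXT, double_dependent_locality TEXT, post_town TEXT
    )
""")
conn.executemany(
    "INSERT INTO addressbase_plus VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "MAIN STREET", None, "ADDRESSVILLE", None, "LONDON"),
        (2, "HIGH STREET", "MAIN STREET", None, "ADDRESSVILLE", "LONDON"),
        (3, "MAIN STREET", None, "OTHERTON", None, "LONDON"),
    ],
)

sql, params = build_query(
    {"street": "MAIN STREET", "locality": "ADDRESSVILLE", "town": "LONDON"}
)
matches = [row[0] for row in conn.execute(sql, params)]
print(sql)
print(matches)
```

User-entered values are passed as query parameters rather than being pasted into the SQL text, which avoids quoting problems with names containing apostrophes.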
On top of this, for a complete query, the two different types of addresses should be queried separately (Geographic and Delivery Point Addresses), and the two result sets should be amalgamated into a single set using a UNION. The following example builds upon the previous example to include Geographic Addresses as well as Delivery Point Addresses.
The SQL UNION operator will combine the two result sets, discarding any exact duplicates. (Retaining the exact duplicates requires the use of UNION ALL, but that is not desirable in this example.)
The resulting output from this query will be a set of search results as formatted addresses along with their UPRN. Exact duplicates will be omitted, but all ‘variations’ of the same address will be output (one row for each variation, with the same UPRN repeated more than once potentially). It may be wise to return the Postal Address Flag values against each to enable further filtering, for example, to restrict the results to postal addresses only. Note that the Postal Address Flag is only available in AddressBase Plus. All records in AddressBase are deemed postal as they are from Royal Mail’s PAF data.
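The effect of combining the two address types with UNION can be sketched as follows. The table name `addressbase_plus`, the reduced set of columns, the simplified label formatting and the sample rows are assumptions for illustration; sqlite3 is used only so the SQL can be run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE addressbase_plus (
        uprn INTEGER PRIMARY KEY,
        building_number INTEGER, thoroughfare TEXT, dependent_locality TEXT,
        post_town TEXT, postcode TEXT,
        pao_start_number INTEGER, street_description TEXT, locality TEXT,
        town_name TEXT, postcode_locator TEXT
    )
""")
conn.executemany(
    "INSERT INTO addressbase_plus VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        # Both forms format identically: UNION keeps a single row.
        (1, 5, "MAIN STREET", None, "LONDON", "SE99 9EX",
         5, "MAIN STREET", None, "LONDON", "SE99 9EX"),
        # No Delivery Point match: only the Geographic Address is returned.
        (2, None, None, None, None, None,
         7, "MAIN STREET", None, "LONDON", "SE99 9EX"),
        # The two forms differ (extra locality): both variations are returned.
        (3, 9, "MAIN STREET", None, "LONDON", "SE99 9EX",
         9, "MAIN STREET", "ADDRESSVILLE", "LONDON", "SE99 9EX"),
    ],
)

SEARCH_SQL = """
    SELECT uprn,
           building_number || ' ' || thoroughfare || ', '
           || COALESCE(dependent_locality || ', ', '')
           || post_town || ', ' || postcode AS address
    FROM addressbase_plus
    WHERE thoroughfare = :street AND post_town = :town
    UNION
    SELECT uprn,
           pao_start_number || ' ' || street_description || ', '
           || COALESCE(locality || ', ', '')
           || town_name || ', ' || postcode_locator
    FROM addressbase_plus
    WHERE street_description = :street AND town_name = :town
    ORDER BY 1, 2
"""
results = conn.execute(
    SEARCH_SQL, {"street": "MAIN STREET", "town": "LONDON"}
).fetchall()
for uprn, address in results:
    print(uprn, address)
```

UPRN 3 appears twice because its two address forms differ, while UPRN 1 appears once because UNION discards the exact duplicate.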
A flaw in the above examples is the use of equality operators. In practice, because people are rarely consistent with capitalisation, the SQL ‘LIKE’ operator might work better (although whether ‘LIKE’ ignores case depends on the database and its collation settings). Depending on the nature of the application, a ‘%’ wildcard could also be appended to the end of each search term so that only the first few letters of an address component need to be entered. For example:
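A minimal sketch of a prefix search with ‘LIKE’, using Python's sqlite3 module, might look like this. The table, column names and sample rows are assumptions, as is the prefix_pattern helper, which escapes any ‘%’ or ‘_’ typed by the user before the wildcard is appended. SQLite's ‘LIKE’ happens to ignore case for ASCII letters; other databases may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE delivery_point_address (uprn INTEGER, thoroughfare TEXT)")
conn.executemany(
    "INSERT INTO delivery_point_address VALUES (?, ?)",
    [(1, "HIGH STREET"), (2, "HIGHFIELD ROAD"), (3, "MAIN STREET")],
)


def prefix_pattern(term):
    """Escape LIKE metacharacters in user input, then append the '%' wildcard."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped + "%"


# 'high' is matched as a prefix, so both HIGH STREET and HIGHFIELD ROAD qualify.
rows = conn.execute(
    "SELECT uprn FROM delivery_point_address "
    "WHERE thoroughfare LIKE ? ESCAPE '\\' ORDER BY uprn",
    (prefix_pattern("high"),),
).fetchall()
```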
Alternatively, if exact matches are required but case sensitivity is not, then the UPPER() or LOWER() SQL functions can be used on each side of the equals sign in comparisons (a solution that should work in all databases):
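The exact-but-case-insensitive comparison can be sketched as follows, again with Python's sqlite3 module and an assumed table, column names and sample rows. Both the stored value and the search term are passed through UPPER() before the equality test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE delivery_point_address (uprn INTEGER, post_town TEXT)")
conn.executemany(
    "INSERT INTO delivery_point_address VALUES (?, ?)",
    [(1, "SOUTHAMPTON"), (2, "Southampton"), (3, "HORSHAM")],
)

# UPPER() on both sides: the whole value must match, but letter case is ignored.
rows = conn.execute(
    "SELECT uprn FROM delivery_point_address "
    "WHERE UPPER(post_town) = UPPER(?) ORDER BY uprn",
    ("southampton",),
).fetchall()
```

Applying a function to a column can stop an ordinary index on that column from being used, so on large tables it may be worth storing a pre-capitalised copy of the field or using a function-based index where the database supports one.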
Finally, to combine all of the approaches, the following would work for maximum flexibility:
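A sketch of the combined approach is given here: UPPER() on both sides, ‘LIKE’ instead of equals, and a trailing ‘%’ wildcard. The table, column names and sample rows are assumptions, and the queries are run through Python's sqlite3 module purely so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE delivery_point_address (uprn INTEGER, thoroughfare TEXT, post_town TEXT)"
)
conn.executemany(
    "INSERT INTO delivery_point_address VALUES (?, ?, ?)",
    [
        (1, "HIGH STREET", "SOUTHAMPTON"),
        (2, "Highfield Road", "Southampton"),
        (3, "HIGH STREET", "HORSHAM"),
    ],
)

# Each term is upper-cased and given a trailing wildcard; the two sub-conditions
# are still combined with AND, as in the structured search rules.
rows = conn.execute(
    """
    SELECT uprn FROM delivery_point_address
    WHERE UPPER(thoroughfare) LIKE (UPPER(:street) || '%')
      AND UPPER(post_town)    LIKE (UPPER(:town) || '%')
    ORDER BY uprn
    """,
    {"street": "high", "town": "South"},
).fetchall()
```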
When offering a ‘search engine’ style search feature with just a single text box to enter search terms, a wholly different approach is required. No assumptions can be made about the order, format or style of the user input, and the data will need to be ‘indexed’ in a way that facilitates searches of this type.
Search engine style searches are likely to require the creation of an additional index/lookup table for addresses. Such a table is likely to consist of just two main columns: a key value (UPRN) and a formatted address string. Additional columns may be required to allow filtering of results (such as the AddressBase Postal flag values from AddressBase Plus, which would allow the results to be filtered by different address statuses).
The following table shows a possible address index table structure:
Note how the addresses have been formatted as a single text string with a single space between each word (although leaving commas in would do no harm). All forms of each address (both PAF and geographic) have been added to the index, so there can be several rows with the same UPRN. To speed up complex searching, an appropriate index could be added to the Address Text field, such as a full text search index.
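One way such an index table might be built is sketched here in Python with sqlite3. The table name address_index, its column names, the to_index_text helper and the mixed-case input strings are all assumptions for illustration. Only a plain index on the UPRN is created; a full text search index is database-specific and is not shown.

```python
import re
import sqlite3


def to_index_text(address):
    """Upper-case an address, drop commas and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", address.replace(",", " ")).strip().upper()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address_index (uprn INTEGER NOT NULL, address_text TEXT NOT NULL, status TEXT)"
)
# The UPRN is deliberately not unique: each form of an address gets its own row.
conn.execute("CREATE INDEX idx_address_index_uprn ON address_index (uprn)")

source_rows = [
    (123456789012, "4 The Meadows, High Street, Walthamsdale, Burridge, BU27 9UB", "Local Authority"),
    (123456789012, "Flat 4, The Meadows, High Street,  Walthamsdale, Burridge, BU27 9UB", "PAF"),
    (123456789013, "4 High Street, Walthamsdale, Burridge, BU27 9UB", "Non-postal"),
]
conn.executemany(
    "INSERT INTO address_index VALUES (?, ?, ?)",
    [(uprn, to_index_text(text), status) for uprn, text, status in source_rows],
)
```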
Once a suitable search index is in place, the query itself can be put together. The basic idea is to split the user input into search terms by removing commas, double spaces, and other unnecessary whitespace and then splitting it at each single space, as follows:
User input: 4, High Street, westville, wv17
Capitalised, with commas and double-spaces removed:
4 HIGH STREET WESTVILLE WV17
Split into separate search terms:
4
HIGH
STREET
WESTVILLE
WV17
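The pre-processing steps above can be sketched in a few lines of Python. The function name split_search_terms is an assumption; the steps themselves (capitalise, remove commas, collapse whitespace, split on spaces) follow the description in this guide.

```python
def split_search_terms(user_input):
    """Capitalise the input, strip commas and split it into separate search terms."""
    cleaned = user_input.replace(",", " ").upper()
    # str.split() with no arguments splits on any run of whitespace, so double
    # spaces and leading or trailing spaces never produce empty terms.
    return cleaned.split()


terms = split_search_terms("4, High Street,  westville, wv17")
```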
Once the user input has been pre-processed into separate search terms, a query can be generated. The key assumption in this example will be that ALL search terms must be matched against the index table to be considered as a result. This implies a query where each value is matched using an ‘AND’ operator. In order to search the whole index, the ‘LIKE’ operator will need to be used along with a ‘%’ wildcard on either side of the search text. A suitable search query for the above example would be as follows:
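A sketch of how such a query might be generated and run is shown here, using Python's sqlite3 module. The table name address_index, its column names and the helper functions are assumptions; the sample rows are the illustrative index entries used in this guide. One ‘LIKE’ condition is generated per search term and the conditions are joined with AND.

```python
import sqlite3


def like_contains(term):
    """Wrap a term in '%' wildcards, escaping any LIKE metacharacters it contains."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return "%" + escaped + "%"


def build_index_query(terms):
    """Return (sql, params) requiring every term to appear in address_text."""
    if not terms:
        raise ValueError("at least one search term is required")
    conditions = " AND ".join(["address_text LIKE ? ESCAPE '\\'"] * len(terms))
    sql = f"SELECT uprn, address_text FROM address_index WHERE {conditions} ORDER BY uprn"
    return sql, [like_contains(term) for term in terms]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address_index (uprn INTEGER, address_text TEXT)")
conn.executemany(
    "INSERT INTO address_index VALUES (?, ?)",
    [
        (123456789013, "4 HIGH STREET WALTHAMSDALE BURRIDGE BU27 9UB"),
        (894756389092, "4 HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL"),
        (894756389132, "ROSE COTTAGE 4 HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL"),
        (274859037849, "FLAT 4 HIGHBURY COURT HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL"),
        (482974769830, "MAPS4U LTD HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL"),
    ],
)

sql, params = build_index_query(["4", "HIGH", "STREET", "WESTVILLE", "WV17"])
uprns = [row[0] for row in conn.execute(sql, params)]
```

Because each term is matched anywhere in the address text, the term 4 also matches MAPS4U LTD; this is a consequence of substring matching rather than a defect in the sketch.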
This query would return all rows from the index table that contain all of the search terms, along with the appropriate UPRNs. The following table shows how the index table would be used in the above example to return relevant results:
This result set can then be presented to the user, who can select the most appropriate record, which can then be retrieved in full using the UPRN.
Of course, in a practical implementation, the above query would need to be dynamically generated, with a separate condition added for each search term. This example is quite a strict search query that requires all search terms to be present. Many layers of complexity could be added to allow partial and ‘fuzzy’ matches, and to return confidence scores, for example, but such enhancements are beyond the scope of this guide.
This guide is intended as an introduction to implementing address search functionality using AddressBase, AddressBase Plus and AddressBase Plus Islands. The following list is a summary of the main points:
A user front-end for an address search may contain a single, search engine style text box or multiple text boxes representing different parts of an address.
A typical address search function takes place in three stages:
A user enters search text.
A query is run, returning a set of possible matches.
The user selects the address of interest and the full record is then returned.
With a structured search interface, the addresses can be queried directly by mapping the various address fields to the text boxes supplied.
For an unstructured (single text box) interface, it is necessary to create an index table with fully formatted address strings against each UPRN. Queries can then be run against this index table by splitting the user input into individual search terms and requiring them all to be present.
It is possible to filter results by status in AddressBase Plus (for example, postal or non-postal).
Any search function should search all forms of an address (both Geographic and Delivery Point Addresses).
Careful consideration should be given to the use of ‘fuzzy’ search algorithms (such as using wildcard or sound-alike searches).
Change the Coordinate System to British_National_Grid by clicking the globe icon.
Click the edit icon.
With a Secure File Transfer Protocol (SFTP) order, the same folder structure is supplied as in . The filenames will be slightly different, reflecting the SFTP order number, and the Docs folder will be empty.
When you have placed an order for a product, the data will become available as a series of zipped data files. To unzip these files, please refer to .
This guide outlines a methodology for structuring and layering a single address label, providing suggested logic to build both the Delivery Point and Geographic Address. The logic in is applicable to AddressBase, AddressBase Plus and AddressBase Plus Islands. The logic in is only applicable to AddressBase Plus and AddressBase Plus Islands.
Delivery Point Address Component | Type |
---|
Delivery Point Address Component | Example |
---|
Delivery Point Address Component | Data Content | Formatted output |
---|
Delivery Point Address Component | Data content | Formatted output |
---|
An example of SQL logic to create a single-line Delivery Point Address is on our GitHub repository which should be used under the following considerations:
Geographic Address Component | Type |
---|
Attribute | Example 1 | Example 2 | Example 3 | Example 4 |
---|
Attribute | Example 1 | Example 2 | Example 3 | Example 4 |
---|
Administrative area not included | vs | Administrative area included (BURY) |
---|
Geographic Address Component | Example |
---|
Delivery Point Address Component | Data content | Formatted output |
---|
Delivery Point Address Component | Data content | Formatted output |
---|
Building a single-line, formatted address for a Geographic Address is slightly more complicated than for a Delivery Point Address due to the need to preformat the SAO and PAO number/range strings. However, once this is done, the process is largely the same as before: the calculated fields should be checked in the order shown previously in , and those that have values should be concatenated together into a single line. Generally, address components should be separated by a comma followed by a single space (‘, ’), although sometimes only a space is used between a PAO number/range string and a street description. This is down to personal preference.
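The concatenation step might be sketched in Python as follows. The function and parameter names are assumptions, and the SAO and PAO number/range strings are expected to have been pre-formatted already. The sketch uses a single space between a number/range string and the component that follows it, and omits the administrative area when it duplicates the town name; both are choices made for illustration rather than fixed rules.

```python
def format_geographic_address(organisation="", sao_text="", sao_number="",
                              pao_text="", pao_number="", street="",
                              locality="", town="", administrative_area="",
                              postcode=""):
    """Build a single-line Geographic Address from pre-formatted components."""

    def attach(number, text):
        # A number/range string leads the component after it, separated by a space;
        # if either value is missing, whatever is present stands on its own.
        if number and text:
            return [f"{number} {text}"]
        return [value for value in (number, text) if value]

    parts = [value for value in (organisation, sao_text) if value]
    parts += attach(sao_number, pao_text)
    parts += attach(pao_number, street)
    parts += [value for value in (locality, town) if value]
    # Skip the administrative area when it merely repeats the town name.
    if administrative_area and administrative_area != town:
        parts.append(administrative_area)
    if postcode:
        parts.append(postcode)
    return ", ".join(parts)


line = format_geographic_address(
    organisation="TM MOTORS", pao_text="THE OLD BARN", pao_number="1",
    street="HORSHAM LANE", town="HORSHAM", administrative_area="HORSHAM",
    postcode="RH12 1EQ",
)
```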
Example SQL logic to create a single-line Geographic Address can be found , which should be used under the following considerations:
Results |
---|
Search Box | Mapped Delivery Point fields | Mapped geographic fields |
---|
In the above example, streetsearchtext, localitysearchtext, and townsearchtext represent user-entered search terms (which could be parameters within an SQL function) and the GetFormattedAddress(*) function is a hypothetical user-defined function that returns the formatted address as a single string (suitable for display in the user interface). For more information on formatting addresses, please see .
UPRN | Address Text | Statuses (multiple fields) |
---|
Address text | Statuses (multiple fields) |
---|
ORGANISATION_NAME |
SUB_BUILDING_NAME |
BUILDING_NAME |
BUILDING_NUMBER |
PO_BOX_NUMBER |
DEPENDENT_THOROUGHFARE (and WELSH_DEPENDENT_THOROUGHFARE) |
THOROUGHFARE (and WELSH_THOROUGHFARE) |
DOUBLE_DEPENDENT_LOCALITY (and WELSH_DOUBLE_DEPENDENT_LOCALITY) |
DEPENDENT_LOCALITY (and WELSH_DEPENDENT_LOCALITY) |
POST_TOWN (and WELSH_POST_TOWN) |
POSTCODE |
ROSE COTTAGE 5 MAIN STREET ADDRESSVILLE LONDON SE99 9EX |
DEPARTMENT_NAME | Character |
ORGANISATION_NAME | Character |
SUB_BUILDING_NAME | Character |
BUILDING_NAME | Character |
BUILDING_NUMBER | Integer |
PO_BOX_NUMBER | Integer |
DEPENDENT_THOROUGHFARE (or WELSH_DEPENDENT_THOROUGHFARE) | Character |
THOROUGHFARE (or WELSH_THOROUGHFARE) | Character |
DOUBLE_DEPENDENT_LOCALITY (or WELSH_DOUBLE_DEPENDENT_LOCALITY) | Character |
DEPENDENT_LOCALITY (or WELSH_DEPENDENT_LOCALITY) | Character |
POST_TOWN (or WELSH_POST_TOWN) | Character |
POSTCODE | Character |
DEPARTMENT_NAME | CUSTOMER SERVICE DEPARTMENT |
ORGANISATION_NAME | JW SIMPSON LTD. |
SUB_BUILDING_NAME | UNIT 3 |
BUILDING_NAME | THE OLD FORGE 7 |
BUILDING_NUMBER | 7 |
PO_BOX_NUMBER |
DEPENDENT_THOROUGHFARE | RICHMOND TERRACE |
THOROUGHFARE | MAIN STREET |
DOUBLE_DEPENDENT_LOCALITY | HOOK |
DEPENDENT_LOCALITY | WARSASH |
POST_TOWN | SOUTHAMPTON |
POSTCODE | SO99 9ZZ |
ORGANISATION_NAME | ‘JWS CONSULTING’ | JWS CONSULTING |
PO_BOX_NUMBER | 5422 | PO BOX 5422 |
THOROUGHFARE | ‘HIGH STREET’ | HIGH STREET |
POST_TOWN | ‘SPRINGFIELD’ | SPRINGFIELD |
POSTCODE | ‘SP77 0SF’ | SP77 0SF |
DEPARTMENT_NAME | Null |
ORGANISATION_NAME | ‘TM MOTORS’ | TM MOTORS |
SUB_BUILDING_NAME | Null |
BUILDING_NAME | ‘THE OLD BARN’ | THE OLD BARN |
BUILDING_NUMBER | 0 (or null) |
PO_BOX_NUMBER | 0 (or null) |
DEPENDENT_THOROUGHFARE | Null |
THOROUGHFARE | ‘HORSHAM LANE’ | HORSHAM LANE |
DOUBLE_DEPENDENT_LOCALITY | Null |
DEPENDENT_LOCALITY | Null |
POST_TOWN | ‘HORSHAM’ | HORSHAM |
POSTCODE | ‘RH12 1EQ’ | RH12 1EQ |
LA_ORGANISATION | Character |
SAO_TEXT (or ALT_LANGUAGE_SAO_TEXT) | Character |
SAO_START_NUMBER | Integer |
SAO_START_SUFFIX | Character |
SAO_END_NUMBER | Integer |
SAO_END_SUFFIX | Character |
PAO_TEXT (or ALT_LANGUAGE_PAO_TEXT) | Character |
PAO_START_NUMBER | Integer |
PAO_START_SUFFIX | Character |
PAO_END_NUMBER | Integer |
PAO_END_SUFFIX | Character |
STREET_DESCRIPTION (or ALT_LANGUAGE_STREET_DESCRIPTION) | Character |
LOCALITY | Character |
TOWN_NAME | Character |
ADMINISTRATIVE_AREA* | Character |
POSTCODE_LOCATOR | Character |
PAO_START_NUMBER PAO_START_SUFFIX PAO_END_NUMBER PAO_END_SUFFIX | 1 | 1 A | 1 5 | 1 A 5 C |
Rendered PAO range | 1 | 1A | 1-5 | 1A-5C |
PAO (number string) PAO (text) | 1 | 1A | 1A Rose Cottage | Rose Cottage |
Rendered PAO (showing street name location) | 1 <street> | 1A <street> | Rose Cottage, 1A <street> | Rose Cottage, <street> |
34, CROW LANE, RAMSBOTTOM, BL0 9BR | 34, CROW LANE, RAMSBOTTOM, BURY, BL0 9BR |
LA_ORGANISATION SAO_TEXT SAO (number/range string)* PAO_TEXT PAO (number/range string)* STREET_DESCRIPTION LOCALITY TOWN_NAME ADMINISTRATIVE_AREA POSTCODE_LOCATOR | JW SIMPSON LTD THE ANNEXE 1A THE OLD MILL 7–9 MAIN STREET HOOK WARSASH SOUTHAMPTON SO99 9ZZ |
PAO_TEXT | ‘HIGHBURY HOUSE’ | HIGHBURY HOUSE |
STREET_DESCRIPTION | ‘HIGH STREET’ | HIGH STREET |
TOWN_NAME | ‘SOUTHAMPTON’ | SOUTHAMPTON |
ADMINISTRATIVE_AREA | ‘SOUTHAMPTON’ |
POSTCODE_LOCATOR | ‘SO77 0SF’ | SO77 0SF |
ORGANISATION | ‘TM MOTORS’ | TM MOTORS |
SAO_TEXT | null |
SAO (number/range string)* | null |
PAO_TEXT | ‘THE OLD BARN’ | THE OLD BARN |
PAO (number/range string)* | ‘1’ | 1 |
STREET_DESCRIPTION | ‘HORSHAM LANE’ | HORSHAM LANE |
LOCALITY | null |
TOWN_NAME | ‘HORSHAM’ | HORSHAM |
ADMINISTRATIVE_AREA | ‘HORSHAM’ | * Duplicate name omitted |
POSTCODE_LOCATOR | ‘RH12 1EQ’ | RH12 1EQ |
PAO_text only | PAO_text and PAO number or range |
ROSE COTTAGE, MAIN STREET | ROSE COTTAGE, 11A MAIN STREET |
SAO_text value only, with PAO_text value only | SAO_text value only, with PAO number/range only |
THE ANNEXE, ROSE COURT, MAIN STREET | THE ANNEXE, 11A MAIN STREET |
SAO number/range value only, and PAO_text value only | SAO number/range value only, and PAO number/range value only | SAO number/range value only, and both PAO_text and PAO number/range values |
1A ROSE COURT, MAIN STREET | 1-3, 11A MAIN STREET | 1A ROSE COURT, 11A MAIN STREET |
SAO_text value only with PAO_text only | SAO_text and SAO number/range and PAO_text and PAO number/range |
THE ANNEXE, ROSE COTTAGE, MAIN STREET | WARDEN’S FLAT, 1A ROSE COURT, 11A MAIN STREET |
Organisation Name along with all PAO + SAO fields |
COTTAGE INDUSTRY LTD, THE ANNEXE, 1A ROSE COURT, 11A MAIN STREET |
Locality and Town Name present | Town Name only |
[first part of address, formatted as described above] MAIN STREET, HIGHFIELD, SOUTHAMPTON | [first part of address, formatted as described above] HIGH STREET, SOUTHAMPTON |
Administrative Area name included |
[first part of address, formatted as described above] MAIN STREET, WINDSOR, ROYAL BOROUGH OF WINDSOR AND MAIDENHEAD |
With Postcode_Locator on final line |
[first part of address, formatted as described above] HIGH STREET, MILTON, ML99 0WW |
Rose Cottage, Main Street, Fieldtown, Addressville, SW99 9ZZ Rose Cottage, Main Street, Ashford, AS45 9PP Rose Cottage, Main Street, Buxtew, Monley, MO88 4TY And so on... |
Search Box | Mapped Delivery Point fields | Mapped geographic fields |
---|---|---|
Business Name | Organisation_Name | Organisation, PAO_Text, SAO_Text |
Flat/Subdivision Name | Sub_Building_Name, Department_Name | SAO_Text |
Flat/Subdivision Number | Sub_Building_Name | SAO_StartNumber, SAO_StartSuffix, SAO_EndNumber, SAO_EndSuffix |
Building Name | Building_Name | PAO_Text |
Building Number | Building_Number, Building_Name (in cases where a suffix or range is present) | PAO_StartNumber, PAO_StartSuffix, PAO_EndNumber, PAO_EndSuffix |
Street | Thoroughfare, Dependent_Thoroughfare | Street, PAO_Text |
Locality | Dependent_Locality, Double_Dependent_Locality | Locality, Town, Street |
Town | Dependent_Locality, Post_Town | Town, Locality |
Postcode | Postcode | Postcode_Locator |
UPRN | Address Text | Statuses (multiple fields) |
---|---|---|
123456789012 | 4 THE MEADOWS HIGH STREET WALTHAMSDALE BURRIDGE BU27 9UB | Local Authority |
123456789012 | FLAT 4 THE MEADOWS HIGH STREET WALTHAMSDALE BURRIDGE BU27 9UB | PAF |
123456789013 | 4 HIGH STREET WALTHAMSDALE BURRIDGE BU27 9UB | Non-postal |
UPRN | Address text | Statuses (multiple fields) |
---|---|---|
894756389092 | 4 HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL | Geographic + PAF |
894756389132 | ROSE COTTAGE 4 HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL | Geographic |
274859037849 | FLAT 4 HIGHBURY COURT HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL | Geographic + PAF |
482974769830 | MAPS4U LTD HIGH STREET WESTVILLE SUNNYTOWN WV17 7HL | Geographic + PAF |
CLOVER AVENUE, SW99 9ZZ | 1, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ 2, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ 3, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ 4, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ 5, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ 6, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ 7, Clover Avenue, Fieldtown, Addressville, SW99 9ZZ |