As Code-Point Open data is supplied as separate .csv files, one per postcode area, the files need some processing before they can be used as a single dataset.
The Great Britain .csv files should be combined with the Code-Point_Open_Column_Headers.csv. To do this:
Before processing the data, create the following folder:
Great Britain
Then, copy all the .csv files from the CSV folder for Great Britain into this folder (120 in total).
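Add the Code-Point_Open_Column_Headers.csv file (located in the Doc folder) to the Great Britain folder and prefix its name with aa_ so that it becomes aa_Code-Point_Open_Column_Headers.csv. The prefix is intended to place the header file first in the folder listing, so that its rows appear at the top of the combined file.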
The following is an example of one way to combine all the individual .csv files into a single file by using a .bat batch file.
The .csv file that is created is approximately 150MB and contains about 1.7 million records. If it is opened in Microsoft Excel, only the first 1,048,576 rows are displayed, because that is the maximum number of rows a worksheet can hold.
To use the batch function:
Copy the following text and paste it into a new Notepad document:
copy *.csv outputfile.csv
Save the Notepad document with the file extension .bat (for example, combine_csv.bat) in the Great Britain folder containing the 120 Great Britain .csv files.
Close the .bat file and navigate to the Great Britain folder, where it was saved. Double-click on the .bat file and an MS-DOS window will appear. Once the process is complete, the MS-DOS screen will close automatically.
A new CSV file with the name outputfile.csv has now been created within the Great Britain folder. All the .csv files apart from outputfile.csv can now be deleted, as their contents have been copied into outputfile.csv.
The outputfile.csv will have two column headers at rows 1 and 2 – an abbreviated version and a full version.
Open the .csv file in a text editor (not Excel) and delete the column header that you don’t want to use, ensuring that any resulting empty row is closed.
You will now have an output file ready for uploading into geospatial software (such as QGIS).
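As an alternative to the batch file, the same combination can be sketched in Python. This is a minimal sketch rather than an official method: the function name combine_csv_files and its parameters are hypothetical, the files are assumed to be UTF-8 text, and the header file is assumed to contain two header rows as described above. It writes only one of the two header rows, so the manual header edit is not needed, and it returns the number of records written.

```python
from pathlib import Path


def combine_csv_files(folder, output_name="outputfile.csv",
                      header_name="aa_Code-Point_Open_Column_Headers.csv",
                      header_row=0):
    """Combine the postcode area .csv files in `folder` into one file.

    Writes one header row (row `header_row` of the header file, counting
    from 0) followed by every non-blank line of the other .csv files.
    Returns the number of data records written.
    """
    folder = Path(folder)

    # Pick a single header row from the column headers file.
    header_lines = (folder / header_name).read_text(encoding="utf-8").splitlines()
    header = header_lines[header_row]

    records = 0
    with open(folder / output_name, "w", encoding="utf-8") as out:
        out.write(header + "\n")
        for path in sorted(folder.glob("*.csv")):
            # Skip the header file and the output file itself.
            if path.name in (header_name, output_name):
                continue
            with open(path, encoding="utf-8") as source:
                for line in source:
                    if line.strip():  # ignore blank lines
                        out.write(line.rstrip("\r\n") + "\n")
                        records += 1
    return records
```

For example, combine_csv_files("Great Britain", header_row=1) would keep the second of the two header rows. The returned count can be compared with the figures in metadata.txt as a quick sanity check.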
This technical specification provides detailed technical information about Code-Point Open. It is targeted at technical users and software developers.
Code-Point Open locates over 1.7 million postcode units for Great Britain, each having a notional geographical location. Postcodes are an alphanumeric abbreviated form of an address. Postcode units are unique references and identify an average of 15 addresses. In some cases, where an address receives a substantial amount of mail, a postcode will apply to only one address and is defined as a large-user postcode. The maximum number of addresses in a postcode is 100.
This getting started guide focusses on using the product in comma-separated values (CSV) format.
Code-Point Open is a data product and does not include software for analysis but can be used with a variety of programs. Code-Point Open can be loaded into a GIS (geographical information system) for display and analysis of the data. Consult your GIS documentation to establish actual system requirements.
Code-Point Open is only available as national cover of Great Britain. The product is supplied in two formats (CSV and GeoPackage) as an online download from the OS Data Hub.
Updates are supplied quarterly (February, May, August and November) and provided as a complete resupply. Any postcode that is deleted between supplies will not be included.
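Because each update is a complete resupply, deleted postcodes can only be identified by comparing two releases. The following is a minimal sketch of such a comparison, assuming each release has already been combined into a single CSV file with one header row and that the postcode is in the first column. The function names and parameters are hypothetical.

```python
import csv


def read_postcodes(csv_path, postcode_column=0, has_header=True):
    """Return the set of postcodes found in one column of a CSV file."""
    postcodes = set()
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the single header row
        for row in reader:
            if row:
                # Remove spaces so differently spaced forms compare equal.
                postcodes.add(row[postcode_column].replace(" ", "").upper())
    return postcodes


def compare_releases(previous_csv, current_csv):
    """Return (deleted, added) postcodes between two combined releases."""
    previous = read_postcodes(previous_csv)
    current = read_postcodes(current_csv)
    return sorted(previous - current), sorted(current - previous)
```

The postcodes are returned with spaces removed, so they may need reformatting before being matched against other data.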
Code-Point Open covers postcodes for Great Britain. In CSV format, postcodes are divided into postcode areas and supplied as 120 files. GeoPackage format is a self-contained database.
The approximate file sizes of the respective data formats are as follows:
CSV: 153MB
GeoPackage: 257MB
The Code-Point Open CSV format contains two folders in the root directory: Doc and Data.
The Doc folder contains the following files:
Codelist.xlsx – Lookup table of Government Statistical Service (GSS) codes.
Code-Point_Open_Column_Headers.csv – Description of column headers.
licence.txt – Important licence information.
metadata.txt – Number of postcode units in each postcode area.
NHS_Codelist.xls – Lookup table of health GSS codes.
readme.txt – Summary of supplied data.
The Data folder contains the following sub-folder:
CSV – 120 postcode area files in CSV format.
The Code-Point Open GeoPackage format contains the following text file in the root directory:
Readme.txt – Summary of supplied data.
And two folders: Doc and Data.
The Doc folder contains the following files:
Codelist.xlsx – Lookup table of GSS codes.
Code-Point_Open_Column_Headers.csv – Description of column headers.
Licence.txt – Important licence information.
Metadata.txt – Number of postcode units in each postcode area.
NHS_Codelist.xls – Lookup table of health GSS codes.
The Data folder contains the following file:
Codepo_gb.gpkg – One postcode area file in GeoPackage format.
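Because a GeoPackage is a SQLite database file, its contents can be inspected without a GIS. The following is a minimal sketch using Python's standard sqlite3 module. It assumes the file follows the OGC GeoPackage standard, which requires a gpkg_contents table, and the function name list_geopackage_tables is hypothetical. The layer names inside Codepo_gb.gpkg are not stated in this guide, so the sketch simply lists whatever it finds.

```python
import sqlite3


def list_geopackage_tables(gpkg_path):
    """Return (table_name, data_type) pairs listed in gpkg_contents."""
    connection = sqlite3.connect(gpkg_path)
    try:
        rows = connection.execute(
            "SELECT table_name, data_type FROM gpkg_contents "
            "ORDER BY table_name"
        ).fetchall()
    finally:
        connection.close()
    return rows
```

Note that sqlite3.connect creates an empty file if the path does not exist, in which case the query raises sqlite3.OperationalError because gpkg_contents is missing.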
The structure of Code-Point Open supplied in CSV is described in the product's Technical Specification.
The structure of Code-Point Open supplied in GeoPackage is described in the product's Technical Specification.
For guidance on using the product in GeoPackage format, please see the Getting started with GeoPackage guide.
The data can be loaded into several geographic information systems (GIS). This section describes how to load the combined CSV output file (for more information on how to combine multiple CSV files, see Section 3.1) into four commonly used GIS, including:
QGIS
ArcGIS Pro
ArcMap
MapInfo Pro 2019
Glossary term | Definition |
---|---|
addressed premise | A permanent or non-permanent building structure with an address being a potential delivery point for Royal Mail. Examples of an addressed premise would be a house, a flat within a block of flats, a caravan site, a bollard to which several houseboats may be moored, or an organisation occupying the whole building. |
building | A physical, walled structure connected to foundations that has, or will have, a roof. This definition includes buildings surveyed at foundation stage. |
CPLC (Code-Point location coordinate) | A National Grid reference for each postcode unit. It is a two-dimensional coordinated point to a resolution of 1 metre. Coordinates are attributed from Gridlink using an accuracy hierarchy. |
Country code | The code used by the Office for National Statistics to indicate the country in which the Code-Point georeference lies. This has replaced the PAF update date field. The codes are: England: E92000001; Scotland: S92000003; Wales: W92000004; N Ireland: N92000002. |
Comma-separated values (CSV) | The CSV file format is commonly used to exchange data between different applications, for example, Microsoft Excel and Access. Being text files, CSV files can also be viewed in Notepad. |
delivery point | A Royal Mail-defined point to which mail is delivered. This may be a property (private address), organisation, mailbox or even, very rarely, the name of an individual. These categories are derived from the Programmers’ Guide from Royal Mail. This is distinct from the addressed premise because there may be more than one organisation at an address. |
Gridlink | Gridlink is the name given to a joined-up Government initiative involving Royal Mail, the Office for National Statistics, National Records of Scotland (NRS), Land & Property Services and Ordnance Survey. All these organisations are involved in the georeferencing of postcodes and the relating of postcodes to administrative and National Health Service areas and so on. |
inward code or incode | See postcode. |
matched address | An address, resulting from a match between the OS MasterMap Topography Layer data and PAF, which has been allocated a coordinate position. The match may be a result of either manual or automatic matching, the latter encompassing both full and ‘fuzzy logic’ matching. |
National Grid reference (NGref) | The National Grid provides a unique reference system that can be applied to all Ordnance Survey maps of Great Britain. The map of Great Britain is covered by 100 km by 100 km grid squares, with the origin lying to the west of the Isles of Scilly. When a National Grid reference is quoted, the easting (left to right direction) is always given before the northing (upwards direction). A National Grid reference (to 1 metre) will identify the spatial position of the CPLC. |
non-geographic postcodes | Special non-geographic postcodes are allocated to single organisations who receive an exceptionally large amount of mail. These are included in Code-Point Open. |
outward code or outcode | See postcode. |
Postcode Address File (PAF) | PAF now contains the postal addresses and postcodes of approximately 28 million delivery points in Great Britain. |
Postal Address Location Feed (PALF) | The PAL Feed is provided to Ordnance Survey from GeoPlace, who have geocoded the PAF feed from Royal Mail, using source coordinates from Local Authorities in England, Wales & Scotland and Ordnance Survey. |
positional quality indicator (PQI) | The positional quality indicator is a flag used to indicate the positional accuracy of the coordinates allocated to each postcode record. There are seven PQI values for the positional quality of CPLCs. |
postal address | A postal address is a delivery point that is currently receiving mail. There may be many delivery points within an individual building structure as shown in OS MasterMap Topography Layer data. |
postcode | An abbreviated form of address made up of combinations of between six and eight alphanumeric characters. A postcode may cover between 1 and 100 addresses. The average number of addresses per postcode is 15. |
postcode area | An area given a unique alphabetic coding by Royal Mail to facilitate the delivering of mail. The area is identified by one or two alpha characters at the start of the full postcode, the letters being derived from a town, city or district falling within the postcode area. There are, at present, 120 postcode areas in Great Britain, for example, SO for Southampton, MK for Milton Keynes, B for Birmingham or W for London West. The postcode area code constitutes the first part of the outward code. |
postcode district | A sub-area of the postcode area, specified by the character sub-string within the first half of a full postcode, which may be numeric, alphabetic or alphanumeric; for example, 42 from MK42 6GH or 1A from W1A 4WW. There are approximately 2 986 postcode districts in Great Britain. Note: There are certain non-geographic districts. In these instances, a district code is allocated to cover all large users in the postcode area. |
postcode sector | A sub-area of a postcode district, whose area is identified by the number third from the end of a full postcode. There are approximately 11 200 postcode sectors in Great Britain. An example of a postcode sector code is 3, from GU12 3DH. |
postcode unit | A sub-area of a postcode sector, indicated by the two letters of the inward postcode, which identifies one or more small-user postcode delivery points or an individual large-user postcode. There are approximately 1.7 million postcode units in the UK. |
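The relationship between postcode area, district, sector and unit defined in the glossary can be sketched as a small parsing function. This is an illustration only: the function name split_postcode is hypothetical, and the regular expression is a simplified shape check based on the glossary definitions, not Royal Mail's full validation rules.

```python
import re

# Area: one or two letters. District: a digit plus an optional digit or
# letter. Sector: one digit. Unit: the final two letters.
_POSTCODE = re.compile(r"^([A-Z]{1,2})([0-9][0-9A-Z]?)([0-9])([A-Z]{2})$")


def split_postcode(postcode):
    """Split a postcode into its area, district, sector and unit parts."""
    compact = postcode.replace(" ", "").upper()
    match = _POSTCODE.match(compact)
    if match is None:
        raise ValueError(f"not a recognised postcode pattern: {postcode!r}")
    area, district, sector, unit = match.groups()
    return {
        "area": area,
        "district": district,
        "sector": sector,
        "unit": unit,
        "outward": area + district,  # outward code (outcode)
        "inward": sector + unit,     # inward code (incode)
    }


parts = split_postcode("MK42 6GH")
print(parts["area"], parts["district"], parts["sector"], parts["unit"])
# prints: MK 42 6 GH
```

Using the glossary's other examples, split_postcode("W1A 4WW") gives district 1A and split_postcode("GU12 3DH") gives sector 3.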
Ordnance Survey measures the data in its products in one or more of the ways set out in the definitions of data measures table below.
Data measure | Definition | Sub-measure | Definition |
---|---|---|---|
Completeness | Presence and absence of features against the specified data content | Omission | Features representing objects that conform to the specified data content but are not present in the data. |
Completeness | Presence and absence of features against the specified data content | Commission | Features representing objects that do not conform to the specified data content but are present in the data. |
Logical consistency | Degree of adherence to logical rules of data structure, attribution and relationships | Conceptual consistency | How closely the data follows the conceptual rules (or model). |
Logical consistency | Degree of adherence to logical rules of data structure, attribution and relationships | Domain consistency | How closely the data values in the dataset match the range of values in the dataset specification. |
Logical consistency | Degree of adherence to logical rules of data structure, attribution and relationships | Format consistency | The physical structure (syntax): how closely the data stored and delivered fits the database schema and agreed supply formats. |
Logical consistency | Degree of adherence to logical rules of data structure, attribution and relationships | Topological consistency | The explicit topological references between features (connectivity) – according to specification. |
Positional accuracy | Accuracy of the position of features | Absolute accuracy | How closely the coordinates of a point in the dataset agree with the coordinates of the same point on the ground (in the British National Grid reference system). |
Positional accuracy | Accuracy of the position of features | Relative accuracy | Positional consistency of a data point or feature in relation to other local data points or features within the same or another reference dataset. |
Positional accuracy | Accuracy of the position of features | Geometric fidelity | The ‘trueness’ of features to the shapes and alignments of the objects they represent*. |
Temporal accuracy | Accuracy of temporal attributes and temporal relationships of features | Temporal consistency | How well-ordered events are recorded in the dataset (lifecycles). |
Temporal accuracy | Accuracy of temporal attributes and temporal relationships of features | Temporal validity (currency) | Validity of data with respect to time: the amount of real-world change that has been incorporated in the dataset that is scheduled for capture under current specifications. |
Thematic accuracy | Classification of features and their attributes | Classification correctness | How accurately the attributes within the dataset record the information about objects*. |
* When testing the data according to the dataset specification against the ‘real world’ or reference dataset.
Metadata, which is ISO 19115 UK GEMINI 2 compliant, can be found at https://www.data.gov.uk/dataset/e3d9cd8e-e702-4fc6-a674-c1f25eb5efab/ordnance-survey-code-point-open. Metadata .xml files can be found at http://www.ordnancesurvey.co.uk/oswebsite/xml/products/.
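The omission and commission sub-measures of completeness can be illustrated as set differences between the features in a dataset and those in a reference list. This is a conceptual sketch only, not a description of how Ordnance Survey tests its data. The function name completeness_check and the sample identifiers are hypothetical, and the reference list is assumed to contain exactly the features that conform to the specified data content.

```python
def completeness_check(dataset_ids, reference_ids):
    """Compare feature identifiers in a dataset with a reference list."""
    dataset = set(dataset_ids)
    reference = set(reference_ids)
    return {
        # In the reference list but missing from the dataset.
        "omission": sorted(reference - dataset),
        # In the dataset but not in the reference list.
        "commission": sorted(dataset - reference),
    }


result = completeness_check(
    dataset_ids=["A", "B", "D"],
    reference_ids=["A", "B", "C"],
)
print(result)  # {'omission': ['C'], 'commission': ['D']}
```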