Data formats
The AddressBase product will be distributed either as comma-separated values (CSV) files or as Geography Markup Language (GML) version 3.2. Either format can be supplied as a full supply or as a change-only update (COU) supply.
CSV
The CSV supply of AddressBase means:
There will be one record per line in each file.
Fields will be separated by commas.
String fields will be delimited by double quotes.
No comma will be placed at the end of each row in the file.
Records will be terminated by Carriage Return / Line Feed.
Double quotes inside strings will be escaped by doubling.
Where a field has no value in a record, two commas will be placed together in the record (one for the end of the previous field and one for the end of the null field). Where the null field is a text field, double quotes will be included between the two commas, for example ,"", rather than ,,
AddressBase CSV data will be transferred using Unicode encoded in UTF-8. Unicode includes all the characters in ISO-8859-14 (Welsh characters), but some accented characters are encoded differently in UTF-8 than in ISO-8859-14, so the data must be read as UTF-8.
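The encoding difference can be illustrated with a short Python sketch. The character chosen (ŵ) and the variable names are illustrative only; the sketch simply compares how Python's standard `utf-8` and `iso8859_14` codecs encode the same Welsh character.

```python
# Compare the byte encodings of a Welsh accented character.
text = "ŵ"  # U+0175, LATIN SMALL LETTER W WITH CIRCUMFLEX

utf8_bytes = text.encode("utf-8")         # multi-byte sequence in UTF-8
latin8_bytes = text.encode("iso8859_14")  # single byte in ISO-8859-14

print(len(utf8_bytes), len(latin8_bytes))

# Decoding with the matching codec recovers the character in both cases.
assert utf8_bytes.decode("utf-8") == text
assert latin8_bytes.decode("iso8859_14") == text
```

Because the byte sequences differ, opening the CSV with an ISO-8859-14 decoder instead of UTF-8 would garble such characters.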
The transfer will normally be in a single file, but the data can be split into multiple files using volume numbers. In most cases, files will only be split where there are more than one million records.
The header row for the CSV is supplied separately and can be downloaded from the product support pages.
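The CSV rules above map closely onto Python's standard `csv` module. The following is a minimal sketch rather than an official loader: the column names in `HEADER` and the two sample records are made up for illustration, and the real header row should be taken from the separately supplied header file.

```python
import csv
import io

# Hypothetical column names for illustration only; use the header row
# supplied separately with the product.
HEADER = ["UPRN", "ORGANISATION", "BUILDING_NUMBER", "POSTCODE"]

def read_records(stream, header):
    """Yield one dict per CSV record, mapping empty fields to None."""
    # Comma-separated, double-quote delimited, quotes escaped by doubling.
    reader = csv.reader(stream, delimiter=",", quotechar='"', doublequote=True)
    for row in reader:
        yield {name: (value if value != "" else None)
               for name, value in zip(header, row)}

# Made-up sample data: CRLF-terminated records, a doubled quote inside a
# string, a null text field ("") and a null non-text field (,,).
sample = (
    '100,"The ""Best"" Cafe",12,"AB1 2CD"\r\n'
    '101,"",,"EF3 4GH"\r\n'
)

# newline="" leaves the CRLF terminators for the csv module to handle.
records = list(read_records(io.StringIO(sample, newline=""), HEADER))
for record in records:
    print(record)
```

For a real supply file, the same function could be given a file opened with `open(path, encoding="utf-8", newline="")`. Note that `csv.reader` returns both `,,` and `,"",` as empty strings, so the sketch cannot distinguish a null text field from a null non-text field.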
GML
A GML document is described using a GML Schema. The AddressBase schema document (addressbase.xsd) defines the features in AddressBase GML.
The application schema uses the following XML namespaces, for which definitions are available as given here:
gml
xsi
xlink
Features
Each feature within the AddressBaseSupplySet:FeatureCollection is encapsulated in the following member element according to its feature type:
<abpl:addressMember>
Address
The UPRN of the feature is provided in the gml:id XML attribute.
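As a rough sketch, the gml:id attribute can be read with Python's standard `xml.etree.ElementTree`. Only the GML 3.2 namespace URI below is standard; the `abpl` namespace URI, the child element name and the identifier value in the sample are placeholders invented for illustration and should be replaced with those defined in addressbase.xsd.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"  # standard GML 3.2 namespace
GML_ID = f"{{{GML_NS}}}id"                  # qualified name of gml:id

# Placeholder fragment: the abpl namespace URI, the Address element layout
# and the id value are assumptions, not taken from the real schema.
SAMPLE = """
<abpl:addressMember xmlns:abpl="http://example.invalid/abpl"
                    xmlns:gml="http://www.opengis.net/gml/3.2">
  <abpl:Address gml:id="placeholder-id-1"/>
</abpl:addressMember>
"""

def feature_ids(root):
    """Return the gml:id value of every element that carries one."""
    return [elem.get(GML_ID) for elem in root.iter() if elem.get(GML_ID)]

ids = feature_ids(ET.fromstring(SAMPLE))
print(ids)
```

How the UPRN is laid out inside the identifier is not shown here; check a real supply file before extracting it.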
Envelope
In the GML supply, you can determine the extent of your supply from the <gml:Envelope> element.
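The extent can be read programmatically. The sketch below assumes the envelope uses the standard GML 3.2 child elements gml:lowerCorner and gml:upperCorner, each holding space-separated coordinates; the coordinate values in the sample are made up and do not come from a real supply.

```python
import xml.etree.ElementTree as ET

NS = {"gml": "http://www.opengis.net/gml/3.2"}  # standard GML 3.2 namespace

# Made-up envelope for illustration; real values come from the supply file.
SAMPLE = """
<gml:Envelope xmlns:gml="http://www.opengis.net/gml/3.2">
  <gml:lowerCorner>100000.0 200000.0</gml:lowerCorner>
  <gml:upperCorner>150000.5 250000.5</gml:upperCorner>
</gml:Envelope>
"""

def envelope_extent(envelope):
    """Return (lower, upper) corner tuples of floats from a gml:Envelope."""
    lower = envelope.find("gml:lowerCorner", NS).text.split()
    upper = envelope.find("gml:upperCorner", NS).text.split()
    return tuple(float(v) for v in lower), tuple(float(v) for v in upper)

lower, upper = envelope_extent(ET.fromstring(SAMPLE))
print("lower corner:", lower)
print("upper corner:", upper)
```

The two corners together give the bounding box of the supplied data in whatever coordinate reference system the envelope declares.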