Managing .csv data

The Building Height Attribute .csv contains the attribute OS_TOPO_TOID, which is the topographic identifier (TOID) from OS MasterMap Topography Layer. This TOID is the attribute that is used to join the data to your OS MasterMap Topography Layer data. More information about TOIDs is available in the OS MasterMap Topography Layer Overview and the product's Technical Specification.

Checking TOIDs

TOIDs are stored in GML as a character string, with the prefix ‘osgb’, and can be comprised of either a 13-digit integer or a 16-digit integer after ‘osgb’ (for example, osgb1000001799480255). Ordnance Survey recommends that TOIDs are stored in this format.

Some of the methods available to translate OS MasterMap Topography Layer from GML alter the TOID from the format in which it is supplied in the GML. We have tested a small sample of these methods to ascertain how the TOID is dealt with during some translation processes. The following table details how some of the translators we tested deal with TOIDs; please look at your own data to investigate how your translation process will affect the TOIDs in your data.

TOID processing results for four translators

Translator
Processing of TOIDs

Miso InterpOSe

Removes ‘osgb’ from the beginning of TOIDs and adds three leading zeroes to 13-digit TOIDs.

Astun Loader

Preserves raw GML TOID format but renames column from TOID to fid.

Pitney Bowes MapInfo

Removes ‘osgb’ from the beginning of TOIDs.

QGIS

Preserves raw GML TOID format.

It is recommended that the OS_TOPO_TOID attribute in the OS MasterMap Topography Layer – Building Height Attribute .csv files be edited if it is found that the OS MasterMap Topography Layer TOID has been altered during the translation process, although how TOIDs are managed is up to your discretion. To aid this process, we have created some script that can be run alongside a gawk application that will alter TOIDs in the .csv files in two different ways. It is recommended that you create a backup copy of your .csv file before you make any changes to it.

Databases

If your TOID column requires altering the steps to do so are outlined in the importing data into a database page - detailing how this can be achieved in a database environment. This can also be achieved using the gawk process outlined below.

Reformatting the TOID attribute

There are several options which will allow the user to reformat the TOID attribute before joining it with the TOID field in OS MasterMap Topography Layer.

For smaller amounts of data, the simplest option is to use the free open-source text editor called Notepad++. Using this text editor, simply load up the Building Height Attribute .csv data and then, using the search and replace option, simply replace the 'osgb' part of the TOID name with either nothing or three leading zeroes ('000'). Remember to save the file with a new name once this procedure has been carried out.

The following sub-sections detail other steps that you can take to reformat the TOID attribute in the Building Height Attribute .csv, so that the format matches that of the TOID attribute in your translated

OS MasterMap Topography Layer, Topographic Area data. The reformatting of TOID is a very quick option which will take five minutes to set up and process on a national set of data for OS MasterMap Topography Layer – Building Height Attribute.

Two script reformatting options are presented below:

  • Script to remove ‘osgb’ from Building Height Attribute TOIDs: Recommended for use if your translator removes ‘osgb’ from your OS MasterMap Topography Layer TOIDs but does not add any leading zeroes to shorter 13-digit TOIDs.

  • Script to remove ‘osgb’ and add ‘000’ to 13-digit Building Height Attribute TOIDs: Recommended for use if your translator removes ‘osgb’ from your OS MasterMap Topography Layer TOIDs and adds three leading zeroes to shorter 13-digit TOIDs.

  1. Download the gawk tool and save the zip file in any location.

  2. Extract the ‘gawk.exe’ file to a directory containing the Building Height Attribute .csv file you wish to alter (this must contain only the .csv to alter and no other files). There must be no spaces in the entire file path to that directory as this will cause the process to fail.

  3. Both reformatting processes require you to create an .awk file and a .bat file alongside the gawk application. These need to be saved in the same directory as the gawk application and Building Height Attribute .csv file you wish to alter. The structure of the .awk file specified in the following sub-section remains unchanged regardless of which of the two script reformatting options you are using. Unlike the .awk file, the content of the .bat file is different for both reformatting options. The format to use in both instances is described after the .awk file detail below.

Generating the .awk file

  1. Copy and paste the following code into a new text editor document (such as Notepad++):

# COMMAND LINE gawk -v leadingZeroes=TRUE|FALSE -f bha_fixTOID.awk infile > outfile

BEGIN {
    FS = ","
    OFS = ","
}

{
    sub("osgb","",$1)
    if (leadingZeroes == "TRUE")
        $1 = sprintf("%016d",$1)
    print
}
  1. Save the file as bha_fixTOID.awk in the same directory as the gawk application and Building Height Attribute .csv file.

Generating the .bat file

Script to remove ‘osgb’ from TOIDs:

If using the first reformatting option (i.e. Script to remove 'osgb' from TOIDs), please copy the following text into a new text editor document and save the document with a .bat extension in the same directory as the gawk application, .awk file and .csv file:

// Some code@rem --- Line below to remove OS_TOPO_TOID "osgb" prefix only
FOR /F "tokens=*" %%A IN ('dir *.csv /s/b') DO (gawk -v leadingZeroes=FALSE -f bha_fixTOID.awk
%%A > _%%~NA%%~XA)
pause
exit

Script to remove ‘osgb’ and add ‘000’ to 13-digit Building Height Attribute TOIDs:

If using the second reformatting option (i.e. Script to remove ‘osgb’ and add ‘000’ to 13-digit Building Height Attribute TOIDs), please copy the following text into a new text editor document and save the document with a .bat extension in the same directory as the gawk application, .awk file and .csv file:

@rem --- Line below to remove OS_TOPO_TOID "osgb" prefix + add leading zeroes
FOR /F "tokens=*" %%A IN ('dir *.csv /s/b') DO (gawk -v leadingZeroes=TRUE -f bha_fixTOID.awk
%%A > _%%~NA%%~XA)
pause
exit

Executing the reformatting operation

  1. For both reformatting options, navigate to the directory where the .bat file is saved and execute it by double clicking the file. An MS-DOS window will appear.

  2. Once the process is complete, press the return key to close the window.

The directory should now contain a new Building Height Attribute .csv file with the same name as the original but starting with an underscore. This .csv file will contain a copy of the original with the TOID attribute altered.

Merging .csv files

It is likely that you will wish to merge 5 X 5km² tiles together to create an area of interest before loading this into a GI application. The following procedure can be followed to undertake this operation:

  1. Place all the .csv files into an empty folder.

  2. Copy and paste the following script into a text editor (such as Notepad++), and, once copied, give the file a name which explains its function. In this case, we have called the file mergedBHAdata.bat.

copy *.csv mergedBHAdata.csv

Navigate to the directory where the .bat file is saved and execute by double clicking the file. This will create a new file called mergedBHAdata.csv from all the component .csv files in the folder.

  1. Delete the original 5 X 5km² from the folder, leaving just the newly created mergedBHAdata.csv file in the folder.

  2. Copy and paste the following script into a text editor (such as Notepad++), and, once copied, give the file a name which explains its function. In this example, we have called the file Append_BHA_Header.bat.

copy BHA_Header.csv+ mergedBHAdata.csv BHA_Data.csv
  1. Download the Building Height Attribute header file from the 'OS MasterMap Topography Layer – Building Height Attribute Downloads' page.

  2. Place the downloaded header file for Building Height Attribute data into the same folder as the merged Building Height Attribute data file, along with the batch file that was created in Step 4.

  3. Run the batch file. A new merged .csv file containing the header will be created called BHA_Data.csv.

Last updated

Was this helpful?