Managing .csv data
Last updated
Was this helpful?
Last updated
Was this helpful?
The Building Height Attribute .csv contains the attribute OS_TOPO_TOID, which is the topographic identifier (TOID) from OS MasterMap Topography Layer. This TOID is the attribute that is used to join the data to your OS MasterMap Topography Layer data. More information about TOIDs is available in the and the product's .
TOIDs are stored in GML as a character string, with the prefix ‘osgb’, and can be comprised of either a 13-digit integer or a 16-digit integer after ‘osgb’ (for example, osgb1000001799480255). Ordnance Survey recommends that TOIDs are stored in this format.
Some of the methods available to translate OS MasterMap Topography Layer from GML alter the TOID from the format in which it is supplied in the GML. We have tested a small sample of these methods to ascertain how the TOID is dealt with during some translation processes. The following table details how some of the translators we tested deal with TOIDs; please look at your own data to investigate how your translation process will affect the TOIDs in your data.
Miso InterpOSe
Removes ‘osgb’ from the beginning of TOIDs and adds three leading zeroes to 13-digit TOIDs.
Astun Loader
Preserves raw GML TOID format but renames column from TOID to fid.
Pitney Bowes MapInfo
Removes ‘osgb’ from the beginning of TOIDs.
QGIS
Preserves raw GML TOID format.
It is recommended that the OS_TOPO_TOID attribute in the OS MasterMap Topography Layer – Building Height Attribute .csv files be edited if it is found that the OS MasterMap Topography Layer TOID has been altered during the translation process, although how TOIDs are managed is up to your discretion. To aid this process, we have created some script that can be run alongside a gawk application that will alter TOIDs in the .csv files in two different ways. It is recommended that you create a backup copy of your .csv file before you make any changes to it.
If your TOID column requires altering the steps to do so are outlined in the - detailing how this can be achieved in a database environment. This can also be achieved using the gawk process outlined below.
There are several options which will allow the user to reformat the TOID attribute before joining it with the TOID field in OS MasterMap Topography Layer.
The following sub-sections detail other steps that you can take to reformat the TOID attribute in the Building Height Attribute .csv, so that the format matches that of the TOID attribute in your translated
OS MasterMap Topography Layer, Topographic Area data. The reformatting of TOID is a very quick option which will take five minutes to set up and process on a national set of data for OS MasterMap Topography Layer – Building Height Attribute.
Two script reformatting options are presented below:
Script to remove ‘osgb’ from Building Height Attribute TOIDs: Recommended for use if your translator removes ‘osgb’ from your OS MasterMap Topography Layer TOIDs but does not add any leading zeroes to shorter 13-digit TOIDs.
Script to remove ‘osgb’ and add ‘000’ to 13-digit Building Height Attribute TOIDs: Recommended for use if your translator removes ‘osgb’ from your OS MasterMap Topography Layer TOIDs and adds three leading zeroes to shorter 13-digit TOIDs.
Download the gawk tool and save the zip file in any location.
Extract the ‘gawk.exe’ file to a directory containing the Building Height Attribute .csv file you wish to alter (this must contain only the .csv to alter and no other files). There must be no spaces in the entire file path to that directory as this will cause the process to fail.
Both reformatting processes require you to create an .awk file and a .bat file alongside the gawk application. These need to be saved in the same directory as the gawk application and Building Height Attribute .csv file you wish to alter. The structure of the .awk file specified in the following sub-section remains unchanged regardless of which of the two script reformatting options you are using. Unlike the .awk file, the content of the .bat file is different for both reformatting options. The format to use in both instances is described after the .awk file detail below.
Copy and paste the following code into a new text editor document (such as Notepad++):
Save the file as bha_fixTOID.awk in the same directory as the gawk application and Building Height Attribute .csv file.
If using the first reformatting option (i.e. Script to remove 'osgb' from TOIDs), please copy the following text into a new text editor document and save the document with a .bat extension in the same directory as the gawk application, .awk file and .csv file:
If using the second reformatting option (i.e. Script to remove ‘osgb’ and add ‘000’ to 13-digit Building Height Attribute TOIDs), please copy the following text into a new text editor document and save the document with a .bat extension in the same directory as the gawk application, .awk file and .csv file:
For both reformatting options, navigate to the directory where the .bat file is saved and execute it by double clicking the file. An MS-DOS window will appear.
Once the process is complete, press the return key to close the window.
The directory should now contain a new Building Height Attribute .csv file with the same name as the original but starting with an underscore. This .csv file will contain a copy of the original with the TOID attribute altered.
It is likely that you will wish to merge 5 X 5km² tiles together to create an area of interest before loading this into a GI application. The following procedure can be followed to undertake this operation:
Place all the .csv files into an empty folder.
Copy and paste the following script into a text editor (such as Notepad++), and, once copied, give the file a name which explains its function. In this case, we have called the file mergedBHAdata.bat.
Navigate to the directory where the .bat file is saved and execute by double clicking the file. This will create a new file called mergedBHAdata.csv from all the component .csv files in the folder.
Delete the original 5 X 5km² from the folder, leaving just the newly created mergedBHAdata.csv file in the folder.
Copy and paste the following script into a text editor (such as Notepad++), and, once copied, give the file a name which explains its function. In this example, we have called the file Append_BHA_Header.bat.
Place the downloaded header file for Building Height Attribute data into the same folder as the merged Building Height Attribute data file, along with the batch file that was created in Step 4.
Run the batch file. A new merged .csv file containing the header will be created called BHA_Data.csv.
For smaller amounts of data, the simplest option is to use the . Using this text editor, simply load up the Building Height Attribute .csv data and then, using the search and replace option, simply replace the 'osgb' part of the TOID name with either nothing or three leading zeroes ('000'). Remember to save the file with a new name once this procedure has been carried out.
If the format of the TOIDs in your translated OS MasterMap Topography Layer data matches that in the Building Height Attribute .csv, please skip to .
In order to use either reformatting fix, you will need to , a tool that uses pattern-matching principles to do simple data re-formatting tasks.
Download the Building Height Attribute header file from the .