Automated open data download
In this tutorial we'll collect Ordnance Survey data from the Downloads API with an automated download and extract process.
Tools and APIs
We will walk through how to fetch the OS Terrain 50 Digital Elevation Model (DEM) dataset from the OS Downloads API using NodeJS and the command line, then unzip it and extract all of the .asc files into a single folder.
We will be using:
NodeJS and npm, with the axios and extract-zip npm packages, plus Node's standard fs and path modules.
Command line
Tutorial
Working with large datasets can be a challenge. With a few programming tools, however, we can radically improve the efficiency of collecting and manipulating these datasets.
We'll be using NodeJS to download and extract Ordnance Survey's OS Terrain 50 Digital Elevation Model dataset from the OS Downloads API, in three steps:
1. Download the zipped directory containing the data.
2. Unzip all the zipped directories it contains.
3. Copy all .asc files into one folder, ready to be loaded into QGIS for our Shaded Relief Map tutorial.
This tutorial was created using NodeJS v14.1.0. If you don't already have Node installed, here is a great tutorial on installing Node and npm on Windows and Mac.
This tutorial will walk you through creating this code from scratch. If you would rather download, install and run it all, there are instructions for that at the bottom of this page.
Download & Extract
First, we'll download the OS Terrain 50 dataset from the OS Downloads API. In your terminal, make a directory to store your project (this tutorial calls it os-downloads-tutorial) and move into it. Then run npm install axios extract-zip to install axios, which is used for HTTP requests, and extract-zip, which we'll use a bit later on for extracting zipped folders.
If you're not familiar with Node, the npm install {package names} command creates a node_modules folder in the current directory and downloads the packages there.
Download File
We'll be breaking our Node program into modules to keep it organised. For our first step, downloading the dataset, create a file called downloadFile.js in the os-downloads-tutorial directory.
We'll be using the fs module to write data to the disk as it downloads via an axios HTTP request.
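As a rough sketch of what a download_file function along these lines could look like, assuming axios's stream response type and Node's stream.pipeline helper. The optional client parameter is an addition for this example (it lets the function be tried with a stand-in instead of a live request) and is not taken from the tutorial's code:

```javascript
const fs = require('fs');
const stream = require('stream');
const { promisify } = require('util');

// Promise-returning pipeline: resolves once the data has been written to disk,
// rejects if either the download stream or the file stream errors.
const pipeline = promisify(stream.pipeline);

/**
 * Stream the resource at `url` into the local file `filename`.
 * `client` defaults to axios; it is a parameter only so that the sketch can be
 * exercised with a stand-in (an assumption of this example).
 */
async function download_file(url, filename, client = require('axios')) {
  // responseType 'stream' asks axios for a readable stream in response.data,
  // so the file is written chunk by chunk instead of being held in memory.
  const response = await client.get(url, { responseType: 'stream' });
  await pipeline(response.data, fs.createWriteStream(filename));
}
```

Streaming matters here because the dataset is large: each chunk is written to disk as it arrives rather than buffering the whole archive first.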
The download_file function fetches the resource at the url passed in and writes the returned file to the local directory as filename. We will call it from the function we'll export, adding some error handling and visual feedback so the user knows the file is downloading. The exported function will also unzip the downloaded file into the targetdir.
In the same file, below the download_file declaration, create the function that the module will export.
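The error handling and visual feedback could be sketched as below. This is one possible approach under stated assumptions, not the tutorial's own listing: withProgress is a hypothetical helper name, and the one-second dot ticker is just a simple way of showing that work is still going on.

```javascript
/**
 * Run an async `task` while printing a dot at a fixed interval, so the user
 * can see that a long-running step is still in progress.
 * `withProgress` is a hypothetical helper, not part of any library.
 */
async function withProgress(label, task, intervalMs = 1000) {
  process.stdout.write(label);
  const ticker = setInterval(() => process.stdout.write('.'), intervalMs);
  try {
    const result = await task();
    process.stdout.write(' done\n');
    return result;
  } catch (err) {
    process.stdout.write(' failed\n');
    console.error(`${label} failed: ${err.message}`);
    throw err; // rethrow so the caller can decide whether to stop the program
  } finally {
    clearInterval(ticker); // always stop the ticker, on success or failure
  }
}
```

Inside the exported function, each step could then be wrapped in turn, for example await withProgress('Downloading', () => download_file(url, filename)), followed by a similar call around the extract-zip step that unpacks filename into targetdir.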
Recursive Unzip
Before we import and call the code above, we'll also define a script that will unzip all the zipped folders in the downloaded zipfile.
Create a file called unzipAll.js. The function it exports accepts one parameter: a path to a directory (zipped or not). It pulls the paths of all contained zipfiles, extracts their contents, and deletes the zipfiles. Since zipped directories can contain zipped sub-directories and files, we need to execute this recursively until all files ending in .zip are extracted and deleted.
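A minimal sketch of that recursion, under a few assumptions: the path passed in is an already-unzipped directory, each archive is unpacked into a sibling folder named after it, and extract-zip is used through its promise-based interface (version 2 and later). The extract parameter is added here only so the sketch can be tried with a stand-in.

```javascript
const fs = require('fs');
const path = require('path');

// Compact stand-in for the getFilePaths module described in the next section,
// inlined here so the sketch runs on its own.
function getFilePaths(dir) {
  return fs.readdirSync(dir).flatMap((name) => {
    const full = path.join(dir, name);
    return fs.statSync(full).isDirectory() ? [full, ...getFilePaths(full)] : [full];
  });
}

/**
 * Extract every .zip file under `dir`, delete it, and repeat until none remain.
 * `extract` defaults to extract-zip; it is a parameter only so that the sketch
 * can be exercised with a stand-in (an assumption of this example).
 */
async function unzipAll(dir, extract = require('extract-zip')) {
  const zips = getFilePaths(dir).filter(
    (p) => p.toLowerCase().endsWith('.zip') && fs.statSync(p).isFile()
  );
  if (zips.length === 0) return; // base case: nothing left to unzip

  for (const zip of zips) {
    // Unpack next to the archive, into a folder named after it.
    // extract-zip expects an absolute target directory.
    const target = path.resolve(zip.slice(0, -'.zip'.length));
    await extract(zip, { dir: target });
    fs.unlinkSync(zip); // remove the archive once its contents are out
  }

  // The folders just extracted may themselves contain zipfiles, so go again.
  await unzipAll(dir, extract);
}

module.exports = unzipAll;
```

Each pass deletes the archives it found, so the recursion ends as soon as a pass finds no .zip files.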
getFilesPaths()
You'll notice that unzipAll requires another module, getFilePaths.js.
This function accepts a directory path string, then loops through the directory, building an array of all the paths to directories and files contained.
A note: this is another recursive function - one that calls itself. In this way it is able to move through the directory structure no matter its depth.
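One way such a function might be written, using only Node's fs and path modules. This is a sketch rather than the tutorial's original listing; the second parameter, paths, is an accumulator added for this example so that every level of the recursion appends to the same array.

```javascript
const fs = require('fs');
const path = require('path');

/**
 * Return the paths of every directory and file beneath `dir`, at any depth.
 * `paths` is an accumulator shared across recursive calls; callers normally
 * omit it.
 */
function getFilePaths(dir, paths = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    paths.push(fullPath);
    if (entry.isDirectory()) {
      // Recursive step: the function calls itself on each sub-directory
      getFilePaths(fullPath, paths);
    }
  }
  return paths;
}

module.exports = getFilePaths;
```

Calling getFilePaths('working_data') would then return a flat array of path strings that can be filtered by extension, for example with paths.filter((p) => p.endsWith('.zip')).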
Now we've downloaded the DEM data from the OS Downloads API and unzipped all the contained files. It should be ready to bring into QGIS.
app.js
Let's put this all together in a new file, /app.js. We'll import the modules we need: Node's fs and path, along with the custom modules we wrote above.
We don't want to have to click through every single folder to select the .asc files we need to load into QGIS, and QGIS won't accept a directory containing loads of subdirectories and files of different types.
Fortunately, with Node, and the code we've already written, we can easily copy all the .asc files from their locations in the directory structure into one folder. (This could be done with many other programming languages, such as Python, bash, Ruby or C++. That's the power of code.)
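The copy step could be sketched as follows. copyAscFiles is a hypothetical name, and the function assumes it is given an array of paths (such as the one a recursive directory listing produces) and that file names are unique across folders, since files with the same name would overwrite one another in the target folder.

```javascript
const fs = require('fs');
const path = require('path');

/**
 * Copy every .asc file listed in `paths` into `targetDir`, creating the folder
 * if it does not exist. Returns the number of files copied.
 * `copyAscFiles` is a hypothetical helper name.
 */
function copyAscFiles(paths, targetDir) {
  fs.mkdirSync(targetDir, { recursive: true });
  const ascFiles = paths.filter((p) => path.extname(p).toLowerCase() === '.asc');
  for (const file of ascFiles) {
    // Flatten: keep only the file name, dropping the nested folder structure
    fs.copyFileSync(file, path.join(targetDir, path.basename(file)));
  }
  return ascFiles.length;
}
```

The original files are left in place, so the nested download can be kept or deleted separately once the flat folder has been checked.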
We will place all of this code inside the body of an asynchronous immediately-invoked function expression (IIFE), which you can see in /app.js. By wrapping the function expression in parentheses, we can immediately invoke it (with or without arguments): (function (param) { /* body ...*/})(arg)
This pattern lets us use async / await syntax, which in a CommonJS script such as this one is only valid inside an async function. It makes it clean and easy to write code that pauses until long-running processes, such as downloading a large file, have finished.
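To illustrate the pattern on its own, here is a small sketch in which a timer stands in for a slow step such as a download. The names wait, log and finished are made up for this example.

```javascript
// Stand-in for a slow step such as a download (hypothetical helper)
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const log = [];

// The async arrow function is defined and called in one expression. Calling it
// returns a promise that settles when the body has finished.
const finished = (async () => {
  log.push('download started');
  await wait(20); // the body pauses here without blocking the rest of Node
  log.push('download finished');
  log.push('extract started'); // only runs once the awaited step is complete
})();

// Report anything thrown inside the body instead of leaving it unhandled
finished.catch((err) => {
  console.error(err);
  process.exitCode = 1;
});
```

Because each await completes before the next line runs, the steps execute in the order they are written, which is what the download, unzip and copy sequence needs.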
Running the program
We now have a complete Node program and modules, which we can execute by running node app.js on the command line in our os-downloads-tutorial directory.
This may take a little time, as the OS Terrain 50 dataset is 161MB. Once it has completed, there should be a new folder, working_data/asc_skye, containing the .asc files, ready to work with in QGIS. If you want to complete the tutorial and create a shaded relief map with the DEM data downloaded, find the tutorial here.
Download, install and run
If you'd rather just download and run the code we described in the tutorial, you can clone the repository, install the npm packages and run it using the following commands:
If all goes well you'll see the console printing statements as the download and extract process starts!
Let us know if you automate a process fetching, manipulating, analysing or storing OS data - we'd love to know. Tweet at @OrdnanceSurvey and tag #OSDeveloper.