Operation¶

Parameter files¶

Parameter files control GLOBSIM and follow the TOML standard (https://toml.io/en/). Each step in the procedure (download, interpolate, scale) can have its own file, or the steps can be combined in a single file. In either case, each step must appear under the appropriate heading ([download], [interpolate], or [scale]). Parameter files belong in the /par subdirectory of the project directory.

Downloading¶

Keyword	Description
`project_directory`	This is the full path to the project directory which stores the downloaded files and the control files. It should include a subdirectory called /par which contains parameter files (control files) as well as a csv describing the sites to which data are scaled.
`credentials_directory`	The location of the credential files (e.g. .merrarc and .jrarc). This does not apply to the credential file .ecmwfapi, which defaults to your home directory. It is recommended to set this parameter to your home directory.
`chunk_size`	How many days to include in each download file. Larger values mean that fewer files are downloaded, each with a larger size.
`bbN`	Coordinates for northern boundary of bounding box describing the area for which data will be downloaded. Coordinates must be in decimal degrees.
`bbS`	Coordinates for southern boundary of bounding box describing the area for which data will be downloaded. Coordinates must be in decimal degrees.
`bbW`	Coordinates for western boundary of bounding box describing the area for which data will be downloaded. Coordinates must be in decimal degrees with negative values for locations west of 0.
`bbE`	Coordinates for eastern boundary of bounding box describing the area for which data will be downloaded. Coordinates must be in decimal degrees with negative values for locations west of 0.
`ele_min`	Minimum ground elevation, in metres, for which data will be downloaded. Recommended to leave at 0.
`ele_max`	Maximum ground elevation, in metres, for which data will be downloaded. Should be at least 2500.
`beg`	First date for which data are downloaded, in `YYYY/MM/DD` format.
`end`	Last date for which data are downloaded, in `YYYY/MM/DD` format.
`variables`	Which variables should be downloaded from the server. The variable names come from the CF Standard Names table. It is recommended to leave this parameter set to all relevant variables: air_temperature, relative_humidity, precipitation_amount, downwelling_longwave_flux_in_air, downwelling_longwave_flux_in_air_assuming_clear_sky, downwelling_shortwave_flux_in_air, downwelling_shortwave_flux_in_air_assuming_clear_sky, wind_from_direction, wind_speed.
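
To illustrate how `chunk_size` relates to the number of download files, here is a rough Python sketch that splits a `beg`/`end` date range into runs of at most `chunk_size` days. It is not GLOBSIM's own implementation: the function names are hypothetical, and treating `end` as an inclusive date is an assumption.

```python
from datetime import datetime, timedelta


def parse_day(text):
    """Parse a YYYY/MM/DD string as used by `beg` and `end`."""
    return datetime.strptime(text, "%Y/%m/%d").date()


def split_into_chunks(beg, end, chunk_size):
    """Split [beg, end] into consecutive runs of at most chunk_size days.

    Assumption: `end` is treated as inclusive.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1 day")
    start, last = parse_day(beg), parse_day(end)
    if last < start:
        raise ValueError("end must not be earlier than beg")
    chunks = []
    while start <= last:
        # Each chunk covers chunk_size days, except possibly the final one.
        stop = min(start + timedelta(days=chunk_size - 1), last)
        chunks.append((start, stop))
        start = stop + timedelta(days=1)
    return chunks


# Illustrative values: five days split into chunks of two days.
for first, final in split_into_chunks("2017/07/01", "2017/07/05", 2):
    print(first, "to", final)
```

Under these assumptions, five days with a chunk size of 2 give three chunks; raising the chunk size to 5 would give a single, larger one.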

Note

To check download progress, you can use your credentials to log in to the websites for JRA and ERA5 (CDS API).

Interpolating¶

Keyword	Description
`project_directory`	This is the full path to the project directory which stores the downloaded files and the control files. It should include a subdirectory called /par which contains parameter files (control files) as well as a csv describing the sites to which data are scaled.
`output_directory`	The directory to which interpolated files will be saved. If not provided, or if the directory does not exist, globsim will write to the `project_directory` by default.
`station_list`	The filename of a csv containing site names and coordinates. If just the filename is specified without a path, globsim will look in the `par` folder within the project directory. If a full filepath is used, globsim will use that file instead. Typically the same `station_list` file will also be used in the scaling step.
`chunk_size`	How many time-steps to interpolate at once. This helps memory management. Keep small for large area files and/or computers with little memory. Make larger to get performance improvements on computers with lots of memory.
`beg`	Beginning of date range for which data will be interpolated in `YYYY/MM/DD` format. Note that this date range must include dates that are represented in the downloaded data.
`end`	End of date range for which data will be interpolated in `YYYY/MM/DD` format. Note that this date range must include dates that are represented in the downloaded data.
`variables`	Which variables should be interpolated. The variable names come from the CF Standard Names table. It is recommended to leave this parameter set to all relevant variables.
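
The requirement that the interpolation date range include dates present in the downloaded data can be checked up front. The snippet below is a small sketch with hypothetical function names; it assumes you know the `beg` and `end` values that were used for the download and that both ranges are inclusive.

```python
from datetime import datetime


def parse_day(text):
    """Parse a YYYY/MM/DD string as used by `beg` and `end`."""
    return datetime.strptime(text, "%Y/%m/%d").date()


def ranges_overlap(interp_beg, interp_end, download_beg, download_end):
    """Return True if the two inclusive date ranges share at least one day."""
    a0, a1 = parse_day(interp_beg), parse_day(interp_end)
    b0, b1 = parse_day(download_beg), parse_day(download_end)
    # Two closed intervals overlap when each starts before the other ends.
    return a0 <= b1 and b0 <= a1


# Illustrative values: the interpolation range lies inside the download range.
print(ranges_overlap("2017/07/02", "2017/07/04", "2017/07/01", "2017/07/05"))
```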

Rescaling¶

Keyword	Description
`project_directory`	This is the full path to the project directory that contains the interpolated files. By default, it contains a subdirectory called /par where site list csv files are kept.
`output_directory`	The directory to which scaled files will be saved. If not provided, or if the directory does not exist, globsim will write to the `project_directory` by default.
`station_list`	The filename (without path) of the csv containing site information, such as sitelist.csv. This must match the station list used in the interpolation parameter file.
`output_file`	Path to output netCDF to be created.
`overwrite`	Either `true` or `false`. Whether or not to overwrite the `output_file` if it exists.
`time_step`	The desired output time step in hours
`kernels`	Which processing kernels should be used. Missing or misspelled kernels will be ignored by globsim.
`scf`	(optional) snow correction factor, a positive real number used to manually scale the precipitation for all sites.
`rh_approximation`	(optional) which relative humidity approximation to use. Choose from `rh_liston` (default) or `rh_lawrence`.
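
The constraints in the table above can be expressed as a simple pre-flight check. The following Python sketch validates a `[scale]` section that has already been parsed into a dictionary. The function names are hypothetical, and two rules are assumptions rather than documented GLOBSIM behaviour: that `time_step` is always supplied, and that it must be positive.

```python
ALLOWED_RH = {"rh_liston", "rh_lawrence"}


def is_positive_number(value):
    """True for positive ints or floats (bool is excluded on purpose)."""
    return (isinstance(value, (int, float))
            and not isinstance(value, bool)
            and value > 0)


def check_scale_options(scale):
    """Return a list of problems found in a parsed [scale] section."""
    problems = []
    if "overwrite" in scale and not isinstance(scale["overwrite"], bool):
        problems.append("overwrite must be true or false")
    # Assumption: time_step is required and must be positive.
    if not is_positive_number(scale.get("time_step")):
        problems.append("time_step must be a positive number of hours")
    # scf is optional, but must be a positive real number when given.
    if "scf" in scale and not is_positive_number(scale["scf"]):
        problems.append("scf must be a positive real number")
    # rh_approximation is optional and defaults to rh_liston.
    if scale.get("rh_approximation", "rh_liston") not in ALLOWED_RH:
        problems.append("rh_approximation must be rh_liston or rh_lawrence")
    return problems


print(check_scale_options({"time_step": 1, "overwrite": True}))  # []
```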

Example Parameter File¶

Here is an example of a TOML parameter file with all three sections (download, interpolate, scale) combined into one file.

title = "Globsim Control File"

[download]
# logistics
project_directory = "/opt/globsim/examples/Example1"
credentials_directory = "/root"

# chunk size for splitting files and download [days]
chunk_size = 2

# area bounding box [decimal degrees]
bbN = 66
bbS = 62
bbW = -112
bbE = -108

# ground elevation range within area [m]
ele_min = 0
ele_max = 2500

# time slice [YYYY/MM/DD]
beg = "2017/07/01"
end = "2017/07/05"

# variables to download [CF Standard Name Table]
variables = ["air_temperature", "relative_humidity", "wind_speed", "wind_from_direction", "precipitation_amount", "downwelling_shortwave_flux_in_air", "downwelling_longwave_flux_in_air", "downwelling_shortwave_flux_in_air_assuming_clear_sky", "downwelling_longwave_flux_in_air_assuming_clear_sky"]

[interpolate]
# Path to the parent directory of /par - It should match the download and scale files
project_directory = "/opt/globsim/examples/Example1"

# Filename (without path) of csv containing site information (must match scaling control file)
station_list = "siteslist.csv"

# How many time steps to interpolate at once? This helps memory management.
# Keep small for large area files and small memory computer, make larger to get
# speed on big machines and when working with small area files.
# for a small area, we suggest values up to 2000, but consider the memory limit of your computer
chunk_size = 2000

# time slice [YYYY/MM/DD] assuming 00:00 hours
beg = "2017/07/01"
end = "2017/07/05"

# variables to interpolate [CF Standard Name Table]
variables = ["air_temperature", "relative_humidity", "wind_speed", "wind_from_direction", "precipitation_amount", "downwelling_shortwave_flux_in_air", "downwelling_longwave_flux_in_air", "downwelling_shortwave_flux_in_air_assuming_clear_sky", "downwelling_longwave_flux_in_air_assuming_clear_sky"]

[scale]
# Path to the parent directory of /par - It should match the download and interpolate files
project_directory = "/opt/globsim/examples/Example1"

# Filename (without path) of csv containing site information (must match interpolation control file)
station_list = "siteslist.csv"

# processing kernels to be used.  Unavailable kernels will be ignored
kernels = ["PRESS_Pa_pl", "AIRT_C_pl", "AIRT_C_sur", "PREC_mm_sur", "RH_per_sur", "WIND_sur", "SW_Wm2_sur", "LW_Wm2_sur", "SH_kgkg_sur"]

# desired time step for output data [hours]
time_step = 1

# Should the output file be overwritten if it exists?
overwrite = true

Station list for interpolation¶

This is an example of a Globsim station list file. The resulting netCDF file will use the station numbers as identifiers.

station_number,station_name,longitude_dd,latitude_dd,elevation_m
1,yellowknife_airport,-114.44234,62.46720,207
2,ekati_airport,-110.60804,64.70591,461

More information about the station list can be found on the station_list file page.
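
For readers who want to sanity-check a station list before running GLOBSIM, the following Python sketch reads a file with the header shown above using only the standard library. The function name is hypothetical, and requiring exactly these five columns in this order is an assumption made for the example.

```python
import csv
import io

# Column names taken from the example station list.
EXPECTED_COLUMNS = ["station_number", "station_name", "longitude_dd",
                    "latitude_dd", "elevation_m"]


def read_station_list(text):
    """Parse station list text into a list of dicts with typed values."""
    # skipinitialspace tolerates a stray space after a comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    stations = []
    for row in reader:
        stations.append({
            "station_number": int(row["station_number"]),
            "station_name": row["station_name"].strip(),
            "longitude_dd": float(row["longitude_dd"]),
            "latitude_dd": float(row["latitude_dd"]),
            "elevation_m": float(row["elevation_m"]),
        })
    return stations


example = """station_number,station_name,longitude_dd,latitude_dd,elevation_m
1,yellowknife_airport,-114.44234,62.46720,207
2,ekati_airport,-110.60804,64.70591,461
"""

for station in read_station_list(example):
    print(station["station_number"], station["station_name"])
```

Converting the coordinates to `float` means that a malformed value raises an error immediately rather than surfacing later in the workflow.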

Project directory¶

The project directory is the location to which data is downloaded and where processed data is found. The project directory is subdivided by re-analysis type and by the type of derived product:

project_a/              (project directory)
project_a/par/          (parameter files for data download and interpolation)
project_a/jra-55/       (JRA-55 data)
project_a/eraint/       (ERA-Interim data)
project_a/era5/         (ERA5 data)
project_a/merra2/       (MERRA 2 data)
project_a/station/      (data interpolated to stations)
project_a/scale/        (final scaled files)
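
As a rough illustration of this layout, the Python sketch below reports which of the listed subdirectories are absent from a project directory. The function name is hypothetical. A missing subdirectory is not necessarily an error, since a project that uses only some reanalyses, or that has not yet been interpolated or scaled, may reasonably lack the others.

```python
from pathlib import Path

# Subdirectory names taken from the layout shown above.
SUBDIRECTORIES = ["par", "jra-55", "eraint", "era5", "merra2",
                  "station", "scale"]


def missing_subdirectories(project_directory):
    """Return the listed subdirectories that do not exist under the project."""
    root = Path(project_directory)
    return [name for name in SUBDIRECTORIES if not (root / name).is_dir()]


# Illustrative path; the result depends on what exists on your machine.
print(missing_subdirectories("/opt/globsim/examples/Example1"))
```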
