sup3r.utilities.era_downloader.EraDownloader#

class EraDownloader(year, month, area, levels, file_pattern, overwrite=False, variables=None, product_type='reanalysis')[source]#

Bases: object

Class to handle ERA5 downloading, variable renaming, and file combinations.

Initialize the class.

Parameters:
  • year (int) – Year of data to download.

  • month (int) – Month of data to download.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • file_pattern (str) – Pattern for combined monthly output file. Must include year and month format keys. e.g. ‘era5_{year}_{month}_combined.nc’

  • overwrite (bool) – Whether to overwrite existing files.

  • variables (list | None) – Variables to download. If None, this defaults to just geopotential and wind components.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
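The constraints on `area` and `file_pattern` can be checked before any download is attempted. The following is a minimal sketch, not part of the class: `pattern_keys` is a hypothetical helper, the coordinates are illustrative, and zero-padding the month is an assumption about the naming convention.

```python
from string import Formatter


def pattern_keys(pattern):
    """Return the set of format keys (e.g. 'year') found in a pattern."""
    return {name for _, name, _, _ in Formatter().parse(pattern) if name}


# Bounding box in the documented order: [max_lat, min_lon, min_lat, max_lon].
# The coordinates are illustrative only.
area = [50.0, -125.0, 25.0, -65.0]
max_lat, min_lon, min_lat, max_lon = area
assert max_lat > min_lat and max_lon > min_lon

file_pattern = "era5_{year}_{month}_combined.nc"
missing = {"year", "month"} - pattern_keys(file_pattern)
assert not missing, f"file_pattern is missing format keys: {missing}"

# Zero-padding the month is an assumption, not documented behavior.
monthly_file = file_pattern.format(year=2020, month=str(1).zfill(2))
print(monthly_file)  # era5_2020_01_combined.nc
```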

Methods

add_pressure(ds)

Add pressure to dataset

all_vars_exist(year, file_pattern, variables)

Check if all yearly variable files for the requested year exist.

convert_z(ds, name)

Convert z to given height variable

download_file(variables, time_dict, area, ...)

Download either single-level or pressure-level file

download_process_combine()

Run the download routine.

get_cds_client()

Get the Copernicus Climate Data Store (CDS) API object for ERA downloads.

get_hours()

ERA5 is hourly and EDA is 3-hourly.

get_monthly_file()

Download level and surface files, process variables, and combine processed files.

get_tmp_file(file)

Get temp file for given file.

make_yearly_file(year, file_pattern, variables)

Combine yearly variable files into a single file.

make_yearly_var_file(year, ...[, chunks, ...])

Combine monthly variable files into a single yearly variable file.

prep_var_lists(variables)

Create surface and level variable lists based on requested variables.

process_and_combine()

Process variables and combine.

process_level_file()

Convert geopotential to geopotential height.

process_surface_file()

Rename variables and convert geopotential to geopotential height.

run(year, area, levels, monthly_file_pattern)

Run routine for all requested months in the requested year.

run_for_var(year, area, levels, ...[, ...])

Run routine for all requested months in the requested year for the given variable.

run_month(year, month, area, levels, ...[, ...])

Run routine for the given month and year.

run_qa(file[, res_kwargs, log_file])

Check for NaN values and log min / max / mean / stds for all variables.

Attributes

days

Get list of days for the requested month

level_file

Get name of file with variables from pressure level download

monthly_file

Name of file with all surface and level variables for a given month and year.

surface_file

Get name of file with variables from single level download

variables

Get list of requested variables

get_hours()[source]#

ERA5 is hourly and EDA is 3-hourly. Check and warn for incompatible requests.
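The hourly versus 3-hourly distinction can be illustrated with a short sketch. `hours_for` is a hypothetical function, and the mapping of product types to time steps is an assumption: only the plain reanalysis and the three ensemble product types are covered, and the time string format is illustrative.

```python
# Assumed grouping: the ensemble (EDA) products are the 3-hourly ones.
EDA_PRODUCT_TYPES = {"ensemble_mean", "ensemble_spread", "ensemble_members"}


def hours_for(product_type):
    """Return the list of request hours for a product type."""
    if product_type == "reanalysis":
        step = 1  # ERA5 reanalysis is hourly
    elif product_type in EDA_PRODUCT_TYPES:
        step = 3  # EDA is 3-hourly
    else:
        raise ValueError(f"product_type not covered by this sketch: {product_type}")
    return [f"{hour:02d}:00" for hour in range(0, 24, step)]


print(len(hours_for("reanalysis")), len(hours_for("ensemble_mean")))  # 24 8
```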

property variables#

Get list of requested variables

property days#

Get list of days for the requested month
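A list of days for a given month can be derived from the standard library. This is a sketch of the idea only. `days_in_month` is a hypothetical helper, and returning zero-padded strings is an assumption about the format the request expects.

```python
import calendar


def days_in_month(year, month):
    """Return zero-padded day strings for every day in the given month."""
    n_days = calendar.monthrange(year, month)[1]
    return [f"{day:02d}" for day in range(1, n_days + 1)]


print(days_in_month(2020, 2)[-1])  # 29 (leap year)
```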

property monthly_file#

Name of file with all surface and level variables for a given month and year.

property surface_file#

Get name of file with variables from single level download

property level_file#

Get name of file with variables from pressure level download

classmethod get_tmp_file(file)[source]#

Get the temporary file for the given file. Only the needed variables are then written from the temporary file to the given file.

prep_var_lists(variables)[source]#

Create surface and level variable lists based on requested variables.
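One way to picture the split is a lookup against two sets of variable names. The sets below are hypothetical and deliberately short, and `split_variables` is not the library's implementation, which keeps its own lists and may resolve overlaps (for example geopotential, which exists at both the surface and on pressure levels) differently.

```python
# Hypothetical lookup tables using CDS-style variable names.
SURFACE_VARS = {
    "10m_u_component_of_wind",
    "10m_v_component_of_wind",
    "2m_temperature",
    "surface_pressure",
}
LEVEL_VARS = {
    "u_component_of_wind",
    "v_component_of_wind",
    "geopotential",
    "temperature",
}


def split_variables(variables):
    """Split requested variables into (surface, level) lists, keeping order."""
    unknown = [v for v in variables if v not in SURFACE_VARS | LEVEL_VARS]
    if unknown:
        raise ValueError(f"Unrecognized variables: {unknown}")
    surface = [v for v in variables if v in SURFACE_VARS]
    level = [v for v in variables if v in LEVEL_VARS]
    return surface, level


surface, level = split_variables(
    ["10m_u_component_of_wind", "geopotential", "u_component_of_wind"]
)
print(surface, level)
```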

static get_cds_client()[source]#

Get the Copernicus Climate Data Store (CDS) API object for ERA downloads.

download_process_combine()[source]#

Run the download routine.

classmethod download_file(variables, time_dict, area, out_file, level_type, levels=None, product_type='reanalysis', overwrite=False)[source]#

Download either single-level or pressure-level file

Parameters:
  • variables (list) – List of variables to download

  • time_dict (dict) – Dictionary with year, month, day, time entries.

  • area (list) – List of bounding box coordinates. e.g. [max_lat, min_lon, min_lat, max_lon]

  • out_file (str) – Name of output file

  • level_type (str) – Either ‘single’ or ‘pressure’

  • levels (list) – List of pressure levels to download, if level_type == ‘pressure’

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

  • overwrite (bool) – Whether to overwrite existing file
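To show how these parameters might map onto a CDS request, here is a sketch that only builds the request and performs no download. `build_request` is a hypothetical helper. The dataset names and request keys follow common CDS API usage and are assumptions about what this method sends, and monthly-averaged product types, which live in separate CDS datasets, are not covered.

```python
def build_request(variables, time_dict, area, level_type, levels=None,
                  product_type="reanalysis"):
    """Return an assumed (dataset_name, request_dict) pair for a CDS retrieval."""
    if level_type not in ("single", "pressure"):
        raise ValueError("level_type must be 'single' or 'pressure'")
    dataset = f"reanalysis-era5-{level_type}-levels"
    request = {
        "product_type": product_type,
        "variable": variables,
        "area": area,  # [max_lat, min_lon, min_lat, max_lon]
        "format": "netcdf",  # assumed key; newer CDS versions may differ
        **time_dict,  # year, month, day, time entries
    }
    if level_type == "pressure":
        if not levels:
            raise ValueError("levels are required when level_type == 'pressure'")
        request["pressure_level"] = [str(level) for level in levels]
    return dataset, request


time_dict = {"year": "2020", "month": "01", "day": ["01", "02"], "time": ["00:00"]}
dataset, request = build_request(
    ["u_component_of_wind"], time_dict, [50, -125, 25, -65], "pressure",
    levels=[1000, 950],
)
print(dataset)  # reanalysis-era5-pressure-levels
```

With the `cdsapi` package installed and credentials configured, such a pair would typically be passed on as `cdsapi.Client().retrieve(dataset, request, out_file)`.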

process_surface_file()[source]#

Rename variables and convert geopotential to geopotential height.

add_pressure(ds)[source]#

Add pressure to dataset

Parameters:

ds (Dataset) – xr.Dataset() object for which to add pressure

Returns:

ds (Dataset)

convert_z(ds, name)[source]#

Convert z to given height variable

Parameters:
  • ds (Dataset) – xr.Dataset() object for new file

  • name (str) – Variable name. e.g. zg or orog, typically

Returns:

ds (Dataset) – xr.Dataset() object for new file with new height variable written.
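The conversion behind this method is the standard relation between geopotential (m² s⁻²) and geopotential height (m): divide by the standard acceleration of gravity. The sketch below works on plain lists rather than an xr.Dataset. `geopotential_to_height` is a hypothetical name, and the exact constant used by the library is an assumption.

```python
STANDARD_GRAVITY = 9.80665  # m s^-2, standard acceleration of gravity


def geopotential_to_height(geopotential):
    """Convert geopotential values (m^2 s^-2) to geopotential height (m)."""
    return [z / STANDARD_GRAVITY for z in geopotential]


heights = geopotential_to_height([0.0, 9806.65])
print(heights)
```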

process_level_file()[source]#

Convert geopotential to geopotential height.

process_and_combine()[source]#

Process variables and combine.

get_monthly_file()[source]#

Download level and surface files, process variables, and combine processed files. Includes checks for shape and variables.

classmethod all_vars_exist(year, file_pattern, variables)[source]#

Check if all yearly variable files for the requested year exist.

Parameters:
  • year (int) – Year used for data download.

  • file_pattern (str) – Pattern for variable file. Must include year and var format keys. e.g. ‘era5_{year}_{var}_combined.nc’

  • variables (list) – Variables that should have been downloaded

Returns:

bool – True if all yearly variable files for the requested year exist.
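The check amounts to formatting the pattern once per variable and testing whether each file exists. The following is a minimal sketch under that assumption. `yearly_files_exist` is a hypothetical stand-in for the method, and the variable names are illustrative.

```python
import os
import tempfile


def yearly_files_exist(year, file_pattern, variables):
    """Return True only if a yearly file exists for every variable."""
    return all(
        os.path.exists(file_pattern.format(year=year, var=var))
        for var in variables
    )


with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "era5_{year}_{var}_combined.nc")
    # Create an empty placeholder file for a single variable.
    open(pattern.format(year=2020, var="u_100m"), "w").close()
    only_u = yearly_files_exist(2020, pattern, ["u_100m"])
    u_and_v = yearly_files_exist(2020, pattern, ["u_100m", "v_100m"])

print(only_u, u_and_v)  # True False
```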

classmethod run_month(year, month, area, levels, file_pattern, overwrite=False, variables=None, product_type='reanalysis')[source]#

Run routine for the given month and year.

Parameters:
  • year (int) – Year of data to download.

  • month (int) – Month of data to download.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • file_pattern (str) – Pattern for combined monthly output file. Must include year and month format keys. e.g. ‘era5_{year}_{month}_combined.nc’

  • overwrite (bool) – Whether to overwrite existing files.

  • variables (list | None) – Variables to download. If None, this defaults to just geopotential and wind components.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
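A call to this classmethod might be assembled as follows. The keyword names come from the signature above. The values are illustrative, and whether pressure levels should be given as integers or strings is an assumption. The function `download_month` is a hypothetical wrapper that is defined but not called here, since it needs sup3r installed and CDS API credentials configured.

```python
run_month_kwargs = {
    "year": 2020,
    "month": 1,
    "area": [50.0, -125.0, 25.0, -65.0],  # [max_lat, min_lon, min_lat, max_lon]
    "levels": [1000, 950, 900],  # illustrative pressure levels
    "file_pattern": "./era5_{year}_{month}_combined.nc",
    "overwrite": False,
    "variables": None,  # None falls back to geopotential and wind components
    "product_type": "reanalysis",
}


def download_month(kwargs):
    """Run the monthly download; requires sup3r and CDS credentials."""
    from sup3r.utilities.era_downloader import EraDownloader

    return EraDownloader.run_month(**kwargs)
```

Calling `download_month(run_month_kwargs)` would then start the download for January 2020.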

classmethod run_for_var(year, area, levels, monthly_file_pattern, yearly_file_pattern=None, months=None, overwrite=False, max_workers=None, variable=None, product_type='reanalysis', chunks='auto', res_kwargs=None)[source]#

Run routine for all requested months in the requested year for the given variable.

Parameters:
  • year (int) – Year of data to download.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • monthly_file_pattern (str) – Pattern for monthly output files. Must include year, month, and var format keys. e.g. ‘era5_{year}_{month}_{var}.nc’

  • yearly_file_pattern (str) – Pattern for yearly output files. Must include year and var format keys. e.g. ‘era5_{year}_{var}.nc’

  • months (list | None) – List of months to download data for. If None then all months for the given year will be downloaded.

  • overwrite (bool) – Whether to overwrite existing files.

  • max_workers (int) – Max number of workers to use for downloading and processing monthly files.

  • variable (str) – Variable to download.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.

classmethod run(year, area, levels, monthly_file_pattern, yearly_file_pattern=None, months=None, overwrite=False, max_workers=None, variables=None, product_type='reanalysis', chunks='auto', combine_all_files=False, res_kwargs=None)[source]#

Run routine for all requested months in the requested year.

Parameters:
  • year (int) – Year of data to download.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • monthly_file_pattern (str) – Pattern for monthly output file. Must include year, month, and var format keys. e.g. ‘era5_{year}_{month}_{var}_combined.nc’

  • yearly_file_pattern (str) – Pattern for yearly output file. Must include year and var format keys. e.g. ‘era5_{year}_{var}_combined.nc’

  • months (list | None) – List of months to download data for. If None then all months for the given year will be downloaded.

  • overwrite (bool) – Whether to overwrite existing files.

  • max_workers (int) – Max number of workers to use for downloading and processing monthly files.

  • variables (list | None) – Variables to download. If None, this defaults to just geopotential and wind components.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.

  • combine_all_files (bool) – Whether to combine separate yearly variable files into a single yearly file with all variables included
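The interplay of `months` and `max_workers` can be pictured as a loop over months that optionally runs in a worker pool. This is a sketch only. `process_month` is a hypothetical stand-in for the per-month download and processing, and the use of a thread pool is an assumption about how the library parallelizes the work.

```python
from concurrent.futures import ThreadPoolExecutor


def process_month(year, month):
    """Hypothetical stand-in for downloading and processing one month."""
    return f"era5_{year}_{month:02d}.nc"


def run_months(year, months=None, max_workers=None):
    """Process all requested months, serially or in a worker pool."""
    # months=None means every month of the year, as documented above.
    months = list(range(1, 13)) if months is None else list(months)
    if max_workers == 1:
        return [process_month(year, m) for m in months]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the order of the input months.
        return list(pool.map(lambda m: process_month(year, m), months))


print(run_months(2020, months=[1, 2], max_workers=2))
```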

classmethod make_yearly_var_file(year, monthly_file_pattern, yearly_file_pattern, variable, chunks='auto', res_kwargs=None)[source]#

Combine monthly variable files into a single yearly variable file.

Parameters:
  • year (int) – Year used to download data

  • monthly_file_pattern (str) – File pattern for monthly variable files. Must have year, month, and var format keys. e.g. ‘./era_{year}_{month}_{var}_combined.nc’

  • yearly_file_pattern (str) – File pattern for yearly variable files. Must have year and var format keys. e.g. ‘./era_{year}_{var}_combined.nc’

  • variable (string) – Variable name for the files to be combined.

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.

  • res_kwargs (None | dict) – Keyword arguments for base resource handler, like xr.open_mfdataset. This is passed to a Loader object and then used in the base loader contained by that object.
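The two patterns relate as follows: the monthly pattern expands to one file per month, and the yearly pattern names the combined output. The sketch below only builds the file names. `monthly_files` is a hypothetical helper, the variable name is illustrative, and zero-padded months are an assumption.

```python
def monthly_files(year, monthly_file_pattern, variable):
    """Return the twelve monthly file names for one variable."""
    return [
        monthly_file_pattern.format(year=year, month=str(m).zfill(2), var=variable)
        for m in range(1, 13)
    ]


monthly_pattern = "./era_{year}_{month}_{var}_combined.nc"
yearly_pattern = "./era_{year}_{var}_combined.nc"

inputs = monthly_files(2020, monthly_pattern, "temperature")
output = yearly_pattern.format(year=2020, var="temperature")
print(inputs[0], "->", output)
```

A list like `inputs` is the kind of argument `xr.open_mfdataset` accepts, so the combined dataset could then be written to `output`; how the library handles chunking and `res_kwargs` along the way is not shown here.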

classmethod make_yearly_file(year, file_pattern, variables, chunks='auto', res_kwargs=None)[source]#

Combine yearly variable files into a single file.

Parameters:
  • year (int) – Year for the data to make into a yearly file.

  • file_pattern (str) – File pattern for output files. Must have year and var format keys. e.g. ‘./era_{year}_{var}_combined.nc’

  • variables (list) – List of variables corresponding to the yearly variable files to combine.

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.

  • res_kwargs (None | dict) – Keyword arguments for base resource handler, like xr.open_mfdataset. This is passed to a Loader object and then used in the base loader contained by that object.

classmethod run_qa(file, res_kwargs=None, log_file=None)[source]#

Check for NaN values and log min / max / mean / stds for all variables.
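The kind of summary described here can be sketched for a single variable using only the standard library. `qa_summary` is a hypothetical function operating on a flat list of floats; the real method works on a file, and its log format and choice of standard deviation estimator are not shown in this documentation.

```python
import math
import statistics


def qa_summary(values):
    """Count NaNs and compute min / max / mean / std of the remaining values."""
    finite = [v for v in values if not math.isnan(v)]
    summary = {"nan_count": len(values) - len(finite)}
    if finite:
        summary.update(
            min=min(finite),
            max=max(finite),
            mean=statistics.fmean(finite),
            std=statistics.pstdev(finite),  # population standard deviation
        )
    return summary


summary = qa_summary([1.0, 2.0, 3.0, float("nan")])
print(summary)
```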