sup3r.utilities.era_downloader.EraDownloader

class EraDownloader(year, month, days, area, levels, file_pattern, overwrite=False, variables=None, product_type='reanalysis')

Bases: object

Class to handle ERA5 downloading, variable renaming, and file combinations.

Examples

Download u / v at 10m, 100m, and the specified pressure levels for CONUS. Include orog (surface elevation) and zg (geopotential height), which are required for converting from pressure levels to heights. This downloads the data for all months, combines it into a yearly file, and standardizes the data for use in the sup3r package.

>>> area = [53, -132, 20, -60] # [max_lat, min_lon, min_lat, max_lon]
>>> pressure_levels = [700, 800, 900, 925, 950, 975, 1000] # in hPa
>>> monthly_fpattern = "./{year}/{month}/{year}_{month}_{var}.nc"
>>> yearly_fpattern = "./{year}/{year}_{var}.nc"
>>> variables = ['u', 'v', 'orog', 'zg']
>>> EraDownloader.run(
...     year=2021,
...     area=area,
...     months=list(range(1, 13)),
...     levels=pressure_levels,
...     monthly_file_pattern=monthly_fpattern,
...     yearly_file_pattern=yearly_fpattern,
...     variables=variables,
...     max_workers=3,
...     product_type='reanalysis'
... )

Initialize the class.

Parameters:
  • year (int) – Year of data to download.

  • month (int) – Month of data to download.

  • days (list) – List of days to download data for the given month.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • file_pattern (str) – Pattern for combined monthly output file. Must include year and month format keys. e.g. ‘era5_{year}_{month}_combined.nc’

  • overwrite (bool) – Whether to overwrite existing files.

  • variables (list | None) – Variables to download. If None, this defaults to geopotential and wind components only.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
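The requirement that file_pattern include the year and month format keys can be checked before any download starts. The following is a minimal sketch using only the standard library; `missing_keys` is a hypothetical helper and not part of sup3r, and whether sup3r zero-pads the month when filling the pattern is not stated here.

```python
from string import Formatter


def missing_keys(pattern, required):
    """Return the required format keys that do not appear in pattern."""
    # Formatter().parse yields (literal, field_name, format_spec, conversion).
    found = {name for _, name, _, _ in Formatter().parse(pattern) if name}
    return sorted(set(required) - found)


pattern = "era5_{year}_{month}_combined.nc"
print(missing_keys(pattern, ["year", "month"]))           # []
print(missing_keys("era5_{year}.nc", ["year", "month"]))  # ['month']
print(pattern.format(year=2021, month=3))                 # era5_2021_3_combined.nc
```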

Methods

add_pressure(ds)

Add pressure to dataset

all_vars_exist(year, file_pattern, variables)

Check if all yearly variable files for the requested year exist.

convert_z(ds, name)

Convert z to given height variable

download_file(variables, time_dict, area, ...)

Download either single-level or pressure-level file

download_process_combine()

Run the download routine.

get_cds_client()

Get the copernicus climate data store (CDS) API object for ERA downloads.

get_hours()

ERA5 is hourly and EDA is 3-hourly.

get_monthly_file()

Download level and surface files, process variables, and combine processed files.

make_yearly_file(year, file_pattern, variables)

Combine yearly variable files into a single file.

make_yearly_var_file(year, ...[, chunks, ...])

Combine monthly variable files into a single yearly variable file.

prep_var_lists(variables)

Create surface and level variable lists based on requested variables.

process_and_combine()

Process variables and combine surface and level files into a single monthly file.

process_level_file()

Convert geopotential to geopotential height.

process_surface_file()

Rename variables and convert geopotential to geopotential height.

run(year, area, levels, monthly_file_pattern)

Run routine for all requested months in the requested year.

run_for_var(year, area, levels, ...[, ...])

Run routine for all requested months in the requested year for the given variable.

run_month(year, month, days, area, levels, ...)

Run routine for the given month and year.

run_qa(file[, res_kwargs, log_file])

Check for NaN values and log min / max / mean / stds for all variables.

Attributes

level_file

Get name of file with variables from pressure level download

monthly_file

Name of file with all surface and level variables for a given month and year.

surface_file

Get name of file with variables from single level download

variables

Get list of requested variables

get_hours()

ERA5 is hourly and EDA is 3-hourly. Check and warn for incompatible requests.
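To illustrate the hourly versus 3-hourly distinction, here is a hedged sketch of how a list of request hours might be built. `hours_for` is a hypothetical helper, and treating every product type whose name contains "ensemble" as 3-hourly is an assumption; the sketch also ignores the monthly-averaged products, which are requested differently.

```python
def hours_for(product_type):
    """Return request hours as 'HH:00' strings for a product type."""
    # Assumption: ensemble (EDA) products are 3-hourly, everything else hourly.
    step = 3 if "ensemble" in product_type else 1
    return [f"{hour:02d}:00" for hour in range(0, 24, step)]


print(len(hours_for("reanalysis")))   # 24
print(hours_for("ensemble_mean"))
# ['00:00', '03:00', '06:00', '09:00', '12:00', '15:00', '18:00', '21:00']
```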

property variables

Get list of requested variables

property monthly_file

Name of file with all surface and level variables for a given month and year.

property surface_file

Get name of file with variables from single level download

property level_file

Get name of file with variables from pressure level download

prep_var_lists(variables)

Create surface and level variable lists based on requested variables.

static get_cds_client()

Get the copernicus climate data store (CDS) API object for ERA downloads.

download_process_combine()

Run the download routine.

classmethod download_file(variables, time_dict, area, out_file, level_type, levels=None, product_type='reanalysis', overwrite=False)

Download either single-level or pressure-level file

Parameters:
  • variables (list) – List of variables to download

  • time_dict (dict) – Dictionary with year, month, day, time entries.

  • area (list) – List of bounding box coordinates. e.g. [max_lat, min_lon, min_lat, max_lon]

  • out_file (str) – Name of output file

  • level_type (str) – Either ‘single’ or ‘pressure’

  • levels (list) – List of pressure levels to download, if level_type == ‘pressure’

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

  • overwrite (bool) – Whether to overwrite existing file
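The parameters above map naturally onto a CDS-style request. The sketch below only assembles a dataset name and request dictionary and never contacts the CDS. `build_request` is a hypothetical helper; the dataset names and request keys follow the commonly used CDS API layout and are assumptions about what download_file actually sends.

```python
def build_request(variables, time_dict, area, level_type, levels=None,
                  product_type="reanalysis"):
    """Assemble a dataset name and request dict without contacting the CDS."""
    if level_type not in ("single", "pressure"):
        raise ValueError(f"Unknown level_type: {level_type}")
    # Assumed dataset naming: one dataset per level type.
    dataset = f"reanalysis-era5-{level_type}-levels"
    request = {
        "product_type": product_type,
        "variable": list(variables),
        "area": list(area),  # [max_lat, min_lon, min_lat, max_lon]
        "format": "netcdf",
        **time_dict,         # year, month, day, time entries
    }
    if level_type == "pressure":
        if not levels:
            raise ValueError("levels are required for pressure-level requests")
        request["pressure_level"] = [str(level) for level in levels]
    return dataset, request


time_dict = {"year": "2021", "month": "01", "day": ["01", "02"],
             "time": ["00:00", "12:00"]}
dataset, request = build_request(
    ["u_component_of_wind"], time_dict, [53, -132, 20, -60],
    level_type="pressure", levels=[700, 800],
)
print(dataset)                    # reanalysis-era5-pressure-levels
print(request["pressure_level"])  # ['700', '800']
```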

process_surface_file()

Rename variables and convert geopotential to geopotential height.

add_pressure(ds)

Add pressure to dataset

Parameters:

ds (Dataset) – xr.Dataset() object for which to add pressure

Returns:

ds (Dataset)

convert_z(ds, name)

Convert z to given height variable

Parameters:
  • ds (Dataset) – xr.Dataset() object for new file

  • name (str) – Variable name. e.g. zg or orog, typically

Returns:

ds (Dataset) – xr.Dataset() object for new file with new height variable written.
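As an illustration of the underlying arithmetic, geopotential (in m^2/s^2) is conventionally converted to geopotential height (in m) by dividing by standard gravity. The sketch below works on plain Python floats rather than an xr.Dataset; `geopotential_to_height` is a hypothetical helper, and the exact gravity constant used by sup3r is an assumption.

```python
G0 = 9.80665  # standard gravity in m/s^2 (assumed value)


def geopotential_to_height(z_values):
    """Convert geopotential values (m^2/s^2) to geopotential height (m)."""
    return [z / G0 for z in z_values]


heights = geopotential_to_height([0.0, 9806.65, 14709.975])
print([round(h, 1) for h in heights])  # [0.0, 1000.0, 1500.0]
```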

process_level_file()

Convert geopotential to geopotential height.

process_and_combine()

Process variables and combine surface and level files into a single monthly file. Processing includes renaming variables and converting to standard units.

get_monthly_file()

Download level and surface files, process variables, and combine processed files. Includes checks for shape and variables.

classmethod all_vars_exist(year, file_pattern, variables)

Check if all yearly variable files for the requested year exist.

Parameters:
  • year (int) – Year used for data download.

  • file_pattern (str) – Pattern for variable file. Must include year and var format keys. e.g. ‘era5_{year}_{var}_combined.nc’

  • variables (list) – Variables that should have been downloaded

Returns:

bool – True if all yearly variable files for the requested year exist.
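The check can be pictured as filling the pattern once per variable and testing each resulting path. This is a minimal sketch under that assumption; `yearly_files_exist` is a hypothetical stand-in, not the sup3r implementation, and the files created here are empty placeholders in a temporary directory.

```python
import os
import tempfile


def yearly_files_exist(year, file_pattern, variables):
    """Return True only if every yearly variable file is present on disk."""
    return all(
        os.path.exists(file_pattern.format(year=year, var=var))
        for var in variables
    )


with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "era5_{year}_{var}_combined.nc")
    # Create an empty placeholder for 'u' only.
    open(pattern.format(year=2021, var="u"), "w").close()
    only_u = yearly_files_exist(2021, pattern, ["u"])
    u_and_v = yearly_files_exist(2021, pattern, ["u", "v"])

print(only_u, u_and_v)  # True False
```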

classmethod run_month(year, month, days, area, levels, file_pattern, overwrite=False, variables=None, product_type='reanalysis')

Run routine for the given month and year.

Parameters:
  • year (int) – Year of data to download.

  • month (int) – Month of data to download.

  • days (list) – List of days to download data for the given month.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • file_pattern (str) – Pattern for combined monthly output file. Must include year and month format keys. e.g. ‘era5_{year}_{month}_combined.nc’

  • overwrite (bool) – Whether to overwrite existing files.

  • variables (list | None) – Variables to download. If None, this defaults to geopotential and wind components only.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

classmethod run_for_var(year, area, levels, monthly_file_pattern, yearly_file_pattern=None, months=None, days=None, overwrite=False, max_workers=None, variable=None, product_type='reanalysis', chunks='auto', res_kwargs=None)

Run routine for all requested months in the requested year for the given variable.

Parameters:
  • year (int) – Year of data to download.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • monthly_file_pattern (str) – Pattern for monthly output files. Must include year, month, and var format keys. e.g. ‘era5_{year}_{month}_{var}.nc’

  • yearly_file_pattern (str) – Pattern for yearly output files. Must include year and var format keys. e.g. ‘era5_{year}_{var}.nc’

  • months (list | None) – List of months to download data for. If None then all months for the given year will be downloaded.

  • days (list | None) – List of days to download data for. If None then all days for the given months will be downloaded. This should be a list of lists with an entry for each month. e.g. [[1, 2], [1, 2, 3]]

  • overwrite (bool) – Whether to overwrite existing files.

  • max_workers (int) – Max number of workers to use for downloading and processing monthly files.

  • variable (str) – Variable to download.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.
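The months and days arguments are paired positionally: each entry in days belongs to the month at the same index. The sketch below shows one way that pairing and the "all days" default could be resolved with the standard library. `days_by_month` is a hypothetical helper, and the defaulting logic is an assumption about the behavior described above.

```python
import calendar


def days_by_month(year, months=None, days=None):
    """Pair each requested month with its list of days."""
    months = list(range(1, 13)) if months is None else list(months)
    if days is None:
        # monthrange returns (weekday of first day, number of days in month).
        days = [
            list(range(1, calendar.monthrange(year, month)[1] + 1))
            for month in months
        ]
    if len(days) != len(months):
        raise ValueError("days must have one entry per month")
    return dict(zip(months, days))


print(days_by_month(2021, months=[1, 2], days=[[1, 2], [1, 2, 3]]))
# {1: [1, 2], 2: [1, 2, 3]}
print(len(days_by_month(2021)[2]))  # 28
```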

classmethod run(year, area, levels, monthly_file_pattern, yearly_file_pattern=None, months=None, days=None, overwrite=False, max_workers=None, variables=None, product_type='reanalysis', chunks='auto', combine_all_files=False, res_kwargs=None)

Run routine for all requested months in the requested year.

Parameters:
  • year (int) – Year of data to download.

  • area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]

  • levels (list) – List of pressure levels to download.

  • monthly_file_pattern (str) – Pattern for monthly output file. Must include year, month, and var format keys. e.g. ‘era5_{year}_{month}_{var}_combined.nc’

  • yearly_file_pattern (str) – Pattern for yearly output file. Must include year and var format keys. e.g. ‘era5_{year}_{var}_combined.nc’

  • months (list | None) – List of months to download data for. If None then all months for the given year will be downloaded.

  • days (list | None) – List of days to download data for. If None then all days for the given months will be downloaded. This should be a list of lists with an entry for each month. e.g. [[1, 2], [1, 2, 3]]

  • overwrite (bool) – Whether to overwrite existing files.

  • max_workers (int) – Max number of workers to use for downloading and processing monthly files.

  • variables (list | None) – Variables to download. If None, this defaults to geopotential and wind components only.

  • product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’

  • combine_all_files (bool) – Whether to combine separate yearly variable files into a single yearly file with all variables included

classmethod make_yearly_var_file(year, monthly_file_pattern, yearly_file_pattern, variable, chunks='auto', res_kwargs=None)

Combine monthly variable files into a single yearly variable file.

Parameters:
  • year (int) – Year used to download data

  • monthly_file_pattern (str) – File pattern for monthly variable files. Must have year, month, and var format keys. e.g. ‘./era_{year}_{month}_{var}_combined.nc’

  • yearly_file_pattern (str) – File pattern for yearly variable files. Must have year and var format keys. e.g. ‘./era_{year}_{var}_combined.nc’

  • variable (string) – Variable name for the files to be combined.

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.

  • res_kwargs (None | dict) – Keyword arguments for base resource handler, like xr.open_mfdataset. This is passed to a Loader object and then used in the base loader contained by that object.
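To make the relationship between the two patterns concrete, the sketch below expands a monthly pattern into the twelve input paths for one variable and a yearly pattern into the single output path. `monthly_files` is a hypothetical helper, and the two-digit zero padding of the month is an assumption, not something this page specifies.

```python
def monthly_files(year, monthly_file_pattern, variable, months=range(1, 13)):
    """List the monthly input paths for one variable and year."""
    return [
        # Assumption: months are zero-padded to two digits.
        monthly_file_pattern.format(year=year, month=str(month).zfill(2),
                                    var=variable)
        for month in months
    ]


inputs = monthly_files(2021, "./era_{year}_{month}_{var}_combined.nc", "u")
output = "./era_{year}_{var}_combined.nc".format(year=2021, var="u")
print(inputs[0])    # ./era_2021_01_u_combined.nc
print(inputs[-1])   # ./era_2021_12_u_combined.nc
print(output)       # ./era_2021_u_combined.nc
```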

classmethod make_yearly_file(year, file_pattern, variables, chunks='auto', res_kwargs=None)

Combine yearly variable files into a single file.

Parameters:
  • year (int) – Year for the data to make into a yearly file.

  • file_pattern (str) – File pattern for output files. Must have year and var format keys. e.g. ‘./era_{year}_{var}_combined.nc’

  • variables (list) – List of variables corresponding to the yearly variable files to combine.

  • chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.

  • res_kwargs (None | dict) – Keyword arguments for base resource handler, like xr.open_mfdataset. This is passed to a Loader object and then used in the base loader contained by that object.

classmethod run_qa(file, res_kwargs=None, log_file=None)

Check for NaN values and log min / max / mean / stds for all variables.
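The kind of summary described here can be sketched for a single variable held as a plain list of floats. `qa_summary` is a hypothetical helper that stands in for the per-variable statistics; the real method reads from a file, and how it treats NaN values when computing statistics is an assumption. The sketch expects at least one non-NaN value.

```python
import math
import statistics


def qa_summary(values):
    """Count NaNs and summarize the remaining values."""
    finite = [v for v in values if not math.isnan(v)]
    return {
        "nan_count": len(values) - len(finite),
        "min": min(finite),
        "max": max(finite),
        "mean": statistics.fmean(finite),
        "std": statistics.pstdev(finite),  # population standard deviation
    }


summary = qa_summary([1.0, 2.0, float("nan"), 3.0])
print(summary["nan_count"], summary["min"], summary["max"], summary["mean"])
# 1 1.0 3.0 2.0
```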