sup3r.utilities.era_downloader.EraDownloader#
- class EraDownloader(year, month, area, levels, file_pattern, overwrite=False, variables=None, product_type='reanalysis')[source]#
Bases: object
Class to handle ERA5 downloading, variable renaming, and file combinations.
Initialize the class.
- Parameters:
year (int) – Year of data to download.
month (int) – Month of data to download.
area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]
levels (list) – List of pressure levels to download.
file_pattern (str) – Pattern for combined monthly output file. Must include year and month format keys. e.g. ‘era5_{year}_{month}_combined.nc’
overwrite (bool) – Whether to overwrite existing files.
variables (list | None) – Variables to download. If None this defaults to just geopotential and wind components.
product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
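The `file_pattern` argument is a plain format string. The snippet below is a minimal sketch of how such a pattern could expand for a single month, assuming ordinary `str.format` substitution. The bounding box, the pressure levels, and the zero-padded month are illustrative assumptions, not values taken from the library.

```python
# Illustrative inputs only; the bounding box and levels are assumptions.
area = [41.0, -106.0, 39.0, -104.0]  # [max_lat, min_lon, min_lat, max_lon]
levels = [1000, 900, 800]  # pressure levels (assumed to be in hPa)
file_pattern = "era5_{year}_{month}_combined.nc"

year, month = 2020, 1
# Zero-padding the month is an assumption made here so file names sort well.
monthly_file = file_pattern.format(year=year, month=str(month).zfill(2))
print(monthly_file)
```

A pattern that omits either the `{year}` or the `{month}` key would map several months to the same file name, which is why both keys are required.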
Methods
add_pressure(ds) – Add pressure to dataset
all_vars_exist(year, file_pattern, variables) – Check if all yearly variable files for the requested year exist.
convert_z(ds, name) – Convert z to given height variable
download_file(variables, time_dict, area, ...) – Download either single-level or pressure-level file
Run the download routine.
get_cds_client() – Get the Copernicus Climate Data Store (CDS) API object for ERA downloads.
ERA5 is hourly and EDA is 3-hourly.
get_monthly_file() – Download level and surface files, process variables, and combine processed files.
get_tmp_file(file) – Get temp file for given file.
make_yearly_file(year, file_pattern, variables) – Combine yearly variable files into a single file.
make_yearly_var_file(year, ...[, chunks, ...]) – Combine monthly variable files into a single yearly variable file.
prep_var_lists(variables) – Create surface and level variable lists based on requested variables.
Process variables and combine.
Convert geopotential to geopotential height.
Rename variables and convert geopotential to geopotential height.
run(year, area, levels, monthly_file_pattern) – Run routine for all requested months in the requested year.
run_for_var(year, area, levels, ...[, ...]) – Run routine for all requested months in the requested year for the given variable.
run_month(year, month, area, levels, ...[, ...]) – Run routine for the given month and year.
run_qa(file[, res_kwargs, log_file]) – Check for NaN values and log min / max / mean / stds for all variables.
Attributes
days – Get list of days for the requested month
level_file – Get name of file with variables from pressure level download
monthly_file – Name of file with all surface and level variables for a given month and year.
surface_file – Get name of file with variables from single level download
variables – Get list of requested variables
- property variables#
Get list of requested variables
- property days#
Get list of days for the requested month
- property monthly_file#
Name of file with all surface and level variables for a given month and year.
- property surface_file#
Get name of file with variables from single level download
- property level_file#
Get name of file with variables from pressure level download
- classmethod get_tmp_file(file)[source]#
Get temp file for given file. Then only needed variables will be written to the given file.
- prep_var_lists(variables)[source]#
Create surface and level variable lists based on requested variables.
- static get_cds_client()[source]#
Get the Copernicus Climate Data Store (CDS) API object for ERA downloads.
- classmethod download_file(variables, time_dict, area, out_file, level_type, levels=None, product_type='reanalysis', overwrite=False)[source]#
Download either single-level or pressure-level file
- Parameters:
variables (list) – List of variables to download
time_dict (dict) – Dictionary with year, month, day, time entries.
area (list) – List of bounding box coordinates. e.g. [max_lat, min_lon, min_lat, max_lon]
out_file (str) – Name of output file
level_type (str) – Either ‘single’ or ‘pressure’
levels (list) – List of pressure levels to download, if level_type == ‘pressure’
product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
overwrite (bool) – Whether to overwrite existing file
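As a rough illustration of the `time_dict` argument, the sketch below builds a dictionary with year, month, day, and time entries for one month. The helper name `build_time_dict` is hypothetical, and the string formats follow common CDS API request conventions; both are assumptions rather than details confirmed by this page. The `hour_step` argument reflects the note that ERA5 is hourly while EDA is 3-hourly.

```python
import calendar


def build_time_dict(year, month, hour_step=1):
    """Build a dictionary with year, month, day, and time entries.

    Hypothetical helper: the string formats are an assumption modeled on
    typical CDS API requests, not taken from the EraDownloader source.
    """
    n_days = calendar.monthrange(year, month)[1]  # number of days in month
    return {
        "year": str(year),
        "month": f"{month:02d}",
        "day": [f"{d:02d}" for d in range(1, n_days + 1)],
        "time": [f"{h:02d}:00" for h in range(0, 24, hour_step)],
    }


hourly = build_time_dict(2020, 2)  # hourly, as for ERA5 reanalysis
three_hourly = build_time_dict(2020, 2, hour_step=3)  # 3-hourly, as for EDA
```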
- add_pressure(ds)[source]#
Add pressure to dataset
- Parameters:
ds (Dataset) – xr.Dataset() object for which to add pressure
- Returns:
ds (Dataset)
- convert_z(ds, name)[source]#
Convert z to given height variable
- Parameters:
ds (Dataset) – xr.Dataset() object for new file
name (str) – Variable name. e.g. zg or orog, typically
- Returns:
ds (Dataset) – xr.Dataset() object for new file with new height variable written.
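The usual conversion from geopotential to geopotential height divides by the standard acceleration of gravity. The sketch below shows that arithmetic on a plain float. It assumes `convert_z` applies the same constant, which this page does not state, and the real method operates on an `xr.Dataset` rather than a scalar.

```python
# Standard acceleration of gravity in m s^-2. Whether convert_z uses exactly
# this constant is an assumption.
G0 = 9.80665


def geopotential_to_height(z):
    """Convert geopotential (m^2 s^-2) to geopotential height (m)."""
    return z / G0


height = geopotential_to_height(9806.65)  # approximately 1000 m
```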
- get_monthly_file()[source]#
Download level and surface files, process variables, and combine processed files. Includes checks for shape and variables.
- classmethod all_vars_exist(year, file_pattern, variables)[source]#
Check if all yearly variable files for the requested year exist.
- Parameters:
year (int) – Year used for data download.
file_pattern (str) – Pattern for variable file. Must include year and var format keys. e.g. ‘era5_{year}_{var}_combined.nc’
variables (list) – Variables that should have been downloaded
- Returns:
bool – True if all yearly variable files for the requested year exist.
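The check described here can be sketched with the standard library alone. The function below is a hypothetical stand-in, not the library's implementation: it expands the pattern once per variable and reports whether every resulting file is present.

```python
import os
import tempfile


def yearly_var_files_exist(year, file_pattern, variables):
    """Return True only if every yearly variable file is present on disk."""
    return all(
        os.path.exists(file_pattern.format(year=year, var=var))
        for var in variables
    )


with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "era5_{year}_{var}_combined.nc")
    # Create an empty placeholder for one variable only.
    open(pattern.format(year=2020, var="zg"), "w").close()
    only_zg = yearly_var_files_exist(2020, pattern, ["zg"])
    zg_and_orog = yearly_var_files_exist(2020, pattern, ["zg", "orog"])
```

Here `only_zg` is true because the single requested file exists, while `zg_and_orog` is false because no file was created for `orog`.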
- classmethod run_month(year, month, area, levels, file_pattern, overwrite=False, variables=None, product_type='reanalysis')[source]#
Run routine for the given month and year.
- Parameters:
year (int) – Year of data to download.
month (int) – Month of data to download.
area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]
levels (list) – List of pressure levels to download.
file_pattern (str) – Pattern for combined monthly output file. Must include year and month format keys. e.g. ‘era5_{year}_{month}_combined.nc’
overwrite (bool) – Whether to overwrite existing files.
variables (list | None) – Variables to download. If None this defaults to just geopotential and wind components.
product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
- classmethod run_for_var(year, area, levels, monthly_file_pattern, yearly_file_pattern=None, months=None, overwrite=False, max_workers=None, variable=None, product_type='reanalysis', chunks='auto', res_kwargs=None)[source]#
Run routine for all requested months in the requested year for the given variable.
- Parameters:
year (int) – Year of data to download.
area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]
levels (list) – List of pressure levels to download.
monthly_file_pattern (str) – Pattern for monthly output files. Must include year, month, and var format keys. e.g. ‘era5_{year}_{month}_{var}.nc’
yearly_file_pattern (str) – Pattern for yearly output files. Must include year and var format keys. e.g. ‘era5_{year}_{var}.nc’
months (list | None) – List of months to download data for. If None then all months for the given year will be downloaded.
overwrite (bool) – Whether to overwrite existing files.
max_workers (int) – Max number of workers to use for downloading and processing monthly files.
variable (str) – Variable to download.
product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.
- classmethod run(year, area, levels, monthly_file_pattern, yearly_file_pattern=None, months=None, overwrite=False, max_workers=None, variables=None, product_type='reanalysis', chunks='auto', combine_all_files=False, res_kwargs=None)[source]#
Run routine for all requested months in the requested year.
- Parameters:
year (int) – Year of data to download.
area (list) – Domain area of the data to download. [max_lat, min_lon, min_lat, max_lon]
levels (list) – List of pressure levels to download.
monthly_file_pattern (str) – Pattern for monthly output file. Must include year, month, and var format keys. e.g. ‘era5_{year}_{month}_{var}_combined.nc’
yearly_file_pattern (str) – Pattern for yearly output file. Must include year and var format keys. e.g. ‘era5_{year}_{var}_combined.nc’
months (list | None) – List of months to download data for. If None then all months for the given year will be downloaded.
overwrite (bool) – Whether to overwrite existing files.
max_workers (int) – Max number of workers to use for downloading and processing monthly files.
variables (list | None) – Variables to download. If None this defaults to just geopotential and wind components.
product_type (str) – Can be ‘reanalysis’, ‘ensemble_mean’, ‘ensemble_spread’, ‘ensemble_members’, ‘monthly_averaged_reanalysis’, ‘monthly_averaged_ensemble_members’
chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’
combine_all_files (bool) – Whether to combine separate yearly variable files into a single yearly file with all variables included
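A call to `run` might look like the sketch below. The keyword names come from the signature above, while the values (bounding box, levels, months, worker count) are illustrative assumptions. The download itself is wrapped in a function that is not called here, since it needs `sup3r` installed and CDS API credentials configured.

```python
# Keyword names follow the documented signature; all values are illustrative.
run_kwargs = {
    "year": 2020,
    "area": [41.0, -106.0, 39.0, -104.0],  # [max_lat, min_lon, min_lat, max_lon]
    "levels": [1000, 900, 800],  # pressure levels (assumed to be in hPa)
    "monthly_file_pattern": "era5_{year}_{month}_{var}.nc",
    "yearly_file_pattern": "era5_{year}_{var}.nc",
    "months": [1, 2, 3],
    "overwrite": False,
    "max_workers": 1,
    "variables": None,  # documented default: geopotential and wind components
    "product_type": "reanalysis",
    "chunks": "auto",
    "combine_all_files": True,
}


def download_year():
    """Run the download for the months listed in run_kwargs."""
    # Imported lazily so the configuration above can be inspected without
    # sup3r being installed.
    from sup3r.utilities.era_downloader import EraDownloader

    EraDownloader.run(**run_kwargs)
```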
- classmethod make_yearly_var_file(year, monthly_file_pattern, yearly_file_pattern, variable, chunks='auto', res_kwargs=None)[source]#
Combine monthly variable files into a single yearly variable file.
- Parameters:
year (int) – Year used to download data
monthly_file_pattern (str) – File pattern for monthly variable files. Must have year, month, and var format keys. e.g. ‘./era_{year}_{month}_{var}_combined.nc’
yearly_file_pattern (str) – File pattern for yearly variable files. Must have year and var format keys. e.g. ‘./era_{year}_{var}_combined.nc’
variable (str) – Variable name for the files to be combined.
chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.
res_kwargs (None | dict) – Keyword arguments for base resource handler, like xr.open_mfdataset. This is passed to a Loader object and then used in the base loader contained by that object.
- classmethod make_yearly_file(year, file_pattern, variables, chunks='auto', res_kwargs=None)[source]#
Combine yearly variable files into a single file.
- Parameters:
year (int) – Year for the data to make into a yearly file.
file_pattern (str) – File pattern for output files. Must have year and var format keys. e.g. ‘./era_{year}_{var}_combined.nc’
variables (list) – List of variables corresponding to the yearly variable files to combine.
chunks (str | dict) – Dictionary of chunksizes used when writing data to netcdf files. Can also be ‘auto’.
res_kwargs (None | dict) – Keyword arguments for base resource handler, like xr.open_mfdataset. This is passed to a Loader object and then used in the base loader contained by that object.