reV.qa_qc.summary.SummarizeH5

class SummarizeH5(h5_file, group=None)

reV Summary data for QA/QC

Parameters:

h5_file (str) – Path to .h5 file to summarize data from
group (str, optional) – Group within h5_file to summarize datasets for, by default None

Methods

`run`(h5_file, out_dir[, group, dsets, ...])	Summarize all datasets in h5_file and dump to out_dir
`summarize_dset`(ds_name[, process_size, ...])	Compute dataset summary.
`summarize_means`([out_path])	Add means datasets to meta data

Attributes

`h5_file`	.h5 file path

property h5_file

.h5 file path

summarize_dset(ds_name, process_size=None, max_workers=None, out_path=None)

Compute dataset summary. If the dataset is 2D, compute temporal statistics for each site.

Parameters:

ds_name (str) – Dataset name of interest
process_size (int, optional) – Number of sites to process at a time, by default None
max_workers (int, optional) – Number of workers to use in parallel. If 1, run in serial; if None, use all available cores. By default None
out_path (str, optional) – File path to save summary to, by default None

Returns:

summary (pandas.DataFrame) – Summary table for dataset
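
To make "temporal statistics for each site" concrete, here is a minimal conceptual sketch using only the Python standard library. It is not reV's implementation: the helper name `summarize_2d`, the choice of statistics (mean, min, max, std) and the list-of-dicts return type are all assumptions made for illustration. It only shows the idea of reducing a (time x sites) table along the time axis, `process_size` sites at a time.

```python
from statistics import mean, pstdev


def summarize_2d(data, process_size=None):
    """Per-site temporal statistics for a (time x sites) table.

    data: list of rows, one row per time step and one column per site.
    process_size: number of site columns handled per chunk
        (None means all sites in a single chunk).
    Hypothetical helper; the statistics chosen here are assumptions.
    """
    n_sites = len(data[0])
    chunk = process_size or n_sites
    rows = []
    for start in range(0, n_sites, chunk):
        stop = min(start + chunk, n_sites)
        for site in range(start, stop):
            # Time series for a single site (one column of the table)
            series = [row[site] for row in data]
            rows.append({
                "site": site,
                "mean": mean(series),
                "min": min(series),
                "max": max(series),
                "std": pstdev(series),
            })
    return rows


# Two time steps, three sites
table = [[1.0, 10.0, 0.0],
         [3.0, 10.0, 4.0]]
stats = summarize_2d(table, process_size=2)
```

Chunking by site changes how much data is held at once, not the result: every site's statistics depend only on that site's own time series.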

summarize_means(out_path=None)

Add means datasets to meta data

Parameters:

out_path (str, optional) – Path to .csv file to save updated meta data to, by default None

Returns:

meta (pandas.DataFrame) – Meta data with means datasets added
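
The two instance methods above might be combined roughly as follows. This is a sketch based only on the signatures documented on this page. The wrapper name `summarize_dataset_and_means` and the file and dataset names in the usage comment are hypothetical. The reV import is done inside the function so the snippet can be loaded without reV installed.

```python
def summarize_dataset_and_means(h5_path, ds_name, summary_csv=None,
                                meta_csv=None):
    """Summarize one dataset, then add means datasets to the meta data.

    Hypothetical convenience wrapper around the documented methods.
    """
    # Imported lazily so this sketch can be loaded without reV installed
    from reV.qa_qc.summary import SummarizeH5

    qa = SummarizeH5(h5_path, group=None)

    # max_workers=1 runs in serial, per the parameter description above
    summary = qa.summarize_dset(ds_name, process_size=None, max_workers=1,
                                out_path=summary_csv)

    # Writes the updated meta data to meta_csv when a path is given
    meta = qa.summarize_means(out_path=meta_csv)

    return summary, meta


# Example call (hypothetical file and dataset names):
# summary, meta = summarize_dataset_and_means("gen_2012.h5", "cf_profile",
#                                             meta_csv="meta_means.csv")
```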

classmethod run(h5_file, out_dir, group=None, dsets=None, process_size=None, max_workers=None)

Summarize all datasets in h5_file and dump to out_dir

Parameters:

h5_file (str) – Path to .h5 file to summarize data from
out_dir (str) – Directory to dump summary .csv files to
group (str, optional) – Group within h5_file to summarize datasets for, by default None
dsets (str | list, optional) – Datasets to summarize, by default None
process_size (int, optional) – Number of sites to process at a time, by default None
max_workers (int, optional) – Number of workers to use when summarizing 2D datasets, by default None
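
A minimal usage sketch for the class method, assuming only the signature documented above. The wrapper name `summarize_all` and the paths and dataset names in the usage comment are hypothetical. Because this page does not say what `run` returns, the sketch ignores the return value and lists the .csv files found in `out_dir` afterwards.

```python
import os


def summarize_all(h5_path, out_dir, dsets=None, max_workers=1):
    """Summarize datasets in h5_path and list the .csv files in out_dir.

    Hypothetical wrapper; out_dir is assumed to exist already.
    """
    # Imported lazily so this sketch can be loaded without reV installed
    from reV.qa_qc.summary import SummarizeH5

    SummarizeH5.run(h5_path, out_dir, group=None, dsets=dsets,
                    process_size=None, max_workers=max_workers)

    # Collect whatever summary .csv files ended up in out_dir
    return sorted(f for f in os.listdir(out_dir) if f.endswith(".csv"))


# Example call (hypothetical paths and dataset names):
# files = summarize_all("gen_2012.h5", "./qa_qc", dsets=["cf_mean"])
```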