reV.qa_qc.summary.SummarizeH5
- class SummarizeH5(h5_file, group=None)
Bases: object
Summarize reV data for QA/QC.
- Parameters:
h5_file (str) – Path to .h5 file to summarize data from
group (str, optional) – Group within h5_file to summarize datasets for, by default None
Methods
run(h5_file, out_dir[, group, dsets, ...]) – Summarize all datasets in h5_file and dump to out_dir
summarize_dset(ds_name[, process_size, ...]) – Compute dataset summary.
summarize_means([out_path]) – Add means datasets to meta data
Attributes
h5_file – .h5 file path
- property h5_file
.h5 file path
- Returns:
str
- summarize_dset(ds_name, process_size=None, max_workers=None, out_path=None)
Compute a summary of the dataset. If the dataset is 2D, compute temporal statistics for each site.
- Parameters:
ds_name (str) – Dataset name of interest
process_size (int, optional) – Number of sites to process at a time, by default None
max_workers (int, optional) – Number of workers to use in parallel. If 1, run in serial; if None, use all available cores. By default None
out_path (str, optional) – File path to save summary to, by default None
- Returns:
summary (pandas.DataFrame) – Summary of the dataset
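As a rough illustration of what "temporal statistics for each site" means for a 2D (time x site) dataset, the sketch below computes a few statistics per site column, walking the sites in chunks of process_size. It is not reV's implementation: the function name summarize_sites, the list-of-rows input, and the particular statistics are assumptions made for this example.

```python
from statistics import mean, median, pstdev


def summarize_sites(data, process_size=None):
    """Per-site temporal statistics for a 2D (time x site) array.

    `data` is a list of time steps, each a list with one value per site.
    `process_size` is the number of sites handled per chunk; None means
    all sites in a single chunk. (Hypothetical helper, not part of reV.)
    """
    if not data:
        return []

    n_sites = len(data[0])
    step = process_size or n_sites
    summary = []
    # Walk the site axis in chunks so only `step` columns are handled at once.
    for start in range(0, n_sites, step):
        stop = min(start + step, n_sites)
        for site in range(start, stop):
            # Collect the time series for this site (one column of `data`).
            series = [row[site] for row in data]
            summary.append({
                "site": site,
                "mean": mean(series),
                "median": median(series),
                "std": pstdev(series),
                "min": min(series),
                "max": max(series),
            })
    return summary


# Three time steps, three sites.
data = [
    [1.0, 10.0, 5.0],
    [2.0, 20.0, 5.0],
    [3.0, 30.0, 5.0],
]
summary = summarize_sites(data, process_size=2)
```

Each entry in the result corresponds to one site, which mirrors the one-row-per-site layout a summary DataFrame would have.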
- summarize_means(out_path=None)
Add means datasets to meta data
- Parameters:
out_path (str, optional) – Path to .csv file to save updated meta data to, by default None
- Returns:
meta (pandas.DataFrame) – Meta data with means datasets added
- classmethod run(h5_file, out_dir, group=None, dsets=None, process_size=None, max_workers=None)
Summarize all datasets in h5_file and dump to out_dir
- Parameters:
h5_file (str) – Path to .h5 file to summarize data from
out_dir (str) – Directory to dump summary .csv files to
group (str, optional) – Group within h5_file to summarize datasets for, by default None
dsets (str | list, optional) – Datasets to summarize, by default None
process_size (int, optional) – Number of sites to process at a time, by default None
max_workers (int, optional) – Number of workers to use when summarizing 2D datasets, by default None
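A minimal usage sketch, based only on the run signature documented above. The wrapper name summarize_to_dir, the placeholder paths, and the dataset name "cf_mean" are hypothetical. The import is deferred into the function so the snippet can be loaded without reV installed; calling it requires reV and a real .h5 file.

```python
def summarize_to_dir(h5_file, out_dir, dsets=None, max_workers=1):
    """Hypothetical wrapper: summarize datasets in h5_file into out_dir."""
    # Deferred import; reV must be installed when this function is called.
    from reV.qa_qc.summary import SummarizeH5

    # max_workers=1 is chosen here for a serial run; the documented
    # default is None.
    SummarizeH5.run(
        h5_file,
        out_dir,
        group=None,
        dsets=dsets,
        process_size=None,
        max_workers=max_workers,
    )


# Example call with placeholder paths (not real files):
# summarize_to_dir("/path/to/reV_output.h5", "/path/to/qa_qc", dsets=["cf_mean"])
```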