dwind.utils#
Modules
| Module | Description |
| --- | --- |
| `array` | Provides a series of generic NumPy and Pandas utility functions. |
| `hpc` | Provides the live timing table functionalities for the Kestrel `MultiProcess` class. |
| `loader` | Provides the core data loading methods for importing scenario data from flat files or SQL. |
Array#
Provides a series of generic NumPy and Pandas utility functions.
Functions
| Function | Description |
| --- | --- |
| `memory_downcaster(df)` | Downcasts `int` and `float` columns to the lowest memory alternative possible. |
| `split_by_index(arr, n_splits)` | Split a DataFrame, Series, or array-like with `np.array_split`, but only return the start and stop indices, rather than chunks. |
- dwind.utils.array.memory_downcaster(df)[source]#
Downcasts `int` and `float` columns to the lowest memory alternative possible. For integers this means converting to either signed or unsigned 8-, 16-, 32-, or 64-bit integers, and for floats, converting to `np.float32`.
- Parameters:
df (pd.DataFrame | pd.Series) – DataFrame or Series to have its memory footprint reduced.
- Returns:
Reduced-footprint version of the passed `df`.
- Return type:
pd.DataFrame | pd.Series
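A minimal usage sketch; the example frame and its column values are hypothetical:

```python
import numpy as np
import pandas as pd

from dwind.utils.array import memory_downcaster

# Hypothetical frame stored with NumPy's 64-bit defaults.
df = pd.DataFrame(
    {
        "count": np.arange(1_000, dtype="int64"),  # small enough for a narrower integer
        "ratio": np.linspace(0.0, 1.0, 1_000),  # float64 -> np.float32
    }
)

downcast = memory_downcaster(df)

# Compare the memory footprints before and after downcasting.
print(df.memory_usage(deep=True).sum(), downcast.memory_usage(deep=True).sum())
```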
- dwind.utils.array.split_by_index(arr, n_splits)[source]#
Split a DataFrame, Series, or array-like with `np.array_split`, but only return the start and stop indices, rather than the chunks themselves. For Pandas objects, this is equivalent to `arr.iloc[start:end]`, and for NumPy, `arr[start:end]`. Splits are made along the 0th dimension.
- Parameters:
arr (pd.DataFrame | pd.Series | np.ndarray) – The array, data frame, or series to split.
n_splits (int) – The number of equal or near-equal splits.
- Returns:
Arrays of the start and stop indices for each split.
- Return type:
tuple[np.ndarray, np.ndarray]
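A short sketch of chunking by the returned indices, assuming the first array holds the start indices and the second holds the matching stop indices:

```python
import numpy as np
import pandas as pd

from dwind.utils.array import split_by_index

df = pd.DataFrame({"value": np.arange(10)})

# Three near-equal splits; only the index boundaries are returned.
starts, ends = split_by_index(df, 3)

# Reassemble the actual chunks, equivalent to np.array_split(df, 3).
chunks = [df.iloc[start:end] for start, end in zip(starts, ends)]
```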
HPC#
Provides the live timing table functionalities for the Kestrel `MultiProcess` class.
Functions
| Function | Description |
| --- | --- |
| `convert_seconds_for_print(time)` | Convert number of seconds to number of hours, minutes, and seconds. |
| `generate_run_status_table(job_status)` | Generate the job status run time statistics table. |
| `get_finished_run_status(jobs)` | Extracts a dictionary of job_id and status from the `sacct` output. |
| `update_status(job_status)` | Get an updated status and timing statistics for all running jobs on the HPC. |
- dwind.utils.hpc.convert_seconds_for_print(time)[source]#
Convert number of seconds to number of hours, minutes, and seconds.
- Return type:
str
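A minimal sketch; the exact output format (e.g., "1:02:05") is an assumption and is not specified above:

```python
from dwind.utils.hpc import convert_seconds_for_print

# 3725 seconds = 1 hour, 2 minutes, 5 seconds.
elapsed = convert_seconds_for_print(3725)
print(elapsed)
```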
- dwind.utils.hpc.generate_run_status_table(job_status)[source]#
Generate the job status run time statistics table.
- Parameters:
job_status (dict) – Dictionary of job id (primary key) with sub keys of “status”, “start_time” (initial or start of run status), “wait”, and “run”.
- Returns:
`rich.Table` of human-readable statistics, and a `bool` that is True if all jobs are complete, otherwise False.
- Return type:
Table
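A usage sketch with a hypothetical `job_status` dictionary following the documented schema; the value types (Slurm status strings, epoch timestamps, durations in seconds) and the unpacking of the two documented return values are assumptions:

```python
from rich.console import Console

from dwind.utils.hpc import generate_run_status_table

# Hypothetical tracking dictionary keyed on job ID.
job_status = {
    "1234567": {"status": "COMPLETED", "start_time": 1_700_000_000, "wait": 42.0, "run": 310.5},
    "1234568": {"status": "RUNNING", "start_time": 1_700_000_060, "wait": 58.0, "run": 120.0},
}

# Assumed unpacking per the Returns description above.
table, all_complete = generate_run_status_table(job_status)
Console().print(table)
```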
- dwind.utils.hpc.get_finished_run_status(jobs)[source]#
Extracts a dictionary of job_id and status from the `sacct` output for a single job or series of jobs.
- Parameters:
jobs (int | str | list[int | str]) – Single job ID or list of job IDs that have finished running.
- Returns:
Dictionary of {job_id_1: status_1, …, job_id_N: status_N}.
- Return type:
dict[str, str]
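A sketch of querying finished jobs; the job IDs are hypothetical, and `sacct` must be available on the host (e.g., a Kestrel login node):

```python
from dwind.utils.hpc import get_finished_run_status

# Hypothetical Slurm job IDs that have already finished running.
statuses = get_finished_run_status([1234567, 1234568])

# Illustrative result shape: {"1234567": "COMPLETED", "1234568": "FAILED"}
for job_id, status in statuses.items():
    print(job_id, status)
```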
- dwind.utils.hpc.update_status(job_status)[source]#
Get an updated status and timing statistics for all running jobs on the HPC.
- Parameters:
job_status (dict) – Dictionary of job id (primary key) with sub keys of “status”, “start_time” (initial or start of run status), “wait”, and “run”.
- Returns:
Dictionary of updated statuses and timing statistics for all currently queued and running jobs.
- Return type:
dict
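A minimal polling sketch reusing the hypothetical `job_status` schema shown above; the Slurm state strings and the 30-second interval are assumptions, not documented behavior:

```python
import time

from dwind.utils.hpc import update_status

# Hypothetical tracking dictionary keyed on job ID.
job_status = {
    "1234567": {"status": "PENDING", "start_time": 1_700_000_000, "wait": 0.0, "run": 0.0},
}

# Poll until every job has left the queued/running states (assumed labels).
while any(j["status"] in ("PENDING", "RUNNING") for j in job_status.values()):
    job_status = update_status(job_status)
    time.sleep(30)
```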
Loader#
Provides the core data loading methods for importing scenario data from flat files or SQL.
Functions
| Function | Description |
| --- | --- |
| `load_df(file_or_table, year[, sql_constructor])` | Loads data from either a SQL table or file to a pandas `DataFrame`. |
- dwind.utils.loader.load_df(file_or_table, year, sql_constructor=None)[source]#
Loads data from either a SQL table or file to a pandas `DataFrame`.
- Parameters:
file_or_table (str | Path) – File name or path object, or SQL table where the data are located.
year (`dwind.config.Year`, optional) – If used, only extracts the single year from a column called "year". Defaults to None.
sql_constructor (str | None, optional) – The SQL engine constructor string. Required if extracting from SQL. Defaults to None.
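A usage sketch; the file path, table name, connection string, and the plain-integer `year` value are all hypothetical (the documented parameter type is `dwind.config.Year`):

```python
from dwind.utils.loader import load_df

# Flat-file load, keeping only rows where the "year" column matches.
df = load_df("data/scenario_inputs.parquet", year=2025)

# SQL load: pass the table name plus an engine constructor string.
df_sql = load_df(
    "scenario_inputs",
    year=2025,
    sql_constructor="postgresql://user:password@host:5432/dwind",
)
```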