Configuration Files
The Case Study config file only needs to be created once per supply chain or per case study.
Several versions of the Scenario config file can be created per case study to explore the impacts of different cost assumptions and other scenario parameters.
Case Study Config Template
model_run:
  start_year: # Integer; Calendar year
  end_year: # Integer; Calendar year
  timesteps_per_year: # Integer; Units = timesteps
  min_lifespan: # Integer; Units = timesteps
  lcia_update: # Integer; Units = timesteps
  cg_update: # Integer; Units = timesteps
  cg_verbose: # Integer <= 2
  save_cg_csv: # Boolean
directories:
  # The required directories should all exist in the same directory where the config files are located.
  inputs_to_preprocessing: # Required
  inputs_optional: # Not required
  inputs: # Required
  generated: # Not required
  results: # Not required
files:
  # All file names must include the extension, which is ".csv" unless otherwise noted.
  # Datasets that are preprocessed and/or used to generate input datasets.
  inputs_to_preprocessing:
    transportation_graph:
    node_locs:
    power_plant_locs:
    landfill_locs:
    other_facility_locs:
    capacity_projection: # This parameter should remain blank - it will be filled in with a value from the Scenario config file.
  # Datasets that can be provided as alternatives to programmatically generated datasets.
  inputs_optional:
    step_costs_custom: # An alternative to the generated step_costs file
    routes_custom: # An alternative to the generated routes_computed file
    stock_filename: # .p
  # Input datasets that do not require preprocessing
  inputs:
    lookup_facility_type:
    lookup_step_costs:
    lookup_steps:
    lookup_transpo_cost_methods:
    lookup_step_cost_methods:
    fac_edges:
    transpo_edges:
    route_pairs:
    component_material_mass:
    static_lci:
    uslci_tech:
    uslci_emission:
    uslci_process_adder:
    lci_activity_locations:
    emissions_lci:
    traci_lci:
    state_reeds_grid_mix:
    national_reeds_grid_mix:
  # Datasets and files generated internally as data storage and/or used for debugging.
  generated:
    costgraph_pickle: # .obj
    costgraph_csv:
    step_costs:
    locs:
    technology_data:
    routes_computed:
    intermediate_demand:
    lcia_to_des:
    lcia_shortcut_db:
    state_electricity_lci:
    national_electricity_lci:
  # Human-readable results files for diagnostic visualization and further analysis
  results:
    pathway_criterion_history:
    component_counts_plot: # .png
    material_mass_plot: # .png
    count_cumulative_histories:
    mass_cumulative_histories:
    lcia_facility_results:
    lcia_transpo_results:
    central_summary:
Scenario Config Template
The cost uncertainty dictionary (an element of the circular_pathways dictionary) structure can be adjusted based on the modeling requirements of a particular case study. The structure here can apply to cost models that depend linearly on time and can take on random or array-based uncertainty.
flags:
clear_results : # If True and results files already exist, move them to a sub-directory to avoid overwriting.
compute_locations : # If True, generate a locations datafile from raw input files (e.g., LMOP, US Wind Turbine Database).
run_routes : # If True, compute routing distances between all input locations.
use_computed_routes : # If True, read in a pre-assembled routes file INSTEAD of generating a new routes file.
initialize_costgraph : # If True, create a CostGraph instance from input data or an imported pickle file.
location_filtering : # If True, all datasets will be filtered to include only the states listed below.
distance_filtering : # If True, filter computed routes based on max distances in route_pairs file.
pickle_costgraph : # If True, saves the CostGraph instance as a pickle file.
generate_step_costs : # If True, supply chain costs for a facility type do not vary regionally.
use_fixed_lifetime : # If True, fixed lifetimes are used instead of stochastic Weibull draws.
use_lcia_shortcut : # If True, use the lca_db emission factors file instead of performing LCIA calculations where possible.
scenario:
name: # Scenario name
capacity_projection: # Name of file with scenario-specific capacity projection data.
states_included: # List of U.S. states to optionally filter facility locations.
seed: # Random number generator seed
electricity_mix_level : # Specify disaggregation for electricity grid mix data: "state" or "national"
runs: # Number of model runs within this scenario to execute.
circular_pathways:
sc_begin: # Facility type where the supply chain "begins". Typically manufacturing or resource extraction.
sc_end: # List of facility types where the supply chain "ends".
sc_in_circ: # List of inflow circularity facility types that provide secondary material to the supply chain.
sc_out_circ: # List of outflow circularity facility types that take in secondary material for recirculation.
learning: # Dictionary of industrial learning-by-doing parameters.
[facility type]: # Facility type to which this learning cost model applies. Repeat this block for every facility type with a learning model.
component : # String; component type(s).
initial cumul: # Initial cumulative production for this technology.
cumul: # Leave blank: this value is filled in and updated during simulation.
initial cost: # Processing cost (USD/mass) at the beginning of the model run.
learn rate: # Rate at which industrial learning-by-doing reduces costs. Must be negative.
steps: # List of processing steps where this cost model is applied.
cost uncertainty: # Dictionary of probability distribution parameters for cost models.
[process step]: # Name of process step for the cost model.
uncertainty: # random or array to implement uncertainty; leave blank for no uncertainty.
c: # c, loc, scale: Probability distribution parameter(s) for random uncertainty type; can be re-named depending on distribution. See https://docs.scipy.org/doc/scipy/reference/stats.html.
loc:
scale:
value: # Leave blank: random draws are stored here during each model run.
m: # m, b: Cost model parameter(s) for array uncertainty type; can be scalars or lists of equal length.
b:
path_split: # Dictionary defining any process steps where the material stream splits, e.g. for material losses.
[process step]: # Name of process step where split occurs.
fraction: # Float or list of floats; fraction of material sent to facility_1 type
facility_1: # Downstream facility type where fraction of material is sent.
facility_2: # Downstream facility type where 1 - fraction of material is sent.
pass: # Facility type(s) to ignore in DES because material was sent there during the split.
permanent_lifespan_facility: # Facility type(s) where material accumulates (e.g. landfills).
vkmt : # Leave blank: this value is updated during simulation.
component mass : # Leave blank: this value is updated during simulation.
year : # Leave blank: this value is updated during simulation.
technology_components: # Dictionary of information about the composition of a technology unit.
circular_components: # List of technology components involved in the circular supply chain.
component_list: # Dictionary of all technology components and the number of components in each unit.
component_materials: # Dictionary listing the constituent materials in each component.
component_fixed_lifetimes: # Dictionary with fixed lifetimes (years) of each component.
component_weibull_params: # Dictionary with Weibull distribution parameters (L, K) of each component lifetime.
substitution_rates: # Dictionary of materials substituted by circular components/materials and the substitution rates (kg/kg).
Scenario Flags
The set of Boolean flags at the top of the scenario configuration file controls much of the preprocessing done to set up a CELAVI simulation. Additional explanations for each flag are provided here.
- clear_results
When CELAVI is executed multiple times on the same machine, it will produce one or more sets of output files in the results directory (one set of results files is produced per model run). Set clear_results to True if you expect to be executing CELAVI more than once and do not want the results of each execution to be overwritten.
Results from the most recent CELAVI execution are always found in the results directory.
When clear_results is True, every CELAVI execution after the first one will produce an additional directory of results files, with “results-” and the current timestamp in the directory name. The contents of the new results directory are the output files from the previous CELAVI execution.
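The archiving behavior described for clear_results can be sketched as follows. This is a minimal illustration under stated assumptions, not CELAVI's actual implementation: the helper name archive_previous_results and the timestamp format are hypothetical, and only the "results-" prefix comes from the description above.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def archive_previous_results(results_dir: Path) -> Optional[Path]:
    """Move existing results files into a timestamped sub-directory.

    Hypothetical helper: the name and the timestamp format are assumptions.
    """
    old_files = [p for p in results_dir.iterdir() if p.is_file()]
    if not old_files:
        return None  # first execution: nothing to archive
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = results_dir / f"results-{stamp}"
    archive.mkdir()
    for path in old_files:
        shutil.move(str(path), str(archive / path.name))
    return archive


# Demonstration in a throwaway directory.
demo_dir = Path(tempfile.mkdtemp())
first = archive_previous_results(demo_dir)  # nothing to move yet
(demo_dir / "central_summary.csv").write_text("run,value\n0,1.0\n")
second = archive_previous_results(demo_dir)  # moves the file into results-<timestamp>/
```

After the second call, the results directory itself is empty again and ready to receive the files from the next execution.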
- compute_locations
This flag controls whether the facility location and type dataset is assembled from raw location files before supply chain routes are found or the simulation begins.
If you have already manually assembled the facility location and type dataset for your supply chain, then this flag can be set to False. However, if the facility information to be used in your supply chain is coming from a database such as the U.S. Wind Turbine Database or the Landfill Methane Outreach Program, then setting compute_locations to True will assemble the complete facility dataset.
- run_routes
When run_routes is True, then the facility locations and route pairs datasets will be used to identify pairs of facilities between which materials will be transported. The Router module is then used to calculate minimum-distance (on-road) routes between each facility pair.
Generating routes for a multi-state or national supply chain can be time consuming, depending on the number of facilities in a supply chain. If the underlying facility locations dataset is stable, then run_routes need be True only for one CELAVI execution. Future executions will use the same set of routes and there is no need to re-generate the routes dataset.
- use_computed_routes
The user can bypass the built-in Router module and supply a custom routes dataset by setting use_computed_routes to False. In this case, the filename with the custom routes dataset must also be provided in the Case Study configuration file.
If run_routes is True, then use_computed_routes should also generally be True, unless the user is comparing results from two different routes datasets.
- initialize_costgraph
The Cost Graph model is initialized from the facility locations dataset, the routes dataset, and several other datasets that define how facilities in the supply chain are interconnected.
While initializing the Cost Graph can be time consuming, it is recommended to keep initialize_costgraph set to True unless CELAVI is being executed with one model run per simulation and no changes in the input datasets or parameters are being made between executions.
When executing multiple runs per scenario, the Cost Graph model is initialized only once, so initialize_costgraph should be True in this case.
- location_filtering
This flag can be used in combination with the states_included list under the scenario dictionary to filter down large input datasets to include only certain U.S. states (region_id_2, in the input datasets). One set of (for example) national-scale data can then be defined and filtered as needed, rather than developing separate datasets.
If location_filtering is True but there are no states listed under states_included, then a warning is printed and no filtering is performed. If location_filtering is False, then no filtering is performed even if states are listed under states_included.
Both the processed facility locations dataset and the routes dataset are filtered with this flag.
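The interaction between location_filtering and states_included can be sketched in plain Python. This is an illustrative sketch only: the function name filter_by_state and the facility_id and facility_type columns are hypothetical, while region_id_2 is the state column named above.

```python
import warnings


def filter_by_state(rows, states_included, location_filtering):
    """Keep only rows whose region_id_2 is listed in states_included.

    Hypothetical helper mirroring the three cases described in the text.
    """
    if not location_filtering:
        return list(rows)  # flag is False: never filter
    if not states_included:
        warnings.warn(
            "location_filtering is True but states_included is empty; "
            "no filtering performed."
        )
        return list(rows)
    keep = set(states_included)
    return [row for row in rows if row["region_id_2"] in keep]


facilities = [
    {"facility_id": 1, "facility_type": "landfilling", "region_id_2": "CO"},
    {"facility_id": 2, "facility_type": "manufacturing", "region_id_2": "TX"},
    {"facility_id": 3, "facility_type": "landfilling", "region_id_2": "IA"},
]

filtered = filter_by_state(facilities, ["CO", "IA"], location_filtering=True)
unfiltered = filter_by_state(facilities, ["CO", "IA"], location_filtering=False)
```

The same function could be applied to both the facility locations and the routes datasets, since both are filtered by this flag.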
- distance_filtering
When distance_filtering is True, the route pairs dataset is used to filter down the routes file and Cost Graph edges based on the vkmt_max column. This allows users to set a transportation distance limit, for instance for transportation to landfills, without having to manually remove unrealistically lengthy routes.
Some care should be taken in using distance_filtering and in setting the vkmt_max values. It’s possible to filter out routes that must be included for the supply chain to be complete (e.g. routes to a power plant from a manufacturing facility), and in this case the filtering will produce an error during the CELAVI execution.
Any blank values in the vkmt_max column are backfilled with a number large enough that no routes for that facility pair are filtered out. This allows distance limits to be applied only to specific facility pairs.
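A minimal sketch of distance filtering with backfilled limits is shown below. Only the vkmt_max column name comes from the text; the other column names, the function name, and the use of infinity as the "sufficiently large number" are assumptions made for illustration.

```python
import math


def filter_routes_by_distance(routes, route_pairs):
    """Drop routes longer than the vkmt_max set for their facility-type pair."""
    limits = {}
    for pair in route_pairs:
        key = (pair["source_facility_type"], pair["destination_facility_type"])
        # A blank limit (None) is backfilled so this pair is never filtered.
        limits[key] = math.inf if pair["vkmt_max"] is None else pair["vkmt_max"]
    kept = []
    for route in routes:
        key = (route["source_facility_type"], route["destination_facility_type"])
        if route["vkmt"] <= limits.get(key, math.inf):
            kept.append(route)
    return kept


route_pairs = [
    {"source_facility_type": "power plant",
     "destination_facility_type": "landfilling", "vkmt_max": 500.0},
    {"source_facility_type": "manufacturing",
     "destination_facility_type": "power plant", "vkmt_max": None},
]
routes = [
    {"source_facility_type": "power plant",
     "destination_facility_type": "landfilling", "vkmt": 120.0},
    {"source_facility_type": "power plant",
     "destination_facility_type": "landfilling", "vkmt": 900.0},
    {"source_facility_type": "manufacturing",
     "destination_facility_type": "power plant", "vkmt": 2500.0},
]

kept_routes = filter_routes_by_distance(routes, route_pairs)
```

Here the 900 km landfill route is removed, while the long manufacturing route survives because its limit was left blank.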
- pickle_costgraph
When True, the pickle_costgraph flag will save (pickle) a copy of the initialized Cost Graph model as a Python object that can be examined or used outside the CELAVI execution. This can be useful for multiple repeated CELAVI executions.
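Saving and reloading a pickled object uses Python's standard pickle module. The sketch below uses a plain dictionary as a stand-in for the Cost Graph, since the real object's structure is not described here; the file name netw.obj is taken from the Case Study config example.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for an initialized Cost Graph (assumption: the real object is a
# CostGraph instance, which is not reproduced here).
cost_graph = {
    "nodes": ["manufacturing_1", "landfilling_7"],
    "edges": [("manufacturing_1", "landfilling_7", {"dist": 42.0})],
}

pickle_path = Path(tempfile.mkdtemp()) / "netw.obj"

# Save the object to disk.
with open(pickle_path, "wb") as handle:
    pickle.dump(cost_graph, handle)

# Reload it later, for example in a separate analysis session.
with open(pickle_path, "rb") as handle:
    restored = pickle.load(handle)
```

Unpickling an instance of a custom class requires that the class definition be importable in the session that loads it, so CELAVI would need to be installed wherever a real Cost Graph pickle is opened.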
- generate_step_costs
The step costs dataset assigns processing cost methods (models) to every facility in the supply chain. Depending on how the processing costs vary with space and with facility, users may want to manually generate the step costs dataset or generate it automatically by setting generate_step_costs to True.
If this flag is True, the assumption is that processing costs do not vary with facility location, and more broadly that there is one (set of) processing cost methods per facility type. In the case that there are multiple processing cost methods for a single facility type - for instance, separate landfill tipping fee models by U.S. state or county - then generate_step_costs must be set to False and the step costs dataset generated manually.
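The "one set of cost methods per facility type" assumption can be illustrated with a short sketch. The column names, method names, and the function generate_step_costs_rows are all hypothetical; the actual layout of the step costs dataset is not specified in this section.

```python
def generate_step_costs_rows(facilities, methods_by_type):
    """Assign every facility the cost methods defined for its facility type."""
    rows = []
    for facility in facilities:
        for step, method in methods_by_type[facility["facility_type"]]:
            rows.append({
                "facility_id": facility["facility_id"],
                "step": step,
                "step_cost_method": method,
            })
    return rows


# Hypothetical lookup: one list of (step, cost method) pairs per facility type.
methods_by_type = {
    "landfilling": [("landfilling", "landfilling_cost")],
    "manufacturing": [("manufacturing", "manufacturing_cost")],
}
facilities = [
    {"facility_id": 1, "facility_type": "landfilling"},
    {"facility_id": 2, "facility_type": "landfilling"},
    {"facility_id": 3, "facility_type": "manufacturing"},
]

step_costs = generate_step_costs_rows(facilities, methods_by_type)
```

Both landfills receive the same cost method here, which is why a state- or county-specific tipping fee model would require a manually prepared dataset instead.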
- use_fixed_lifetime
Technology components remain “in use” for a period of time before entering the end-of-life phase. The time “in use” is the component lifetime, which for each component type can be modeled either as a fixed value or as random draws from a Weibull distribution. Both the fixed values and the Weibull parameters are defined by component type in the Scenario configuration file.
Set use_fixed_lifetime to True to use a fixed, deterministic lifetime for every technology component, or set to False to generate lifetimes from the Weibull distributions.
If use_fixed_lifetime is set to False, it is recommended that users also set the seed value under the scenario dictionary. This will generate stochastic results that are reproducible in repeated CELAVI executions.
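The effect of the seed on stochastic lifetimes can be demonstrated with the standard library alone. This sketch assumes that L is the Weibull scale parameter and K the shape parameter, which the text does not state explicitly; the function name is hypothetical, and no unit conversion between years and timesteps is attempted.

```python
import random


def draw_lifetime(rng, use_fixed_lifetime, fixed_lifetime, weibull_l, weibull_k):
    """Return a component lifetime, fixed or drawn from a Weibull distribution."""
    if use_fixed_lifetime:
        return fixed_lifetime
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    return rng.weibullvariate(weibull_l, weibull_k)


# Two generators created with the same seed give identical draws.
rng_a = random.Random(13)
rng_b = random.Random(13)
draws_a = [draw_lifetime(rng_a, False, 20, 240, 2.2) for _ in range(3)]
draws_b = [draw_lifetime(rng_b, False, 20, 240, 2.2) for _ in range(3)]

# With the flag set to True the Weibull parameters are ignored.
fixed = draw_lifetime(rng_a, True, 20, 240, 2.2)
```

Because draws_a and draws_b are identical, repeating an execution with the same seed reproduces the same sequence of lifetimes.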
- use_lcia_shortcut
Repeatedly performing LCIA calculations can lengthen CELAVI run time considerably. To speed up the calculations, use_lcia_shortcut can be set to True to use precomputed emission factors stored in a local file. If this file does not yet exist, then LCIA calculations are performed normally and the file is populated with emission factors as they are calculated.
When performing multiple model runs in a single CELAVI execution, it is strongly recommended to set use_lcia_shortcut to True to shorten the run time.
After changes to the scenario parameters or to the input datasets, it is recommended to delete the local emission factors file to avoid using incorrect factors.
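The shortcut amounts to caching: compute a factor once, store it, and reuse it on later requests. The sketch below uses an in-memory dictionary in place of the local emission factors file, and the (process, year) key and both function names are assumptions made for illustration.

```python
calls = []


def compute_emission_factor(process, year):
    """Placeholder for a full LCIA calculation (the slow path)."""
    calls.append((process, year))
    return 0.5  # dummy value, not a real emission factor


def emission_factor(process, year, cache):
    """Return a cached factor if present; otherwise compute and store it."""
    key = (process, year)
    if key not in cache:
        cache[key] = compute_emission_factor(process, year)
    return cache[key]


shortcut_db = {}  # stands in for the local emission factors file
first = emission_factor("landfilling", 2020, shortcut_db)
second = emission_factor("landfilling", 2020, shortcut_db)  # served from cache
```

Only the first request triggers the slow calculation. Emptying the cache corresponds to deleting the local file, which forces every factor to be recalculated with the current inputs.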
Cost Uncertainty Modeling
There is a great deal of flexibility in how uncertainty is defined within the cost models. This leads to many possible versions of the “cost uncertainty” dictionary within the Scenario YAML file. This section discusses the three main options for implementing uncertainty and gives examples of how to define each type of uncertainty within CELAVI.
No uncertainty: In this case, there is no uncertainty represented in a cost model. Scalar values are defined for each cost model parameter, and a single run is sufficient to quantify the results. The uncertainty key within the cost model dictionary is left blank, and whatever parameters the cost model requires are defined as floats. For example, the landfilling cost model, which is represented as a linear equation with slope m and y-intercept b, has the following dictionary when no uncertainty is represented:
cost uncertainty:
  landfilling:
    uncertainty: # Left blank
    m: 1.5921 # Single, scalar value for slope parameter
    b: 28.9 # Single, scalar value for y-intercept parameter
Array- or range-based uncertainty: In this case, parameters with uncertainty are defined with lists of floats, and one model run is executed per element of that list. When modeling this type of uncertainty in multiple parameters simultaneously, care must be taken that the lists of parameter values are all of the same length and that the number of runs to execute is equal to this length. An error will be thrown if more runs are executed than there are elements in the parameter lists or if the lists are of unequal length, and the simulation will not complete. The landfilling cost model dictionary has the following structure when array-based uncertainty is implemented for the slope parameter m:
cost uncertainty:
  landfilling:
    uncertainty: array
    m:
      - 0.0
      - 0.64
      - 1.27
      - 1.91
      - 2.55
      - 3.18
    b: 28.9
If both the m and b parameters are modeled with array-based uncertainty, the dictionary would be as follows. Note that both parameters have value lists of length 6. The runs parameter under the scenario dictionary in this case would have to be set to 6 as well.
cost uncertainty:
  landfilling:
    uncertainty: array
    m:
      - 0.0
      - 0.64
      - 1.27
      - 1.91
      - 2.55
      - 3.18
    b:
      - 0.0
      - 11.56
      - 23.12
      - 34.68
      - 46.24
      - 57.8
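The length rules for array-based uncertainty can be checked before a simulation starts. The following is a sketch of such a check plus per-run parameter selection; both function names are hypothetical, and CELAVI's own validation may differ in detail.

```python
def check_array_uncertainty(params, runs):
    """Raise ValueError if list lengths disagree or runs exceeds the list length."""
    lengths = {name: len(values) for name, values in params.items()
               if isinstance(values, list)}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Parameter lists have unequal lengths: {lengths}")
    if lengths and runs > next(iter(lengths.values())):
        raise ValueError(f"runs={runs} exceeds parameter list length: {lengths}")


def params_for_run(params, run):
    """Pick the run-th element of each list; scalars are reused for every run."""
    return {name: (values[run] if isinstance(values, list) else values)
            for name, values in params.items()}


# Landfilling parameters with array uncertainty on m only.
landfilling = {"m": [0.0, 0.64, 1.27, 1.91, 2.55, 3.18], "b": 28.9}

check_array_uncertainty(landfilling, runs=6)  # passes silently
third_run = params_for_run(landfilling, 2)    # zero-based index: the third run
```

With these inputs, third_run pairs the third slope value with the scalar intercept, and setting runs to 7 would raise an error because only six slope values are defined.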
Stochastic uncertainty: Using this type of uncertainty requires defining probability distributions on the cost model parameters. By default, CELAVI uses triangular distributions with parameters c, loc, and scale. These distribution parameters must be defined as scalars, and a blank key called value must also be included. The cost model parameter value, once drawn from the distribution, is stored under value for the duration of a model run. The landfilling cost model dictionary with stochastic uncertainty on both m and b has the following structure:
cost uncertainty:
  landfilling:
    uncertainty: stochastic
    m:
      c: 0.430
      loc: 0.0
      scale: 3.704
      value:
    b:
      c: 0.430
      loc: 0.0
      scale: 67.244
      value:
Note that the m and b parameters are no longer defined explicitly when using stochastic uncertainty.
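To show how c, loc, and scale define a triangular distribution, the sketch below draws values using only the standard library. In the SciPy parameterization (scipy.stats.triang), the distribution spans loc to loc + scale with its mode at loc + c * scale. The function name and the way the draw is stored under value are illustrative assumptions, not CELAVI's code.

```python
import random


def draw_triangular(rng, c, loc, scale):
    """Draw from a triangular distribution given SciPy-style c, loc, scale."""
    low = loc
    high = loc + scale
    mode = loc + c * scale
    # random.triangular takes (low, high, mode).
    return rng.triangular(low, high, mode)


landfilling = {
    "m": {"c": 0.430, "loc": 0.0, "scale": 3.704, "value": None},
    "b": {"c": 0.430, "loc": 0.0, "scale": 67.244, "value": None},
}

rng = random.Random(13)  # seeded so the draws are reproducible
for parameter in landfilling.values():
    parameter["value"] = draw_triangular(
        rng, parameter["c"], parameter["loc"], parameter["scale"]
    )
```

Each drawn value falls within the range implied by loc and scale, so m lies between 0.0 and 3.704 and b between 0.0 and 67.244.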
Case Study Config Example
model_run:
  start_year: 2000
  end_year: 2051
  timesteps_per_year: 12
  min_lifespan: 120 # timesteps
  lcia_update: 12 # timesteps
  lcia_verbose: 0
  cg_update: 12 # timesteps
  cg_verbose: 1
  save_cg_csv: True
directories:
  inputs_to_preprocessing: inputs_to_preprocessing/
  inputs_optional: inputs_optional/
  inputs: inputs/
  generated: generated/
  results: results/
files:
  # Files that must be processed to create CELAVI input files
  inputs_to_preprocessing:
    transportation_graph: transportation_graph.csv
    node_locs: node_locations.csv
    power_plant_locs: uswtdb_v4_1_20210721.csv
    landfill_locs: landfilllmopdata.csv
    other_facility_locs: other_facility_locations_all_us.csv
    capacity_projection:
  # Inputs that are alternatives to programmatically generated inputs
  inputs_optional:
    step_costs_custom: step_costs_custom.csv # an alternative to the generated step_costs file
    routes_custom: routes.csv # an alternative to the generated routes_computed file
    stock_filename: stock_filename.p
  # Files used directly as CELAVI inputs
  inputs:
    lookup_facility_type: facility_type.csv
    lookup_step_costs: step_costs_default.csv
    lookup_steps: step.csv
    lookup_transpo_cost_methods: transpo_cost_method.csv
    lookup_step_cost_methods: step_cost_method.csv
    fac_edges: fac_edges.csv
    transpo_edges: transpo_edges.csv
    route_pairs: route_pairs.csv
    component_material_mass: avgmass.csv
    static_lci: foreground_process_inventory.csv
    uslci_tech: tech_matrix_corr.csv
    uslci_emission: process_emissions_corr.csv
    uslci_process_adder: process_names_adder.csv
    lci_activity_locations: location.csv
    emissions_lci: emissions_inventory.csv
    traci_lci: traci21.csv
    state_reeds_grid_mix: state_dynamic_grid_mix.csv
    national_reeds_grid_mix: national_dynamic_grid_mix.csv
  # Files written during CELAVI runs intended only for internal or debugging use
  generated:
    costgraph_pickle: netw.obj
    costgraph_csv: netw.csv
    step_costs: step_costs.csv
    locs: locations_computed.csv
    technology_data: number_of_technology_units.csv
    routes_computed: routes_computed.csv
    intermediate_demand: intermediate_demand.csv
    lcia_to_des: final_lcia_results_to_des.csv
    lcia_shortcut_db: lca_db.csv
    state_electricity_lci: state_level_grid_mix.csv
    national_electricity_lci: national_level_grid_mix.csv
  # Human-readable results files for visualization and further analysis
  results:
    pathway_criterion_history: pathway_criterion_history.csv
    component_counts_plot: component_counts.png
    material_mass_plot: material_mass.png
    count_cumulative_histories: count_cumulative_histories.csv
    mass_cumulative_histories: mass_cumulative_histories.csv
    lcia_facility_results: lcia_locations_join.csv
    lcia_transpo_results: lcia_transportation.csv
    central_summary: central_summary.csv
Scenario Config Example
flags:
clear_results : True # If True and results files already exist, move them to a sub-directory to avoid overwriting.
compute_locations : True # If True, generate a locations datafile from raw input files (e.g., LMOP, US Wind Turbine Database).
run_routes : True # If True, compute routing distances between all input locations.
use_computed_routes : True # If True, read in a pre-assembled routes file INSTEAD of generating a new routes file.
initialize_costgraph : True # If True, create a CostGraph instance from input data or an imported pickle file.
location_filtering : False # If True, all datasets will be filtered to include only the states listed below.
distance_filtering : False # If True, filter computed routes based on max distances in route_pairs file.
pickle_costgraph : True # If True, saves the CostGraph instance as a pickle file.
generate_step_costs : True # If True, supply chain costs for a facility type do not vary regionally.
use_fixed_lifetime : True # If True, fixed lifetimes are used instead of stochastic Weibull draws.
use_lcia_shortcut : True # If True, use the lca_db emission factors file instead of performing LCIA calculations where possible.
scenario:
name: Wind Blade EOL Management, National
capacity_projection: StScen20A_MidCase_annual_state.csv
states_included:
seed: 13
electricity_mix_level : state
runs: 1
circular_pathways:
sc_begin:
- manufacturing
sc_end:
- landfilling
#sc_in_circ:
sc_out_circ:
- cement co-processing
- next use
learning:
coarse grinding:
component : blade
initial cumul: 1.0
cumul:
learn rate: -0.05
steps:
- coarse grinding
- coarse grinding onsite
fine grinding:
component : blade
initial cumul: 1.0
cumul:
learn rate: -0.05
steps:
- fine grinding
cost uncertainty:
landfilling:
uncertainty:
m: 1.5921
b: 28.9
rotor teardown:
uncertainty:
m: 1467.08
b: 285.0
segmenting:
uncertainty:
b: 27.56
coarse grinding onsite:
uncertainty:
initial cost: 106
coarse grinding:
uncertainty:
initial cost: 106
fine grinding:
uncertainty:
initial cost: 143
revenue: 273
coprocessing:
uncertainty:
b: 10.37
segment transpo:
uncertainty:
cost 1: 4.35 # Before 2001; 2002-2003
cost 2: 8.70 # 2001-2002; 2003-2019
cost 3: 13.05 # 2019-2031
cost 4: 17.40 # 2031-2044
cost 5: 21.75 # 2044-2050
shred transpo:
uncertainty:
m: 0.0011221
b: 0.0524
manufacturing:
uncertainty:
b: 11440.0
path_split:
fine grinding:
fraction: 0.3
facility_1: landfilling
facility_2: next use
pass:
next use
permanent_lifespan_facility:
- landfilling
- cement co-processing
- next use
vkmt :
component mass :
year :
technology_components:
circular_components:
- blade
component_list:
nacelle : 1
blade : 3
tower : 1
foundation : 1
component_materials:
nacelle :
- steel
blade :
- glass fiber
- epoxy
tower :
- steel
foundation :
- concrete
component_fixed_lifetimes: # Years
nacelle : 30
blade : 20
foundation : 50
tower : 50
component_weibull_params: #L, K
nacelle :
blade :
L : 240
K : 2.2
foundation :
tower :
substitution_rates:
sand: 0.15
coal: 0.30