Configuration Files =================== The Case Study config file only needs to be created once per supply chain or per case study. Several versions of the Scenario config file can be created per case study to explore the impacts of different cost and other scenarios. Case Study Config Template -------------------------- .. code-block:: yaml model_run: start_year: # Integer; Calendar year end_year: # Integer; Calendar year timesteps_per_year: # Integer; Units = timesteps min_lifespan: # Integer; Units = timesteps lcia_update: # Integer; Units = timesteps cg_update: # Integer; Units = timesteps cg_verbose: # Integer <= 2 save_cg_csv: # Boolean directories: # The required directories should all exist in the same directory where the config files are located. inputs_to_preprocessing: # Required inputs_optional: # Not required inputs: # Required generated: # Not required results: # Not required files: # All file names must include the extension, which is ".csv" unless otherwise noted. # Datasets that are preprocessed and/or used to generate input datasets. inputs_to_preprocessing: transportation_graph: node_locs: power_plant_locs: landfill_locs: other_facility_locs: capacity_projection: # This parameter should remain blank - it will be filled in with a value from the Scenario config file. # Datasets that can be provided as alternatives to programmatically generated datasets. inputs_optional: step_costs_custom: # An alternative to the generated step_costs file routes_custom: # An alternative to the generated routes_computed file stock_filename: # .p # Input datasets that do not require preprocessing inputs: lookup_facility_type: lookup_step_costs: lookup_steps: lookup_transpo_cost_methods: lookup_step_cost_methods: fac_edges: transpo_edges: route_pairs: component_material_mass: static_lci: uslci_tech: uslci_emission: uslci_process_adder: lci_activity_locations: emissions_lci: traci_lci: state_reeds_grid_mix: national_reeds_grid_mix: # Datasets and files generated internally as data storage and/or used for debugging. generated: costgraph_pickle: # .obj costgraph_csv: step_costs: locs: technology_data: routes_computed: intermediate_demand: lcia_to_des: lcia_shortcut_db: state_electricity_lci: national_electricity_lci: # Human-readable results files for diagnostic visualization and further analysis results: pathway_criterion_history: component_counts_plot: # .png material_mass_plot: # .png count_cumulative_histories: mass_cumulative_histories: lcia_facility_results: lcia_transpo_results: central_summary: Scenario Config Template ------------------------ The `cost uncertainty` dictionary (an element of the `circular_pathways` dictionary) structure can be adjusted based on the modeling requirements of a particular case study. The structure here can apply to cost models that depend linearly on time and can take on random or array-based uncertainty. .. code-block:: yaml flags: clear_results : # If True and results files already exist, move them to a sub-directory to avoid overwriting. compute_locations : # If True, generate a locations datafile from raw input files (e.g., LMOP, US Wind Turbine Database). run_routes : # If True, compute routing distances between all input locations. use_computed_routes : # If True, read in a pre-assembled routes file INSTEAD of generating a new routes file. initialize_costgraph : # If True, create a CostGraph instance from input data or an imported pickle file. location_filtering : # If True, all datasets will be filtered to include only the states listed below. distance_filtering : # If True, filter computed routes based on max distances in route_pairs file. pickle_costgraph : # If True, saves the CostGraph instance as a pickle file. generate_step_costs : # If True, supply chain costs for a facility type do not vary regionally. use_fixed_lifetime : # If True, fixed lifetimes are used instead of stochastic Weibull draws. use_lcia_shortcut : # If True, use the lca_db emission factors file instead of performing LCIA calculations where possible. scenario: name: # Scenario name capacity_projection: # Name of file with scenario-specific capacity projection data. states_included: # List of U.S. states to optionally filter facility locations. seed: # Random number generator seed electricity_mix_level : # Specify disaggregation for electricity grid mix data: "state" or "national" runs: # Number of model runs within this scenario to execute. circular_pathways: sc_begin: # Facility type where the supply chain "begins". Typically manufacturing or resource extraction. sc_end: # List of facility types where the supply chain "ends". sc_in_circ: # List of inflow circularity facility types that provide secondary material to the supply chain. sc_out_circ: # List of outflow circularity facility types that take in secondary material for recirculation. learning: # Dictionary of parameters for industrial learning-by-doing parameters. [facility type]: # Facility type to which this learning cost model applies. Repeat this block for every facility type with a learning model. component : # String; component type(s). initial cumul: # Initial cumulative production for this technology. cumul: # Leave blank: this value is filled in and updated during simulation. initial cost: # Processing cost (USD/mass) at the beginning of the model run. learn rate: # Rate at which industrial learning-by-doing reduces costs. Must be negative. steps: # List of processing steps where this cost model is applied. cost uncertainty: # Dictionary of probability distribution parameters for cost models. [process step]: # Name of process step for the cost model. uncertainty: # random or array to implement uncertainty; leave blank for no uncertainty. c: # c, loc, scale: Probability distribution parameter(s) for random uncertainty type; can be re-named depending on distribution. See https://docs.scipy.org/doc/scipy/reference/stats.html. loc: scale: value: # Leave blank: random draws are stored here during each model run. m: # m, b: Cost model parameter(s) for array uncertainty type; can be scalars or lists of equal length. b: path_split: # Dictionary defining any process steps where the material stream splits, e.g. for material losses. [process step]: # Name of process step where split occurs. fraction: # Float or list of floats; fraction of material sent to facility_1 type facility_1: # Downstream facility type where fraction of material is sent. facility_2: # Downstream facility type where 1 - fraction of material is sent. pass: # Facility type(s) to ignore in DES because material was sent there during the split. permanent_lifespan_facility: # Facility type(s) where material accumulates (e.g. landfills). vkmt : # Leave blank: this value is updated during simulation. component mass : # Leave blank: this value is updated during simulation. year : # Leave blank: this value is updated during simulation. technology_components: # Dictionary of information about the composition of a technology unit. circular_components: # List of technology components involved in the circular supply chain. component_list: # Dictionary of all technology components and the number of components in each unit. component_materials: # Dictionary listing the constituent materials in each component. component_fixed_lifetimes: # Dictionary with fixed lifetimes (years) of each component. component_weibull_params: # Dictionary with Weibull distribution parameters (L, K) of each component lifetime. substitution_rates: # Dictionary of materials substituted by circular components/materials and the substitution rates (kg/kg). Scenario Flags ^^^^^^^^^^^^^^ The set of Boolean flags at the top of the scenario configuration file control much of the preprocessing done to set up a CELAVI simulation. Additional explanations for each flag are provided here. * `clear_results` * When CELAVI is executed multiple times on the same machine, it will produce one or more sets of output files in the `results` directory (one set of results files is produced per model run). Set `clear_results` to True if you expect to be executing CELAVI more than once and do not want the results of each execution to be overwritten. * Results from the most recent CELAVI execution are always found in the `results` directory. * When `clear_results` is True, every CELAVI execution after the first one will produce an additional directory of results files, with "results-" and the current timestamp in the directory name. The contents of the new `results` directory is the output files from the *previous* CELAVI execution. * `compute_locations` * This flag controls whether the facility location and type dataset is assembled from raw location files before supply chain routes are found or the simulation begins. * If you have already manually assembled the facility location and type dataset for your supply chain, then this flag can be set to False. However, if the facility information to be used in your supply chain is coming from a database such as the U.S. Wind Turbine Database or the Landfill Methane Outreach Program, then setting `compute_locations` to True will assemble the complete facility dataset. * `run_routes` * When `run_routes` is True, then the facility locations and route pairs datasets will be used to identify pairs of facilities between which materials will be transported. The `Router` module is then used to calculate minimum-distance (on-road) routes between each facility pair. * Generating routes for a multi-state or national supply chain can be time consuming, depending on the number of facilities in a supply chain. If the underlying facility locations dataset is stable, then `run_routes` need be True only for one CELAVI execution. Future executions will use the same set of routes and there is no need to re-generate the routes dataset. * `use_computed_routes` * The user can bypass the built-in Router module and supply a custom routes dataset by setting `use_computed_routes` to False. In this case, the filename with the custom routes dataset must also be provided in the Case Study configuration file. * If `run_routes` is True, then `use_computed_routes` should also generally be True, unless the user is comparing results from two different routes datasets. * `initialize_costgraph` * The Cost Graph model is initialized from the facility locations dataset, the routes dataset, and several other datasets that define how facilities in the supply chain are interconnected. * While initializing the Cost Graph can be time consuming, it is recommended to keep `initialize_costgraph` set to True unless CELAVI is being executed with one model run per simulation and no changes in the input datasets or parameters are being made between executions. * When executing multiple runs per scenario, the Cost Graph model will only be initialized once, thus `initialize_costgraph` should be True in this case. * `location_filtering` * This flag can be used in combination with the `states_included` list under the `scenario` dictionary to filter down large input datasets to include only certain U.S. states (region_id_2, in the input datasets). One set of (for example) national-scale data can then be defined and filtered as needed, rather than developing separate datasets. * If `location_filtering` is True but there are no states listed under `states_included`, then a warning is printed and no filtering is performed. If `location_filtering` is False, then no filtering is performed even if states are listed under `states_included`. * Both the processed facility locations dataset and the routes dataset are filtered with this flag. * `distance_filtering` * When `distance_filtering` is True, the route pairs dataset is used to filter down the routes file and Cost Graph edges based on the `vkmt_max` column. This allows users to set a transportation distance limit, for instance for transportation to landfills, without having to manually remove unrealistically lengthy routes. * Some care should be taken in using `distance_filtering` and in setting the `vkmt_max` values. It's possible to filter out routes that must be included for the supply chain to be complete (e.g. routes to a power plant from a manufacturing facility), and in this case the filtering will produce an error during the CELAVI execution. * Any blank values in the `vkmt_max` column will be backfilled with a sufficiently large number that no routes will be filtered out, allowing for only routes between specific facility pairs to be filtered based on distance. * `pickle_costgraph` * When True, the `pickle_costgraph` flag will save (pickle) a copy of the initialized Cost Graph model as a Python object that can be examine or used outside the CELAVI execution. This can be useful for multiple repeated CELAVI executions. * `generate_step_costs` * The step costs dataset assigns processing cost methods (models) to every facility in the supply chain. Depending on how the processing costs vary with space and with facility, users may want to manually generate the step costs dataset or generate it automatically by setting `generate_step_costs` to True. * If this flag is True, the assumption is that processing costs *do not vary with facility location*, and more broadly that there is one (set of) processing cost methods per facility type. In the case that there are multiple processing cost methods for a single facility type - for instance, separate landfill tipping fee models by U.S. state or county - then `generate_step_costs` must be set to False and the step costs dataset generated manually. * `use_fixed_lifetime` * Technology components remain "in use" for a period of time before entering the end-of-life phase. The time "in use" is the component lifetime, which for each component type can be modeled either as a fixed value or as random draws from a Weibull distribution. Both the fixed values and the Weibull parameters are defined by component type in the Scenario configuration file. * Set `use_fixed_lifetime` to True to use a fixed, deterministic lifetime for every technology component, or set to False to generate lifetimes from the Weibull distributions. * If `use_fixed_lifetime` is set to False, it is recommended that users also set the `seed` value under the `scenario` dictionary. This will generate stochastic results that are reproducible in repeated CELAVI executions. * `use_lcia_shortcut` * Repeatedly performing LCIA calculations can lengthen CELAVI run time considerably. To speed up the calculations, `use_lcia_shortcut` can be set to True to use precomputed emission factors stored in a local file. If this file does not yet exist, then LCIA calculations are performed normally and the file is populated with emission factors as they are calculated. * When performing multiple model runs in a single CELAVI execution, it is strongly recommended to set `use_lcia_shortcut` to True to shorten the run time. * After changes to the scenario parameters or to the input datasets, it is recommended to delete the local emission factors file to avoid using incorrect factors. Cost Uncertainty Modeling ^^^^^^^^^^^^^^^^^^^^^^^^^ There is a great deal of flexibility in how uncertainty is defined within the cost models. This leads to many possible versions of the "cost uncertainty" dictionary within the Scenario YAML file. This section discusses the three main options for implementing uncertainty and gives examples of how to define each type of uncertainty within CELAVI. **No uncertainty**: In this case, there is no uncertainty represented in a cost model. Scalar values are defined for each cost model parameter, and a single run is sufficient to quantify the results. In this case, the `uncertainty` key within the cost model dictionary will be left blank, and whatever parameters the cost model requires are defined as floats. For example, the landfilling cost model, which is represented as a linear equation with slope *m* and y-intercept *b*, has the following dictionary when no uncertainty is represented: .. code-block:: yaml cost uncertainty: landfilling: uncertainty: # Left blank m: 1.5921 # Single, scalar value for slope parameter b: 28.9 # Single, scalar value for y-intercept parameter **Array- or range-based uncertainty**: In this case, parameters with uncertainty are defined with lists of floats, and one model run is executed per element of that list. When modeling this type of uncertainty in multiple parameters simultaneously, care must be taken that the lists of parameter values are all of the same length *and* that the number of runs to execute is equal to this length. An error will be thrown if more runs are executed than there are elements in the parameter lists or if the lists are of unequal length, and the simulation will not completed. The landfilling cost model dictionary has the following structure when array-based uncertainty is implemented for the slope parameter *m*: .. code-block:: yaml cost uncertainty: landfilling: uncertainty: array m: - 0.0 - 0.64 - 1.27 - 1.91 - 2.55 - 3.18 b: 28.9 If both the *m* and *b* parameters are modeled with array-based uncertainty, the dictionary would be as follows. Note that both parameters have value lists of length 6. The `runs` parameter under the `scenario` dictionary in this case would have to be set to 6 as well. .. code-block:: yaml cost uncertainty: landfilling: uncertainty: array m: - 0.0 - 0.64 - 1.27 - 1.91 - 2.55 - 3.18 b: - 0.0 - 11.56 - 23.12 - 34.68 - 46.24 - 57.8 **Stochastic uncertainty**: Using this type of uncertainty requires defining probability distributions on the cost model parameters. By default, CELAVI uses triangular distributions with parameters `c`, `loc`, and `scale`. These distribution parameters must be defined as scalars, and a blank key called `value` must also be included. The cost model parameter value, once drawn from the distribution, is stored under `value` for the duration of a model run. The landfilling cost model dictionary with stochastic uncertainty on both *m* and *b* has the following structure: .. code-block:: yaml cost uncertainty: landfilling: uncertainty: stochastic m: c: 0.430 loc: 0.0 scale: 3.704 value: b: c: 0.430 loc: 0.0 scale: 67.244 value: Note that the *m* and *b* parameters are no longer defined explicitly when using stochastic uncertainty. Case Study Config Example ------------------------- .. code-block:: yaml model_run: start_year: 2000 end_year: 2051 timesteps_per_year: 12 min_lifespan: 120 # timesteps lcia_update: 12 # timesteps lcia_verbose: 0 cg_update: 12 #timesteps cg_verbose: 1 save_cg_csv: True directories: inputs_to_preprocessing: inputs_to_preprocessing/ inputs_optional: inputs_optional/ inputs: inputs/ generated: generated/ results: results/ files: # Files that must be processed to create CELAVI input files inputs_to_preprocessing: transportation_graph: transportation_graph.csv node_locs: node_locations.csv power_plant_locs: uswtdb_v4_1_20210721.csv landfill_locs: landfilllmopdata.csv other_facility_locs: other_facility_locations_all_us.csv capacity_projection: # Inputs that are alternatives to programmatically generated inputs inputs_optional: step_costs_custom: step_costs_custom.csv # an alternative to the generated step_costs file routes_custom: routes.csv # an alternative to the generated routes_computed file stock_filename: stock_filename.p # Files used directly as CELAVI inputs inputs: lookup_facility_type: facility_type.csv lookup_step_costs: step_costs_default.csv lookup_steps: step.csv lookup_transpo_cost_methods: transpo_cost_method.csv lookup_step_cost_methods: step_cost_method.csv fac_edges: fac_edges.csv transpo_edges: transpo_edges.csv route_pairs: route_pairs.csv component_material_mass: avgmass.csv static_lci: foreground_process_inventory.csv uslci_tech: tech_matrix_corr.csv uslci_emission: process_emissions_corr.csv uslci_process_adder: process_names_adder.csv lci_activity_locations: location.csv emissions_lci: emissions_inventory.csv traci_lci: traci21.csv state_reeds_grid_mix: state_dynamic_grid_mix.csv national_reeds_grid_mix: national_dynamic_grid_mix.csv # Files written during CELAVI runs intended only for internal or debugging use generated: costgraph_pickle: netw.obj costgraph_csv: netw.csv step_costs: step_costs.csv locs: locations_computed.csv technology_data: number_of_technology_units.csv routes_computed: routes_computed.csv intermediate_demand: intermediate_demand.csv lcia_to_des: final_lcia_results_to_des.csv lcia_shortcut_db: lca_db.csv state_electricity_lci: state_level_grid_mix.csv national_electricity_lci: national_level_grid_mix.csv # Human-readable results files for visualization and further analysis results: pathway_criterion_history: pathway_criterion_history.csv component_counts_plot: component_counts.png material_mass_plot: material_mass.png count_cumulative_histories: count_cumulative_histories.csv mass_cumulative_histories: mass_cumulative_histories.csv lcia_facility_results: lcia_locations_join.csv lcia_transpo_results: lcia_transportation.csv central_summary: central_summary.csv Scenario Config Example ------------------------ .. code-block:: yaml flags: clear_results : True # If True and results files already exist, move them to a sub-directory to avoid overwriting. compute_locations : True # If True, generate a locations datafile from raw input files (e.g., LMOP, US Wind Turbine Database). run_routes : True # If True, compute routing distances between all input locations. use_computed_routes : True # If True, read in a pre-assembled routes file INSTEAD of generating a new routes file. initialize_costgraph : True # If True, create a CostGraph instance from input data or an imported pickle file. location_filtering : False # If True, all datasets will be filtered to include only the states listed below. distance_filtering : False # If True, filter computed routes based on max distances in route_pairs file. pickle_costgraph : True # If True, saves the CostGraph instance as a pickle file. generate_step_costs : True # If True, supply chain costs for a facility type do not vary regionally. use_fixed_lifetime : True # If True, fixed lifetimes are used instead of stochastic Weibull draws. use_lcia_shortcut : True # If True, use the lca_db emission factors file instead of performing LCIA calculations where possible. scenario: name: Wind Blade EOL Management, National capacity_projection: StScen20A_MidCase_annual_state.csv states_included: seed: 13 electricity_mix_level : state runs: 1 circular_pathways: sc_begin: - manufacturing sc_end: - landfilling #sc_in_circ: sc_out_circ: - cement co-processing - next use learning: coarse grinding: component : blade initial cumul: 1.0 cumul: learn rate: -0.05 steps: - coarse grinding - coarse grinding onsite fine grinding: component : blade initial cumul: 1.0 cumul: learn rate: -0.05 steps: - fine grinding cost uncertainty: landfilling: uncertainty: m: 1.5921 b: 28.9 rotor teardown: uncertainty: m: 1467.08 b: 285.0 segmenting: uncertainty: b: 27.56 coarse grinding onsite: uncertainty: initial cost: 106 coarse grinding: uncertainty: initial cost: 106 fine grinding: uncertainty: initial cost: 143 revenue: 273 coprocessing: uncertainty: b: 10.37 segment transpo: uncertainty: cost 1: 4.35 # Before 2001; 2002-2003 cost 2: 8.70 # 2001-2002; 2003-2019 cost 3: 13.05 # 2019-2031 cost 4: 17.40 # 2031-2044 cost 5: 21.75 # 2044-2050 shred transpo: uncertainty: m: 0.0011221 b: 0.0524 manufacturing: uncertainty: b: 11440.0 path_split: fine grinding: fraction: 0.3 facility_1: landfilling facility_2: next use pass: next use permanent_lifespan_facility: - landfilling - cement co-processing - next use vkmt : component mass : year : technology_components: circular_components: - blade component_list: nacelle : 1 blade : 3 tower : 1 foundation : 1 component_materials: nacelle : - steel blade : - glass fiber - epoxy tower : - steel foundation : - concrete component_fixed_lifetimes: # Years nacelle : 30 blade : 20 foundation : 50 tower : 50 component_weibull_params: #L, K nacelle : blade : L : 240 K : 2.2 foundation : tower : substitution_rates: sand: 0.15 coal: 0.30