Configuration Files

The Case Study config file only needs to be created once per supply chain or per case study.

Several versions of the Scenario config file can be created per case study to explore the impacts of different cost and other scenarios.

Case Study Config Template

model_run:
        start_year:          # Integer; Calendar year
        end_year:            # Integer; Calendar year
        timesteps_per_year:  # Integer; Units = timesteps
        min_lifespan:        # Integer; Units = timesteps
        lcia_update:         # Integer; Units = timesteps
        cg_update:           # Integer; Units = timesteps
        cg_verbose:          # Integer <= 2
        save_cg_csv:         # Boolean

directories:
        # The required directories should all exist in the same directory where the config files are located.
        inputs_to_preprocessing: # Required
        inputs_optional:         # Not required
        inputs:                  # Required
        generated:               # Not required
        results:                 # Not required

files:
        # All file names must include the extension, which is ".csv" unless otherwise noted.

        # Datasets that are preprocessed and/or used to generate input datasets.
        inputs_to_preprocessing:
                transportation_graph:
                node_locs:
                power_plant_locs:
                landfill_locs:
                other_facility_locs:
                capacity_projection: # This parameter should remain blank - it will be filled in with a value from the Scenario config file.

        # Datasets that can be provided as alternatives to programmatically generated datasets.
        inputs_optional:
                step_costs_custom: # An alternative to the generated step_costs file
                routes_custom:     # An alternative to the generated routes_computed file
                stock_filename:    # .p

        # Input datasets that do not require preprocessing
        inputs:
                lookup_facility_type:
                lookup_step_costs:
                lookup_steps:
                lookup_transpo_cost_methods:
                lookup_step_cost_methods:
                fac_edges:
                transpo_edges:
                route_pairs:
                component_material_mass:
                static_lci:
                uslci_tech:
                uslci_emission:
                uslci_process_adder:
                lci_activity_locations:
                emissions_lci:
                traci_lci:
                state_reeds_grid_mix:
                national_reeds_grid_mix:

        # Datasets and files generated internally as data storage and/or used for debugging.
        generated:
                costgraph_pickle: # .obj
                costgraph_csv:
                step_costs:
                locs:
                technology_data:
                routes_computed:
                intermediate_demand:
                lcia_to_des:
                lcia_shortcut_db:
                state_electricity_lci:
                national_electricity_lci:

        # Human-readable results files for diagnostic visualization and further analysis
        results:
                pathway_criterion_history:
                component_counts_plot: # .png
                material_mass_plot: # .png
                count_cumulative_histories:
                mass_cumulative_histories:
                lcia_facility_results:
                lcia_transpo_results:
                central_summary:
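
As a rough illustration of how the directories and files sections fit together, the sketch below joins each file name to its directory, relative to the folder that holds the config files, and checks the rule that file names include an extension. The function name resolve_case_study_paths and the plain-dictionary inputs are assumptions made for this example; they are not part of CELAVI.

```python
from pathlib import Path


def resolve_case_study_paths(config_dir, directories, files):
    """Return {file_key: Path}, skipping entries left blank in the config.

    Hypothetical helper: it mirrors the template layout, not CELAVI internals.
    """
    resolved = {}
    for group, entries in files.items():
        group_dir = Path(config_dir) / directories[group]
        for key, name in entries.items():
            if name is None:  # blank YAML value, e.g. capacity_projection
                continue
            if not Path(name).suffix:
                raise ValueError(f"{group}.{key}: '{name}' has no file extension")
            resolved[key] = group_dir / name
    return resolved


directories = {"inputs": "inputs/", "generated": "generated/"}
files = {
    "inputs": {"route_pairs": "route_pairs.csv"},
    "generated": {"costgraph_pickle": "netw.obj", "step_costs": None},
}
paths = resolve_case_study_paths("case_study", directories, files)
```

Blank entries are skipped rather than rejected, because some keys in the template are meant to stay empty.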

Scenario Config Template

The cost uncertainty dictionary (an element of the circular_pathways dictionary) structure can be adjusted based on the modeling requirements of a particular case study. The structure here can apply to cost models that depend linearly on time and can take on random or array-based uncertainty.

flags:
        clear_results         :   # If True and results files already exist, move them to a sub-directory to avoid overwriting.
        compute_locations     :   # If True, generate a locations datafile from raw input files (e.g., LMOP, US Wind Turbine Database).
        run_routes            :   # If True, compute routing distances between all input locations.
        use_computed_routes   :   # If True, read in a pre-assembled routes file INSTEAD of generating a new routes file.
        initialize_costgraph  :   # If True, create a CostGraph instance from input data or an imported pickle file.
        location_filtering    :   # If True, all datasets will be filtered to include only the states listed below.
        distance_filtering    :   # If True, filter computed routes based on max distances in route_pairs file.
        pickle_costgraph      :   # If True, saves the CostGraph instance as a pickle file.
        generate_step_costs   :   # If True, supply chain costs for a facility type do not vary regionally.
        use_fixed_lifetime    :   # If True, fixed lifetimes are used instead of stochastic Weibull draws.
        use_lcia_shortcut     :   # If True, use the lca_db emission factors file instead of performing LCIA calculations where possible.

scenario:
        name:                    # Scenario name
        capacity_projection:     # Name of file with scenario-specific capacity projection data.
        states_included:         # List of U.S. states to optionally filter facility locations.
        seed:                    # Random number generator seed
        electricity_mix_level :  # Specify disaggregation for electricity grid mix data: "state" or "national"
        runs:                    # Number of model runs within this scenario to execute.

circular_pathways:
        sc_begin:               # Facility type where the supply chain "begins". Typically manufacturing or resource extraction.
        sc_end:                 # List of facility types where the supply chain "ends".
        sc_in_circ:             # List of inflow circularity facility types that provide secondary material to the supply chain.
        sc_out_circ:            # List of outflow circularity facility types that take in secondary material for recirculation.
        learning:               # Dictionary of parameters for industrial learning-by-doing cost models.
                [facility type]:    # Facility type to which this learning cost model applies. Repeat this block for every facility type with a learning model.
                        component :     # String; component type(s).
                        initial cumul:  # Initial cumulative production for this technology.
                        cumul:          # Leave blank: this value is filled in and updated during simulation.
                        initial cost:   # Processing cost (USD/mass) at the beginning of the model run.
                        learn rate:     # Rate at which industrial learning-by-doing reduces costs. Must be negative.
                        steps:          # List of processing steps where this cost model is applied.
        cost uncertainty:       # Dictionary of probability distribution parameters for cost models.
                [process step]:     # Name of process step for the cost model.
                        uncertainty:    # random or array to implement uncertainty; leave blank for no uncertainty.
                        c:              # c, loc, scale: Probability distribution parameter(s) for random uncertainty type; can be re-named depending on distribution. See https://docs.scipy.org/doc/scipy/reference/stats.html.
                        loc:
                        scale:
                        value:          # Leave blank: random draws are stored here during each model run.
                        m:              # m, b: Cost model parameter(s) for array uncertainty type; can be scalars or lists of equal length.
                        b:
        path_split:             # Dictionary defining any process steps where the material stream splits, e.g. for material losses.
                [process step]:     # Name of process step where split occurs.
                        fraction:       # Float or list of floats; fraction of material sent to facility_1 type
                        facility_1:     # Downstream facility type where fraction of material is sent.
                        facility_2:     # Downstream facility type where 1 - fraction of material is sent.
                pass:               # Facility type(s) to ignore in DES because material was sent there during the split.
        permanent_lifespan_facility:  # Facility type(s) where material accumulates (e.g. landfills).
        vkmt :                        # Leave blank: this value is updated during simulation.
        component mass :              # Leave blank: this value is updated during simulation.
        year :                        # Leave blank: this value is updated during simulation.


technology_components:         # Dictionary of information about the composition of a technology unit.
        circular_components:       # List of technology components involved in the circular supply chain.
        component_list:            # Dictionary of all technology components and the number of components in each unit.
        component_materials:       # Dictionary listing the constituent materials in each component.
        component_fixed_lifetimes: # Dictionary with fixed lifetimes (years) of each component.
        component_weibull_params:  # Dictionary with Weibull distribution parameters (L, K) of each component lifetime.
        substitution_rates:        # Dictionary of materials substituted by circular components/materials and the substitution rates (kg/kg).

Scenario Flags

The set of Boolean flags at the top of the scenario configuration file controls much of the preprocessing done to set up a CELAVI simulation. Additional explanations for each flag are provided here.

  • clear_results
    • When CELAVI is executed multiple times on the same machine, it will produce one or more sets of output files in the results directory (one set of results files is produced per model run). Set clear_results to True if you expect to be executing CELAVI more than once and do not want the results of each execution to be overwritten.

    • Results from the most recent CELAVI execution are always found in the results directory.

    • When clear_results is True, every CELAVI execution after the first one will produce an additional directory of results files, with “results-” and the current timestamp in the directory name. The new directory contains the output files from the previous CELAVI execution.
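
The archiving behavior described for clear_results can be sketched as follows. This is a minimal illustration, not CELAVI's implementation: the function name archive_previous_results, the timestamp format, and the choice to place the archive inside the results directory are all assumptions.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def archive_previous_results(results_dir):
    """Move existing result files into a 'results-<timestamp>' sub-directory."""
    results_dir = Path(results_dir)
    old_files = [p for p in results_dir.iterdir() if p.is_file()]
    if not old_files:
        return None  # first execution: nothing to archive
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")  # assumed format
    archive_dir = results_dir / f"results-{stamp}"
    archive_dir.mkdir()
    for path in old_files:
        shutil.move(str(path), str(archive_dir / path.name))
    return archive_dir


with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp)
    first = archive_previous_results(results)  # nothing to move yet
    (results / "central_summary.csv").write_text("run,value\n0,1.0\n")
    archive = archive_previous_results(results)  # moves the previous output
    archived_names = sorted(p.name for p in archive.iterdir())
    remaining = [p.name for p in results.iterdir() if p.is_file()]
```

After the second call the results directory holds no loose files, so the next execution can write fresh output without overwriting anything.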

  • compute_locations
    • This flag controls whether the facility location and type dataset is assembled from raw location files before supply chain routes are found or the simulation begins.

    • If you have already manually assembled the facility location and type dataset for your supply chain, then this flag can be set to False. However, if the facility information to be used in your supply chain is coming from a database such as the U.S. Wind Turbine Database or the Landfill Methane Outreach Program, then setting compute_locations to True will assemble the complete facility dataset.

  • run_routes
    • When run_routes is True, the facility locations and route pairs datasets are used to identify pairs of facilities between which materials will be transported. The Router module is then used to calculate minimum-distance (on-road) routes between each facility pair.

    • Generating routes for a multi-state or national supply chain can be time consuming, depending on the number of facilities in a supply chain. If the underlying facility locations dataset is stable, then run_routes need be True only for one CELAVI execution. Future executions will use the same set of routes and there is no need to re-generate the routes dataset.

  • use_computed_routes
    • The user can bypass the built-in Router module and supply a custom routes dataset by setting use_computed_routes to False. In this case, the filename with the custom routes dataset must also be provided in the Case Study configuration file.

    • If run_routes is True, then use_computed_routes should also generally be True, unless the user is comparing results from two different routes datasets.
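
The interaction between use_computed_routes and the two routes file entries in the Case Study config can be summarized in a few lines. The function select_routes_file is a hypothetical helper written for this example, and it assumes the config has already been loaded into nested dictionaries.

```python
def select_routes_file(flags, files):
    """Pick the routes dataset implied by the use_computed_routes flag."""
    if flags["use_computed_routes"]:
        # Use the dataset produced by the built-in Router module.
        return files["generated"]["routes_computed"]
    custom = files["inputs_optional"].get("routes_custom")
    if not custom:
        raise ValueError(
            "use_computed_routes is False but routes_custom is blank"
        )
    return custom


files = {
    "generated": {"routes_computed": "routes_computed.csv"},
    "inputs_optional": {"routes_custom": "routes.csv"},
}
computed = select_routes_file({"use_computed_routes": True}, files)
custom = select_routes_file({"use_computed_routes": False}, files)
```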

  • initialize_costgraph
    • The Cost Graph model is initialized from the facility locations dataset, the routes dataset, and several other datasets that define how facilities in the supply chain are interconnected.

    • While initializing the Cost Graph can be time consuming, it is recommended to keep initialize_costgraph set to True unless CELAVI is being executed with one model run per simulation and no changes in the input datasets or parameters are being made between executions.

    • When executing multiple runs per scenario, the Cost Graph model is initialized only once, so initialize_costgraph should be True in this case.

  • location_filtering
    • This flag can be used in combination with the states_included list under the scenario dictionary to filter down large input datasets to include only certain U.S. states (region_id_2, in the input datasets). One set of (for example) national-scale data can then be defined and filtered as needed, rather than developing separate datasets.

    • If location_filtering is True but there are no states listed under states_included, then a warning is printed and no filtering is performed. If location_filtering is False, then no filtering is performed even if states are listed under states_included.

    • Both the processed facility locations dataset and the routes dataset are filtered with this flag.
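
The three cases described for location_filtering (filter, warn and skip, or skip silently) can be sketched with plain Python records. The region_id_2 column comes from the text above. The function name, the facility_id column, and the use of two-letter state codes are assumptions made for this example.

```python
import warnings


def filter_by_state(rows, location_filtering, states_included):
    """Keep only rows whose region_id_2 is in states_included."""
    if not location_filtering:
        return list(rows)  # flag off: the states list is ignored
    if not states_included:
        warnings.warn(
            "location_filtering is True but states_included is empty; "
            "no filtering performed"
        )
        return list(rows)
    keep = set(states_included)
    return [row for row in rows if row["region_id_2"] in keep]


facilities = [
    {"facility_id": 1, "region_id_2": "CO"},
    {"facility_id": 2, "region_id_2": "TX"},
    {"facility_id": 3, "region_id_2": "IA"},
]
colorado_only = filter_by_state(facilities, True, ["CO"])
unfiltered = filter_by_state(facilities, False, ["CO"])
```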

  • distance_filtering
    • When distance_filtering is True, the route pairs dataset is used to filter down the routes file and Cost Graph edges based on the vkmt_max column. This allows users to set a transportation distance limit, for instance for transportation to landfills, without having to manually remove unrealistically lengthy routes.

    • Some care should be taken in using distance_filtering and in setting the vkmt_max values. It’s possible to filter out routes that must be included for the supply chain to be complete (e.g. routes to a power plant from a manufacturing facility), and in this case the filtering will produce an error during the CELAVI execution.

    • Any blank values in the vkmt_max column will be backfilled with a sufficiently large number that no routes will be filtered out, allowing for only routes between specific facility pairs to be filtered based on distance.
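
A minimal sketch of distance filtering with blank limits backfilled is shown below. Only the vkmt_max column name comes from the text above. The other column names (source_facility_type, destination_facility_type, vkmt) and the use of infinity as the "sufficiently large number" are assumptions for illustration.

```python
NO_LIMIT = float("inf")  # stand-in for the large backfill value


def filter_routes_by_distance(routes, route_pairs):
    """Drop routes longer than the vkmt_max of their facility pair."""
    limits = {}
    for pair in route_pairs:
        key = (pair["source_facility_type"], pair["destination_facility_type"])
        limit = pair.get("vkmt_max")
        # Blank limits are backfilled so that no route of this pair is removed.
        limits[key] = NO_LIMIT if limit in (None, "") else float(limit)
    kept = []
    for route in routes:
        key = (route["source_facility_type"], route["destination_facility_type"])
        if route["vkmt"] <= limits.get(key, NO_LIMIT):
            kept.append(route)
    return kept


routes = [
    {"source_facility_type": "power plant", "destination_facility_type": "landfilling", "vkmt": 150.0},
    {"source_facility_type": "power plant", "destination_facility_type": "landfilling", "vkmt": 900.0},
    {"source_facility_type": "manufacturing", "destination_facility_type": "power plant", "vkmt": 2500.0},
]
route_pairs = [
    {"source_facility_type": "power plant", "destination_facility_type": "landfilling", "vkmt_max": "500"},
    {"source_facility_type": "manufacturing", "destination_facility_type": "power plant", "vkmt_max": ""},
]
kept = filter_routes_by_distance(routes, route_pairs)
```

Here only the 900 km landfill route is removed; the long manufacturing route survives because its limit was left blank.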

  • pickle_costgraph
    • When True, the pickle_costgraph flag will save (pickle) a copy of the initialized Cost Graph model as a Python object that can be examined or used outside the CELAVI execution. This can be useful for multiple repeated CELAVI executions.

  • generate_step_costs
    • The step costs dataset assigns processing cost methods (models) to every facility in the supply chain. Depending on how the processing costs vary with space and with facility, users may want to manually generate the step costs dataset or generate it automatically by setting generate_step_costs to True.

    • If this flag is True, the assumption is that processing costs do not vary with facility location, and more broadly that there is one (set of) processing cost methods per facility type. If there are multiple processing cost methods for a single facility type (for instance, separate landfill tipping fee models by U.S. state or county), then generate_step_costs must be set to False and the step costs dataset generated manually.

  • use_fixed_lifetime
    • Technology components remain “in use” for a period of time before entering the end-of-life phase. The time “in use” is the component lifetime, which for each component type can be modeled either as a fixed value or as random draws from a Weibull distribution. Both the fixed values and the Weibull parameters are defined by component type in the Scenario configuration file.

    • Set use_fixed_lifetime to True to use a fixed, deterministic lifetime for every technology component, or set to False to generate lifetimes from the Weibull distributions.

    • If use_fixed_lifetime is set to False, it is recommended that users also set the seed value under the scenario dictionary. This will generate stochastic results that are reproducible in repeated CELAVI executions.
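
The choice between fixed and Weibull lifetimes, and the role of the seed, can be illustrated with the standard library. This is a sketch under stated assumptions: L is treated as the Weibull scale and K as the shape, L is taken to be in timesteps, and fixed lifetimes are converted from years to timesteps. The function name draw_lifetime is hypothetical.

```python
import random


def draw_lifetime(component, config, use_fixed_lifetime, rng, timesteps_per_year=12):
    """Return a component lifetime in timesteps."""
    if use_fixed_lifetime:
        years = config["component_fixed_lifetimes"][component]
        return years * timesteps_per_year
    params = config["component_weibull_params"][component]
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    return rng.weibullvariate(params["L"], params["K"])


config = {
    "component_fixed_lifetimes": {"blade": 20},
    "component_weibull_params": {"blade": {"L": 240, "K": 2.2}},
}
fixed = draw_lifetime("blade", config, True, random.Random(13))
# Two generators built from the same seed give the same stochastic draw.
draw_a = draw_lifetime("blade", config, False, random.Random(13))
draw_b = draw_lifetime("blade", config, False, random.Random(13))
```

Seeding the generator is what makes draw_a and draw_b identical, which mirrors the recommendation to set seed when use_fixed_lifetime is False.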

  • use_lcia_shortcut
    • Repeatedly performing LCIA calculations can lengthen CELAVI run time considerably. To speed up the calculations, use_lcia_shortcut can be set to True to use precomputed emission factors stored in a local file. If this file does not yet exist, then LCIA calculations are performed normally and the file is populated with emission factors as they are calculated.

    • When performing multiple model runs in a single CELAVI execution, it is strongly recommended to set use_lcia_shortcut to True to shorten the run time.

    • After changes to the scenario parameters or to the input datasets, it is recommended to delete the local emission factors file to avoid using incorrect factors.
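
The shortcut amounts to a file-backed cache: look up a factor, and compute and store it only on a miss. The sketch below shows that pattern with a small CSV file. The two-column layout (key, factor), the function name emission_factor, and the stand-in slow_lcia calculation are assumptions; the actual emission factors file may be organized differently.

```python
import csv
import tempfile
from pathlib import Path


def emission_factor(key, cache_file, compute):
    """Return a cached factor, computing and storing it on a cache miss."""
    cache_file = Path(cache_file)
    cache = {}
    if cache_file.exists():
        with cache_file.open(newline="") as f:
            cache = {row["key"]: float(row["factor"]) for row in csv.DictReader(f)}
    if key in cache:
        return cache[key]
    factor = compute(key)  # the expensive LCIA step
    is_new = not cache_file.exists()
    with cache_file.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["key", "factor"])
        writer.writerow([key, factor])
    return factor


calls = []


def slow_lcia(key):
    """Stand-in for a full LCIA calculation; records each invocation."""
    calls.append(key)
    return 2.5


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "lca_db.csv"
    first = emission_factor("landfilling", db, slow_lcia)   # computed and stored
    second = emission_factor("landfilling", db, slow_lcia)  # read from the file
```

Because cached factors are reused without recomputation, a stale file silently carries old values forward, which is why deleting it after input changes is recommended.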

Cost Uncertainty Modeling

There is a great deal of flexibility in how uncertainty is defined within the cost models. This leads to many possible versions of the “cost uncertainty” dictionary within the Scenario YAML file. This section discusses the three main options for implementing uncertainty and gives examples of how to define each type of uncertainty within CELAVI.

No uncertainty: In this case, there is no uncertainty represented in a cost model. Scalar values are defined for each cost model parameter, and a single run is sufficient to quantify the results. The uncertainty key within the cost model dictionary is left blank, and whatever parameters the cost model requires are defined as floats. For example, the landfilling cost model, which is represented as a linear equation with slope m and y-intercept b, has the following dictionary when no uncertainty is represented:

cost uncertainty:
        landfilling:
                uncertainty: # Left blank
                m: 1.5921    # Single, scalar value for slope parameter
                b: 28.9      # Single, scalar value for y-intercept parameter

Array- or range-based uncertainty: In this case, parameters with uncertainty are defined with lists of floats, and one model run is executed per element of that list. When modeling this type of uncertainty in multiple parameters simultaneously, care must be taken that the lists of parameter values are all of the same length and that the number of runs to execute is equal to this length. An error will be thrown if more runs are executed than there are elements in the parameter lists or if the lists are of unequal length, and the simulation will not complete. The landfilling cost model dictionary has the following structure when array-based uncertainty is implemented for the slope parameter m:

cost uncertainty:
        landfilling:
                uncertainty: array
                m:
                - 0.0
                - 0.64
                - 1.27
                - 1.91
                - 2.55
                - 3.18
                b: 28.9

If both the m and b parameters are modeled with array-based uncertainty, the dictionary would be as follows. Note that both parameters have value lists of length 6. The runs parameter under the scenario dictionary in this case would have to be set to 6 as well.

cost uncertainty:
        landfilling:
                uncertainty: array
                m:
                - 0.0
                - 0.64
                - 1.27
                - 1.91
                - 2.55
                - 3.18
                b:
                - 0.0
                - 11.56
                - 23.12
                - 34.68
                - 46.24
                - 57.8
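
The length and run-count rules for array-based uncertainty can be sketched as a small selection function. The name parameters_for_run and the zero-based run index are assumptions for this example; it only illustrates the checks described above.

```python
def parameters_for_run(cost_model, run, n_runs):
    """Return scalar parameter values for one model run (0-indexed)."""
    params = {k: v for k, v in cost_model.items() if k != "uncertainty"}
    if cost_model.get("uncertainty") != "array":
        return params  # no array uncertainty: values are already scalars
    lengths = {len(v) for v in params.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError("parameter lists must all have the same length")
    if lengths and n_runs > lengths.pop():
        raise ValueError("more runs requested than parameter values provided")
    # Lists contribute their run-th element; scalars are reused in every run.
    return {k: (v[run] if isinstance(v, list) else v) for k, v in params.items()}


landfilling = {
    "uncertainty": "array",
    "m": [0.0, 0.64, 1.27, 1.91, 2.55, 3.18],
    "b": 28.9,
}
run_2 = parameters_for_run(landfilling, run=2, n_runs=6)
```

For the third run (index 2), m takes its third listed value while the scalar b is unchanged.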

Stochastic uncertainty: Using this type of uncertainty requires defining probability distributions on the cost model parameters. By default, CELAVI uses triangular distributions with parameters c, loc, and scale. These distribution parameters must be defined as scalars, and a blank key called value must also be included. The cost model parameter value, once drawn from the distribution, is stored under value for the duration of a model run. The landfilling cost model dictionary with stochastic uncertainty on both m and b has the following structure:

cost uncertainty:
    landfilling:
        uncertainty: stochastic
        m:
            c: 0.430
            loc: 0.0
            scale: 3.704
            value:
        b:
            c: 0.430
            loc: 0.0
            scale: 67.244
            value:

Note that the m and b parameters are no longer defined explicitly when using stochastic uncertainty.
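
To show how c, loc, and scale define a triangular distribution, the sketch below draws parameter values using only the standard library. It follows the scipy.stats.triang parameterization, in which the distribution spans loc to loc + scale with its mode at loc + c * scale. The function names are hypothetical, and storing the draw under value simply mimics the behavior described above.

```python
import random


def draw_triangular(c, loc, scale, rng):
    """Draw from a triangular distribution given scipy-style c, loc, scale."""
    low, high = loc, loc + scale
    mode = loc + c * scale
    return rng.triangular(low, high, mode)


def draw_cost_parameters(cost_model, rng):
    """Draw each parameter once and store the draw under its 'value' key."""
    drawn = {}
    for name, dist in cost_model.items():
        if name == "uncertainty":
            continue
        dist["value"] = draw_triangular(dist["c"], dist["loc"], dist["scale"], rng)
        drawn[name] = dist["value"]
    return drawn


landfilling = {
    "uncertainty": "stochastic",
    "m": {"c": 0.430, "loc": 0.0, "scale": 3.704, "value": None},
    "b": {"c": 0.430, "loc": 0.0, "scale": 67.244, "value": None},
}
draws = draw_cost_parameters(landfilling, random.Random(13))
```

With these parameters, m is drawn from the interval 0 to 3.704 and b from 0 to 67.244, and reusing the same seed reproduces the same draws.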

Case Study Config Example

model_run:
        start_year: 2000
        end_year: 2051
        timesteps_per_year: 12
        min_lifespan: 120 # timesteps
        lcia_update: 12 # timesteps
        lcia_verbose: 0
        cg_update: 12 # timesteps
        cg_verbose: 1
        save_cg_csv: True

directories:
        inputs_to_preprocessing: inputs_to_preprocessing/
        inputs_optional: inputs_optional/
        inputs: inputs/
        generated: generated/
        results: results/

files:
        # Files that must be processed to create CELAVI input files
        inputs_to_preprocessing:
                transportation_graph: transportation_graph.csv
                node_locs: node_locations.csv
                power_plant_locs: uswtdb_v4_1_20210721.csv
                landfill_locs: landfilllmopdata.csv
                other_facility_locs: other_facility_locations_all_us.csv
                capacity_projection:

        # Inputs that are alternatives to programmatically generated inputs
        inputs_optional:
                step_costs_custom: step_costs_custom.csv # an alternative to the generated step_costs file
                routes_custom: routes.csv # an alternative to the generated routes_computed file
                stock_filename: stock_filename.p

        # Files used directly as CELAVI inputs
        inputs:
                lookup_facility_type: facility_type.csv
                lookup_step_costs: step_costs_default.csv
                lookup_steps: step.csv
                lookup_transpo_cost_methods: transpo_cost_method.csv
                lookup_step_cost_methods: step_cost_method.csv
                fac_edges: fac_edges.csv
                transpo_edges: transpo_edges.csv
                route_pairs: route_pairs.csv
                component_material_mass: avgmass.csv
                static_lci: foreground_process_inventory.csv
                uslci_tech: tech_matrix_corr.csv
                uslci_emission: process_emissions_corr.csv
                uslci_process_adder: process_names_adder.csv
                lci_activity_locations: location.csv
                emissions_lci: emissions_inventory.csv
                traci_lci: traci21.csv
                state_reeds_grid_mix: state_dynamic_grid_mix.csv
                national_reeds_grid_mix: national_dynamic_grid_mix.csv

        # Files written during CELAVI runs intended only for internal or debugging use
        generated:
                costgraph_pickle: netw.obj
                costgraph_csv: netw.csv
                step_costs: step_costs.csv
                locs: locations_computed.csv
                technology_data: number_of_technology_units.csv
                routes_computed: routes_computed.csv
                intermediate_demand: intermediate_demand.csv
                lcia_to_des: final_lcia_results_to_des.csv
                lcia_shortcut_db: lca_db.csv
                state_electricity_lci: state_level_grid_mix.csv
                national_electricity_lci: national_level_grid_mix.csv

        # Human-readable results files for visualization and further analysis
        results:
                pathway_criterion_history: pathway_criterion_history.csv
                component_counts_plot: component_counts.png
                material_mass_plot: material_mass.png
                count_cumulative_histories: count_cumulative_histories.csv
                mass_cumulative_histories: mass_cumulative_histories.csv
                lcia_facility_results: lcia_locations_join.csv
                lcia_transpo_results: lcia_transportation.csv
                central_summary: central_summary.csv

Scenario Config Example

flags:
  clear_results         : True   # If True and results files already exist, move them to a sub-directory to avoid overwriting.
  compute_locations     : True   # If True, generate a locations datafile from raw input files (e.g., LMOP, US Wind Turbine Database).
  run_routes            : True   # If True, compute routing distances between all input locations.
  use_computed_routes   : True   # If True, read in a pre-assembled routes file INSTEAD of generating a new routes file.
  initialize_costgraph  : True   # If True, create a CostGraph instance from input data or an imported pickle file.
  location_filtering    : False  # If True, all datasets will be filtered to include only the states listed below.
  distance_filtering    : False  # If True, filter computed routes based on max distances in route_pairs file.
  pickle_costgraph      : True   # If True, saves the CostGraph instance as a pickle file.
  generate_step_costs   : True   # If True, supply chain costs for a facility type do not vary regionally.
  use_fixed_lifetime    : True   # If True, fixed lifetimes are used instead of stochastic Weibull draws.
  use_lcia_shortcut     : True   # If True, use the lca_db emission factors file instead of performing LCIA calculations where possible.


scenario:
        name: Wind Blade EOL Management, National
        capacity_projection: StScen20A_MidCase_annual_state.csv
        states_included:
        seed: 13
        electricity_mix_level : state
        runs: 1

circular_pathways:
        sc_begin:
        - manufacturing
        sc_end:
        - landfilling
        #sc_in_circ:
        sc_out_circ:
        - cement co-processing
        - next use
        learning:
                coarse grinding:
                        component : blade
                        initial cumul: 1.0
                        cumul:
                        learn rate: -0.05
                        steps:
                        - coarse grinding
                        - coarse grinding onsite
                fine grinding:
                        component : blade
                        initial cumul: 1.0
                        cumul:
                        learn rate: -0.05
                        steps:
                        - fine grinding
        cost uncertainty:
                landfilling:
                        uncertainty:
                        m: 1.5921
                        b: 28.9
                rotor teardown:
                        uncertainty:
                        m: 1467.08
                        b: 285.0
                segmenting:
                        uncertainty:
                        b: 27.56
                coarse grinding onsite:
                        uncertainty:
                        initial cost: 106
                coarse grinding:
                        uncertainty:
                        initial cost: 106
                fine grinding:
                        uncertainty:
                        initial cost: 143
                        revenue: 273
                coprocessing:
                        uncertainty:
                        b: 10.37
                segment transpo:
                        uncertainty:
                        cost 1: 4.35 # Before 2001; 2002-2003
                        cost 2: 8.70 # 2001-2002; 2003-2019
                        cost 3: 13.05 # 2019-2031
                        cost 4: 17.40 # 2031-2044
                        cost 5: 21.75 # 2044-2050
                shred transpo:
                        uncertainty:
                        m: 0.0011221
                        b: 0.0524
                manufacturing:
                        uncertainty:
                        b: 11440.0
        path_split:
                fine grinding:
                        fraction: 0.3
                        facility_1: landfilling
                        facility_2: next use
                pass:
                        next use
        permanent_lifespan_facility:
        - landfilling
        - cement co-processing
        - next use
        vkmt :
        component mass :
        year :


technology_components:
        circular_components:
        - blade
        component_list:
                nacelle : 1
                blade : 3
                tower : 1
                foundation : 1
        component_materials:
                nacelle :
                - steel
                blade :
                - glass fiber
                - epoxy
                tower :
                - steel
                foundation :
                - concrete
        component_fixed_lifetimes: # Years
                nacelle : 30
                blade : 20
                foundation : 50
                tower : 50
        component_weibull_params: #L, K
                nacelle :
                blade :
                        L : 240
                        K : 2.2
                foundation :
                tower :
        substitution_rates:
                sand: 0.15
                coal: 0.30