Skip to content

Loading data into database

First step in using ERAD is preparing the dataset to load into the database. This section describes data format and the process for loading datasets into database.

Distribution Feeder Data

One of the key dataset for simulation is distribution feeder data. You need to prepare csv files for distribution assets which will be used to bulk load the data into Neo4J database using Cypher Query. Here is a list of csv files you need to prepare for distribution system. You can also use the sample datasets available in the repository.

File Name Description Columns
buses.csv CSV file containing information about distribution system buses. Required. name, longitude, latitude, kv
loads.csv CSV file containing information about power system loads. Required. source refer to bus name from buses.csv file where load is connected to and critical_load_factor is any value between 0 and 1 that is used to determine critical demand for this load. name, kw, kvar, source, critical_load_factor
transformers.csv CSV file containing information about transformers. Required. source and target refer to bus names from buses.csv file where transformer's primary and secondary terminal is connected to. height_m is height of transformer from ground surface in meters. name, source, target, kva, height_m
line_sections.csv CSV file containing information about line sections. Required. source and target refer to bus names from buses.csv file where line segment's to and from node is connected to. height_m is height of line segment from ground surface in meters and geom_type could be either overhead or underground. name, source, target, geom_type, ampacity, height_m, num_phase
pv_systems.csv CSV file containing information about PV systems. Optional. owner refers to load name from load.csv file who owns the solar, bus refers to bus name from buses.csv file where solar is connected to and capacity refers to solar installation capacity in kW. name, owner, bus, capacity
energy_storage.csv CSV file containing information about energy storages. Optional. owner refers to load name from load.csv file who owns the energy storage, bus refers to bus name from buses.csv file where energy storage is connected to, and kw refers to maximum kw capacity for storage name, owner, bus, kw
substation.csv CSV file containing information about distribution substation. Required. name

If you are using OpenDSS models, ERAD includes utility functions to extract required csv files from the model.

Following code snippet downloads one of the SMARTDS feeder dataset and extracts asset data in csv file format in assets folder.

from erad.utils.ditto_utils import download_smartds_data
from erad.utils.opendss_utils import extract_export_opendss_model

download_smartds_data('P3R', '.')
extract_export_opendss_model('P3R__2018__SFO__oedi-data-lake_opendss_no_loadshapes/Master.dss',
                              './assets')

You will see buses.csv, capacitors.csv, line_sections.csv, load.csv, pv_systems.csv, and transformers.csv. Notice that the extraction process can generate extra files not used by ERAD and may extract more information than you need and this is okay as long as csv files have all the necessary columns.

Critical Infrastructure Data

Along with distribution feeder data, ERAD also requires critical infrastructure data. Here is a list of csv files you need to prepare for critical infrastructures. You do not need to have all the dataset and can start with any subset of these.

File Name Description Columns
medical_centers.csv CSV file containing information about medicial centers within the region of interest. Optional. gid, name, source, longitude, latitude, kw, kvar, backup_capacity_kw, backup
shelters.csv CSV file containing information about shelters. Optional. gid, name, source, longitude, latitude, kw, kvar, backup_capacity_kw, backup
pharmacies.csv CSV file containing information about pharmacies within the region of interest. Optional. gid, name, source, longitude, latitude, kw, kvar, backup_capacity_kw, backup
groceries.csv CSV file containing information about groceries within the region of interest. Optional. gid, name, source, longitude, latitude, kw, kvar, backup_capacity_kw, backup
banking.csv CSV file containing information about banks within the region of interest. Optional. gid, name, source, longitude, latitude, kw, kvar, backup_capacity_kw, backup
convenience.csv CSV file containing information about convenience stores within the region of interest. Optional. gid, name, source, longitude, latitude, kw, kvar, backup_capacity_kw, backup

gid refers to any unique global identifier, source is bus name from buses.csv file where it is connected to in the, distribution grid, name is a friendly name for infrastructure, kw is the active peak load for the infrastructure, kvaris the reactive peak load for the infrastructure, backup_capacity_kw is the backup generating source capacity for the infrastructure in kw, and backup is binary 0 or 1

One of the dataset you can use to get critical infrastructure data is HIFLD dataset.

Infrastrcuture Name Data sources
Hospitals HIFLD Open (Hospitals for all US)
Shelters HIFLD Open (National Shelter System Facilities)
Pharmacies HIFLD Open (Pharmacies for all US)
Banking HIFLD Open (FDIC Insured Banks)

Bulk loading data from csv files into Neo4J database

Once you have prepared all the csv files, you will need to copy these into /import folder if you are using Neo4J docker container. If you are using Neo4J desktop, you will need to find import folder for specific database you have created. Please take a look at the Neo4j doc for more details on how to bulk load csv files.

Here is a sample cypher queries you can use to load the data from csv files. Make sure to comment sections or update as necessary if you need to before running it in Neo4J browser. You can typically visit localhost:7474 for accessing Neo4J browser if you using Neo4J docker container if you are using Neo4J Desktop you can open Neo4J browser from the UI. It should look something like this.

LOAD CSV WITH HEADERS FROM 'file:///buses.csv' AS bus_row 
WITH bus_row WHERE bus_row.name IS NOT NULL 
MERGE (bus:Bus {name: bus_row.name, longitude: toFloat(bus_row.longitude), 
latitude: toFloat(bus_row.latitude)});

LOAD CSV WITH HEADERS FROM 'file:///loads.csv' AS load_row 
WITH load_row WHERE load_row.name IS NOT NULL 
MERGE (load:Load {name: load_row.name, kw: toFloat(load_row.kw), 
kvar: toFloat(load_row.kvar), source: load_row.source, 
critical_load_factor: toFloat(load_row.critical_load_factor)})
MERGE (load_bus:Bus {name: load_row.source})
MERGE (load)-[:CONSUMES_POWER_FROM]->(load_bus);

LOAD CSV WITH HEADERS FROM 'file:///line_sections.csv' AS line_row 
WITH line_row WHERE line_row.name IS NOT NULL 
MERGE (from_bus:Bus {name: line_row.source})
MERGE (to_bus:Bus {name: line_row.target})
MERGE (from_bus)-[:CONNECTS_TO {name: line_row.name, source: line_row.source, 
target: line_row.target,ampacity: toFloat(line_row.ampacity),
height_m: toFloat(line_row.height_m), geom_type: line_row.geom_type}]->(to_bus);

LOAD CSV WITH HEADERS FROM 'file:///transformers.csv' AS xfmr_row
WITH xfmr_row WHERE xfmr_row.name IS NOT NULL 
MERGE (from_bus:Bus {name: xfmr_row.source})
MERGE (to_bus:Bus {name: xfmr_row.target})
MERGE (from_bus)-[:CONNECTS_TO {name: xfmr_row.name, source: xfmr_row.source, 
target: xfmr_row.target, kva: xfmr_row.kva, 
height_m: toFloat(xfmr_row.height_m)}]->(to_bus);


LOAD CSV WITH HEADERS FROM 'file:///pv_systems.csv' AS pv_row
WITH pv_row WHERE pv_row.name IS NOT NULL 
MERGE (sa:Solar {capacity: toFloat(pv_row.capacity), 
    name: pv_row.name, owner: pv_row.owner})
MERGE (ba:Bus {name: pv_row.bus})
MERGE (lo:Load {name: pv_row.owner})
MERGE (sa)-[:INJECTS_ACTIVE_POWER_TO]->(ba)
MERGE (lo)-[:OWNS]->(sa);

LOAD CSV WITH HEADERS FROM 'file:///energy_storage.csv' AS es_row
WITH es_row WHERE es_row.name IS NOT NULL 
MERGE (ea:EnergyStorage {kw: toFloat(es_row.kw), name: es_row.name, 
     owner:es_row.owner})
MERGE (ba:Bus {name: es_row.bus})
MERGE (lo:Load {name: es_row.owner})
MERGE (ea)-[:INJECTS_POWER]->(ba)
MERGE (ba)-[:CONSUMES_POWER]->(ea)
MERGE (lo)-[:OWNS]->(ea);

LOAD CSV WITH HEADERS FROM 'file:///substation.csv' AS sub_row
WITH sub_row WHERE sub_row.name IS NOT NULL 
MERGE (b:Bus {name: sub_row.name}) 
SET b:Substation;

MATCH (b:Bus)-[CONSUMES_POWER]-(c:Load)
SET c.longitude = b.longitude
SET c.latitude = b.latitude;

MATCH (c:Load)
WHERE c.latitude IS NULL
DETACH DELETE c;

LOAD CSV WITH HEADERS FROM 'file:///pharmacies.csv' AS p_row
WITH p_row WHERE p_row.name IS NOT NULL 
MERGE (p:Pharmacy {name: (p_row.name + p_row.gid), source:p_row.source, 
kw:toFloat(p_row.kw), kvar:toFloat(p_row.kvar), 
backup_capacity_kw:toFloat(p_row.backup_capacity_kw),
backup: toInteger(p_row.backup), 
longitude: toFloat(p_row.longitude), latitude: toFloat(p_row.latitude)})
WITH p
     MATCH (lo:Load)
     MERGE (lo)-[:VISITS_FOR_MEDICINE {distance: point.distance(
        point({longitude: p.longitude, latitude:p.latitude}),
        point({longitude: lo.longitude, latitude:lo.latitude})
     )}]->(p);

MATCH (p:Pharmacy)
WITH p 
    MATCH (b:Bus {name: p.source})
    MERGE (b)<-[:GETS_POWER_FROM]-(p);

LOAD CSV WITH HEADERS FROM 'file:///groceries.csv' AS g_row
WITH g_row WHERE g_row.name IS NOT NULL 
MERGE (g:Grocery {name: (g_row.name + g_row.gid), source:g_row.source, 
kw:toFloat(g_row.kw), kvar:toFloat(g_row.kvar), 
backup_capacity_kw:toFloat(g_row.backup_capacity_kw),
backup: toInteger(g_row.backup),
longitude: toFloat(g_row.longitude), latitude: toFloat(g_row.latitude)})
WITH g
    MATCH (lo:Load)
    MERGE (lo)-[:VISITS_FOR_GROCERIES {distance: point.distance(
       point({longitude: g.longitude, latitude:g.latitude}),
       point({longitude: lo.longitude, latitude:lo.latitude})
        )}]->(g);

MATCH (g:Grocery)
WITH g 
    MATCH (b:Bus {name: g.source})
    MERGE (b)<-[:GETS_POWER_FROM]-(g);

LOAD CSV WITH HEADERS FROM 'file:///medical_centers.csv' AS m_row
WITH m_row WHERE m_row.name IS NOT NULL 
MERGE (m:Hospital {name: (m_row.name + m_row.gid),
source:m_row.source, kw:toFloat(m_row.kw), kvar:toFloat(m_row.kvar), 
backup_capacity_kw:toFloat(m_row.backup_capacity_kw),
longitude: toFloat(m_row.longitude), 
latitude: toFloat(m_row.latitude), backup: toInteger(m_row.backup)})
WITH m
    MATCH (lo:Load)
    MERGE (lo)-[:VISITS_DURING_HEALTH_EMERGENCY {distance: point.distance(
       point({longitude: m.longitude, latitude:m.latitude}),
       point({longitude: lo.longitude, latitude:lo.latitude})
        )}]->(m);

MATCH (h:Hospital)
WITH h 
    MATCH (b:Bus {name: h.source})
    MERGE (b)<-[:GETS_POWER_FROM]-(h);

LOAD CSV WITH HEADERS FROM 'file:///banking.csv' AS b_row
WITH b_row WHERE b_row.name IS NOT NULL 
MERGE (b:Banking {name: (b_row.name + b_row.gid),
source: b_row.source,kw:toFloat(b_row.kw), kvar:toFloat(b_row.kvar), 
backup_capacity_kw:toFloat(b_row.backup_capacity_kw),
backup: toInteger(b_row.backup),
longitude: toFloat(b_row.longitude), latitude: toFloat(b_row.latitude)})
WITH b
    MATCH (lo:Load)
    MERGE (lo)-[:VISITS_TO_WITHDRAW_OR_DEPOSIT_CURRENCY {distance: point.distance(
       point({longitude: b.longitude, latitude:b.latitude}),
       point({longitude: lo.longitude, latitude:lo.latitude})
        )}]->(b);

MATCH (b1:Banking)
WITH b1
    MATCH (b:Bus {name: b1.source})
    MERGE (b)<-[:GETS_POWER_FROM]-(b1);

LOAD CSV WITH HEADERS FROM 'file:///convenience.csv' AS c_row
WITH c_row WHERE c_row.name IS NOT NULL 
MERGE (c:Convenience {name: (c_row.name + c_row.gid), 
source:c_row.source, kw:toFloat(c_row.kw), kvar:toFloat(c_row.kvar), 
backup_capacity_kw:toFloat(c_row.backup_capacity_kw),
backup: toInteger(c_row.backup),
longitude: toFloat(c_row.longitude), latitude: toFloat(c_row.latitude)})
WITH c
    MATCH (lo:Load)
    MERGE (lo)-[:VISITS_FOR_SERVICE {distance: point.distance(
       point({longitude: c.longitude, latitude:c.latitude}),
       point({longitude: lo.longitude, latitude:lo.latitude})
        )}]->(c);

MATCH (c:Convenience)
WITH c
    MATCH (b:Bus {name: c.source})
    MERGE (b)<-[:GETS_POWER_FROM]-(c);

LOAD CSV WITH HEADERS FROM 'file:///shelters.csv' AS s_row
WITH s_row WHERE s_row.use_type IS NOT NULL 
MERGE (s:Shelter {name: (s_row.use_type + s_row.gid), 
backup: toInteger(s_row.backup), source: s_row.source, kw:toFloat(s_row.kw), 
kvar:toFloat(s_row.kvar), 
backup_capacity_kw:toFloat(s_row.backup_capacity_kw),
longitude: toFloat(s_row.longitude), latitude: toFloat(s_row.latitude)})
WITH s
    MATCH (lo:Load)
    MERGE (lo)-[:VISITS_FOR_SERVICE {distance: point.distance(
       point({longitude: s.longitude, latitude:s.latitude}),
       point({longitude: lo.longitude, latitude:lo.latitude})
    )}]->(s);

MATCH (s:Shelter)
WITH s
    MATCH (b:Bus {name: s.source})
    MERGE (b)<-[:GETS_POWER_FROM]-(s);

MATCH (b1:Bus)-[r:CONNECTS_TO]-(b2:Bus)
SET r.longitude = (b1.longitude + b2.longitude)/2
SET r.latitude = (b1.latitude + b2.latitude)/2;

MATCH p=(sourceNode:Bus)-[r:CONNECTS_TO]-(targetNode:Bus)
WHERE r.ampacity IS NOT NULL
WITH r,sourceNode, CASE r.num_phase
WHEN 3 THEN 1.732
ELSE 1
END AS multiplier
SET r.kva = multiplier*r.ampacity*sourceNode.kv;

MATCH p=(sourceNode:Bus)-[r:CONSUMES_POWER_FROM]-(targetNode:Load)
SET targetNode.kw = toFloat(targetNode.kw)
SET targetNode.kvar = toFloat(targetNode.kvar)
SET r.kva = sqrt(targetNode.kw*targetNode.kw+targetNode.kvar*targetNode.kvar);

MATCH p=(sourceNode:Bus)-[r:INJECTS_ACTIVE_POWER_TO]-(targetNode:Solar)
SET targetNode.capacity = toFloat(targetNode.capacity)
SET r.kva = targetNode.capacity;


MATCH p=(sourceNode:Bus)-[r:INJECTS_POWER]-(targetNode:EnergyStorage)
SET targetNode.kw = toFloat(targetNode.kw)
SET r.kva = targetNode.kw;

MATCH (sourceNode:Bus)-[r:CONNECTS_TO]-(targetNode:Bus)
SET r.kva = toFloat(r.kva);

MATCH p=()-[r:GETS_POWER_FROM]->()
WHERE r.kva is null
SET r.kva = 300;

MATCH p=()-[r:CONNECTS_TO]->() 
WHERE r.kva IS null
SET r.kva = 300;