Bulk Operations with PlexosDB¶
This guide demonstrates how to efficiently perform bulk operations using PlexosDB, which can significantly improve performance when working with large datasets.
Bulk Inserting Properties¶
When you need to add multiple properties to multiple objects, using individual add_property calls can be inefficient. The add_properties_from_records method provides a much more efficient approach.
Basic Usage¶
from plexosdb import PlexosDB
from plexosdb.enums import ClassEnum, CollectionEnum
# Initialize the database
db = PlexosDB()
db.create_schema()
# Create the objects first
db.add_object(ClassEnum.Generator, "Generator1")
db.add_object(ClassEnum.Generator, "Generator2")
db.add_object(ClassEnum.Generator, "Generator3")
# Prepare the property records
records = [
    {
        "name": "Generator1",
        "Max Capacity": 100.0,
        "Min Stable Level": 20.0,
        "Heat Rate": 10.5,
    },
    {
        "name": "Generator2",
        "Max Capacity": 150.0,
        "Min Stable Level": 30.0,
        "Heat Rate": 9.8,
    },
    {
        "name": "Generator3",
        "Max Capacity": 200.0,
        "Min Stable Level": 40.0,
        "Heat Rate": 8.7,
    },
]
# Add all properties in bulk
db.add_properties_from_records(
    records,
    object_class=ClassEnum.Generator,
    collection=CollectionEnum.Generators,
    scenario="Base Case",
)
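Records like the ones above often start out as tabular data. The following is a minimal sketch, using only the Python standard library, of turning CSV text into the list-of-dicts shape shown in this section. The helper records_from_csv and the sample CSV are hypothetical and not part of PlexosDB; the sketch also assumes every non-name column holds a numeric value.

```python
import csv
import io


def records_from_csv(text, name_column="name"):
    """Turn CSV text into one dict per object, converting values to float."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {name_column: row[name_column]}
        for column, value in row.items():
            if column == name_column or value in (None, ""):
                continue  # skip the name column and empty cells
            record[column] = float(value)
        records.append(record)
    return records


# Hypothetical sample data mirroring the records in this guide.
csv_text = """name,Max Capacity,Min Stable Level,Heat Rate
Generator1,100.0,20.0,10.5
Generator2,150.0,30.0,9.8
Generator3,200.0,40.0,8.7
"""
records = records_from_csv(csv_text)
```

The resulting list has the same structure as the hand-written records and could be passed to add_properties_from_records in the same way.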
Performance Considerations¶
The add_properties_from_records method processes records in batches (default 10,000 records per batch) and uses SQLite transactions to maximize performance. This makes it much faster than individual property insertions, especially for large datasets.
Key performance features:
Single transaction for all insertions (atomic operations)
Batch processing to control memory usage
Direct SQL execution with prepared statements
Automatic property enablement (sets the is_dynamic and is_enabled flags)
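To illustrate why batching inside a single transaction helps, here is a sketch using Python's built-in sqlite3 module. It is not PlexosDB's implementation: the demo_data table, its columns, and the insert_in_batches helper are made up for this example. It shows only the general technique of feeding executemany with fixed-size batches while committing once at the end.

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, rows, batch_size=10_000):
    """Insert (object, property, value) rows in batches inside one transaction."""
    sql = "INSERT INTO demo_data (object_name, property_name, value) VALUES (?, ?, ?)"
    iterator = iter(rows)
    total = 0
    with conn:  # commits once on success, rolls back if any batch fails
        while True:
            batch = list(islice(iterator, batch_size))  # bounds memory use
            if not batch:
                break
            conn.executemany(sql, batch)  # one prepared statement, many rows
            total += len(batch)
    return total


conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this demonstration.
conn.execute("CREATE TABLE demo_data (object_name TEXT, property_name TEXT, value REAL)")
rows = ((f"Generator{i}", "Max Capacity", 100.0 + i) for i in range(25_000))
inserted = insert_in_batches(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM demo_data").fetchone()[0]
```

Because the rows come from a generator and are consumed one batch at a time, only batch_size rows are held in memory at once, while the single commit avoids paying the per-insert transaction cost.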
Handling Different Object Classes¶
You can process different types of objects separately:
# Add Generator objects
db.add_object(ClassEnum.Generator, "Generator1")
db.add_object(ClassEnum.Generator, "Generator2")
# Add Region objects
db.add_object(ClassEnum.Region, "Region1")
db.add_object(ClassEnum.Region, "Region2")
# Generator properties
generator_records = [
    {"name": "Generator1", "Max Capacity": 100.0},
    {"name": "Generator2", "Max Capacity": 150.0},
]
# Region properties
region_records = [
    {"name": "Region1", "Load Scaling Factor": 1.1},
    {"name": "Region2", "Load Scaling Factor": 0.9},
]
# Process each set with appropriate parameters
db.add_properties_from_records(
    generator_records,
    object_class=ClassEnum.Generator,
    collection=CollectionEnum.Generators,
    scenario="Base Case",
)
db.add_properties_from_records(
    region_records,
    object_class=ClassEnum.Region,
    collection=CollectionEnum.Regions,
    scenario="Base Case",
)
Data Validation¶
The method automatically validates:
All objects exist before attempting inserts
All property names are valid for the collection
All required fields are present
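If you want to catch problems before calling the bulk method, for example to report all issues at once, a pre-check along the following lines may help. This is a sketch under stated assumptions: find_record_problems is a hypothetical helper, and the sets of known object names and valid property names are supplied by the caller rather than read from PlexosDB. It does not reproduce the library's own validation.

```python
def find_record_problems(records, known_objects, valid_properties):
    """Return a list of human-readable problems found in property records."""
    problems = []
    for index, record in enumerate(records):
        name = record.get("name")
        if name is None:
            problems.append(f"record {index}: missing 'name'")
            continue
        if name not in known_objects:
            problems.append(f"record {index}: unknown object {name!r}")
        for key in record:
            if key != "name" and key not in valid_properties:
                problems.append(f"record {index}: unknown property {key!r}")
    return problems


records = [
    {"name": "Generator1", "Max Capacity": 100.0},
    {"name": "Generator9", "Max Capacty": 150.0},  # unknown object, misspelled property
    {"Max Capacity": 200.0},                       # no name field
]
problems = find_record_problems(
    records,
    known_objects={"Generator1", "Generator2"},
    valid_properties={"Max Capacity", "Min Stable Level", "Heat Rate"},
)
```

An empty list means the records passed these three checks; otherwise each entry names the offending record by its position in the list.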
Bulk Inserting Memberships¶
Creating relationships between many objects can be time-consuming when done individually. The add_memberships_from_records method allows you to efficiently create multiple memberships in a single operation.
Basic Usage¶
from plexosdb import PlexosDB
from plexosdb.enums import ClassEnum, CollectionEnum
from plexosdb.utils import create_membership_record
# Initialize the database
db = PlexosDB()
db.create_schema()
# Create parent and child objects
region_id = db.add_object(ClassEnum.Region, "MainRegion")
# Create multiple nodes
node_ids = []
for i in range(1, 101):  # Create 100 nodes
    node_ids.append(db.add_object(ClassEnum.Node, f"Node{i}"))
# Get necessary IDs for memberships
parent_class_id = db.get_class_id(ClassEnum.Region)
child_class_id = db.get_class_id(ClassEnum.Node)
collection_id = db.get_collection_id(
    CollectionEnum.ReferenceNode,
    parent_class_enum=ClassEnum.Region,
    child_class_enum=ClassEnum.Node,
)
# Create membership records
membership_records = create_membership_record(
    node_ids,
    child_object_class_id=child_class_id,
    parent_object_class_id=parent_class_id,
    parent_object_id=region_id,
    collection_id=collection_id,
)
# Bulk insert all memberships at once
db.add_memberships_from_records(membership_records)
To identify the correct CollectionEnum for your relationship, use the list_collections method:
collection_list = db.list_collections(parent_class=ClassEnum.Region, child_class=ClassEnum.Node)
print(collection_list) # Shows available collections for Region-Node relationships
This ensures you’re using the exact collection name that exists in your database schema.
Performance Benefits¶
Using add_memberships_from_records offers several advantages over individual add_membership calls:
Significantly reduced execution time for large datasets
Lower overhead from fewer database operations
Optional chunking for very large datasets (controlled by the chunksize parameter)
Efficient batch SQL execution
Manual Record Creation¶
If you need more control, you can manually create the membership records:
# Create records manually
records = []
for node_id in node_ids:
    records.append({
        'parent_class_id': parent_class_id,
        'parent_object_id': region_id,
        'collection_id': collection_id,
        'child_class_id': child_class_id,
        'child_object_id': node_id,
    })
# Bulk insert memberships
db.add_memberships_from_records(records)
Each record must contain these fields:
parent_class_id: ID of the parent class
parent_object_id: ID of the parent object
collection_id: ID of the collection
child_class_id: ID of the child class
child_object_id: ID of the child object
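When building records by hand, it is easy to leave a field out. The sketch below checks a list of records against the five field names listed above before they are handed to the bulk insert. The check_membership_records helper is hypothetical and not part of PlexosDB, and the integer IDs are placeholders chosen for illustration.

```python
REQUIRED_FIELDS = (
    "parent_class_id",
    "parent_object_id",
    "collection_id",
    "child_class_id",
    "child_object_id",
)


def check_membership_records(records):
    """Raise ValueError if any record lacks one of the five required fields."""
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if record.get(field) is None]
        if missing:
            raise ValueError(f"record {index} is missing: {', '.join(missing)}")
    return len(records)


# Placeholder IDs for illustration only; real IDs come from the database.
good = [
    {
        "parent_class_id": 1,
        "parent_object_id": 10,
        "collection_id": 5,
        "child_class_id": 2,
        "child_object_id": child_id,
    }
    for child_id in (20, 21, 22)
]
checked = check_membership_records(good)

# A record with only two of the five fields triggers the error.
bad = good + [{"parent_class_id": 1, "parent_object_id": 10}]
try:
    check_membership_records(bad)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Running the check first turns a missing key into a clear message that points at the offending record, rather than a failure partway through the insert.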
Combined Bulk Operations¶
For complex model creation, you can combine bulk operations to efficiently build your model:
First create all objects using add_objects (bulk object creation)
Add memberships between objects with add_memberships_from_records
Add properties to the objects using add_properties_from_records
This approach can dramatically improve performance when creating large, complex models.