Storing Outputs#
This documentation describes the functionality of the output processing system used to store simulation outputs either in a database or as CSV files. It covers the configuration, data buffering, conversion, and storage mechanisms of the main class as well as the support class for database maintenance.
WriteOutput Class#
The WriteOutput class is responsible for:
Collecting simulation output messages.
Buffering data until either a time-based interval or a memory limit is reached.
Converting raw data into structured pandas DataFrames.
Storing data into a database and/or exporting it as CSV files.
Managing performance trade-offs by allowing configuration of buffer sizes and storage intervals.
This class is part of the assume.common.outputs module. For more details, refer to assume.common.outputs.WriteOutput.
Data Processing Workflow#
Message Handling:
The `handle_output_message` method receives asynchronous messages containing simulation outputs. Based on content type, the data is buffered in lists corresponding to the intended database tables.
Buffer Size Management:
The memory usage of buffered data is monitored continuously.
If memory usage exceeds the `outputs_buffer_size_mb` threshold, buffered data is flushed to storage.
Data Conversion:
Helper methods such as `convert_rl_params` and `convert_market_results` transform raw data into DataFrames suitable for storage.
Data Storage:
The `store_dfs` method flushes buffered DataFrames to storage based on time intervals (`save_frequency_hours`) or buffer size limits.
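The buffer-and-flush cycle described above can be illustrated with a minimal sketch. The class below is a simplified stand-in, not the actual `WriteOutput` implementation: the names `OutputBuffer`, `handle_message`, and the size estimation via `sys.getsizeof` are assumptions made for illustration only.

```python
import sys
from collections import defaultdict


class OutputBuffer:
    """Toy illustration of a buffer-and-flush cycle (not the real WriteOutput)."""

    def __init__(self, buffer_size_mb=300):
        self.buffer_size_bytes = buffer_size_mb * 1024 * 1024
        self.buffers = defaultdict(list)  # table name -> list of buffered rows

    def handle_message(self, table, row):
        # Buffer each incoming row under its target table, then flush
        # once the approximate memory threshold is exceeded.
        self.buffers[table].append(row)
        if self._approx_size_bytes() > self.buffer_size_bytes:
            self.flush()

    def _approx_size_bytes(self):
        # Rough estimate of the memory held by all buffered rows.
        return sum(
            sys.getsizeof(row) for rows in self.buffers.values() for row in rows
        )

    def flush(self):
        # In WriteOutput the buffers would be converted to DataFrames and
        # written to the database and/or CSV files; this sketch only reports
        # the row counts and clears the buffers.
        flushed = {table: len(rows) for table, rows in self.buffers.items()}
        self.buffers.clear()
        return flushed
```

In the real class, a time-based trigger (`save_frequency_hours`) calls the flush in addition to the size check shown here.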
Performance Considerations#
Buffering Strategy: Increasing `outputs_buffer_size_mb` can improve performance by reducing write frequency. Default: 300 MB.
Write Intervals: Setting `save_frequency_hours` facilitates real-time observation with Grafana dashboards. Disabling it by setting it to `null` improves performance but prevents real-time monitoring.
Optimal Configuration: Increase `outputs_buffer_size_mb` and disable `save_frequency_hours` for maximum performance. Ensure sufficient memory is available.
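A hedged sketch of such a maximum-performance configuration is shown below; the buffer value of 600 MB is an illustrative assumption, not a recommendation from the library:

```yaml
example_simulation:
  save_frequency_hours: null    # disable periodic writes; no live Grafana view
  outputs_buffer_size_mb: 600   # larger buffer reduces flush frequency
```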
Note
When storing results as CSV files, `save_frequency_hours` is automatically disabled, meaning data is only saved at the end of the simulation or when the buffer size limit is reached. This prevents real-time observation through Grafana dashboards.
Database Tables Overview#
The simulation outputs are stored in various tables. Below is a breakdown of tables, their purpose, and what columns they contain. When the outputs are stored in CSV files, the data is organized in a similar structure, with each table represented as a separate CSV file.
| Table Name | Purpose | Columns |
|---|---|---|
| market_meta | Market-related data. | simulation, market_id, time, node, product_start, product_end, only_hours, price, max_price, min_price, supply_volume, supply_volume_energy, demand_volume, demand_volume_energy |
| market_dispatch | Dispatch data for markets. | simulation, market_id, datetime, unit_id, power |
| unit_dispatch | Unit-level dispatch data. | simulation, unit, power, heat, soc, energy_generation_costs, energy_cashflow, total_costs |
| market_orders | Market order details. | simulation, market_id, node, bid_id, unit_id, parent_bid_id, bid_type, end_time, price, volume, accepted_price, accepted_volume, min_acceptance_ratio |
| power_plant_meta | Power plant metadata. | simulation, unit_id, unit_type ("power_plant"), max_power, min_power, emission_factor, efficiency |
| storage_meta | Storage unit metadata. | simulation, unit_id, unit_type ("storage"), capacity, max_soc, min_soc, max_power_charge, max_power_discharge, min_power_charge, min_power_discharge, efficiency_charge, efficiency_discharge |
| rl_params | Reinforcement learning parameters. | simulation, unit, datetime, evaluation_mode, episode, profit, reward, regret, actions, exploration_noise, critic_loss, total_grad_norm, max_grad_norm, learning_rate |
| rl_meta | Reinforcement learning metadata. | simulation, episode, eval_episode, learning_mode, evaluation_mode |
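Since each table is exported as a separate CSV file, the outputs can be inspected directly with pandas. The snippet below is a minimal sketch: the two rows of sample data are made up, and only a subset of the market_meta columns listed above is used.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-row excerpt of a market_meta CSV export
# (subset of the columns listed in the table above).
csv_text = """simulation,market_id,time,price,supply_volume
example_simulation,EOM,2025-01-01 00:00:00,45.0,1000
example_simulation,EOM,2025-01-01 01:00:00,55.0,1200
"""
market_meta = pd.read_csv(StringIO(csv_text), parse_dates=["time"])

# Average clearing price per market
avg_price = market_meta.groupby("market_id")["price"].mean()
print(avg_price)
```

The same pattern applies to the other tables; only the file name and the columns passed to `parse_dates` change.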
Note
The columns of the unit_meta tables, such as power_plant_meta and storage_meta, are not exhaustively listed here. Their structure is dictated by the `as_dict` method of the respective unit classes.
DatabaseMaintenance Class#
The DatabaseMaintenance class provides utility functions for managing simulation data stored in a database. It includes:
Retrieving unique simulation IDs from the database.
Deleting specific simulations.
Performing bulk deletion of simulation data with optional exclusions.
This class is part of the assume.common.outputs module. For more details, refer to assume.common.outputs.DatabaseMaintenance.
Usage Example#
Below is an example YAML configuration snippet that sets the save frequency and buffer size for a simulation:
example_simulation:
  simulation_id: example_simulation
  start: 2025-01-01 00:00:00
  end: 2025-01-02 00:00:00
  save_frequency_hours: 24 # Time interval for saving data in hours
  outputs_buffer_size_mb: 300 # Buffer size in MB
Example usage of the DatabaseMaintenance class:
from assume.common import DatabaseMaintenance
# Select to store the simulation results in a local database or in TimescaleDB.
# When using TimescaleDB, ensure Docker is installed and the Grafana dashboard is accessible.
data_format = "timescale" # Options: "local_db" or "timescale"
if data_format == "local_db":
    db_uri = "sqlite:///./examples/local_db/assume_db.db"
elif data_format == "timescale":
    db_uri = "postgresql://assume:assume@localhost:5432/assume"
maintenance = DatabaseMaintenance(db_uri=db_uri)
# 1. Retrieve unique simulation IDs:
unique_ids = maintenance.get_unique_simulation_ids()
print("Unique simulation IDs:", unique_ids)
# 2. Delete specific simulations:
maintenance.delete_simulations(["example_01", "example_02"])
# 3. Delete all simulations except a few:
maintenance.delete_all_simulations(exclude=["example_01"])