Skip to content

How to model a database

e-footprint does not have a first-class abstraction for databases. A database is modeled by composing the primitives used for any backend workload: a Server, a Storage, and a set of Jobs. A database-flavoured Service, plugged in through the server's installed-services mechanism, would bundle a sensible engine baseline and canonical query templates — that would be a welcome contribution.

How a database maps onto the primitives

Python sketch

storage = Storage(
    "PostgreSQL storage",
    base_storage_need=SourceValue(100 * u.GB_stored),
)

db_server = Server(
    "PostgreSQL server",
    base_ram_consumption=SourceValue(2 * u.GB_ram),
    base_compute_consumption=SourceValue(0.1 * u.cpu_core),
    storage=storage,
    # remaining params fall back to defaults
)

read_query = Job(
    "SELECT", server=db_server,
    request_duration=SourceValue(20 * u.ms),
    compute_needed=SourceValue(0.1 * u.cpu_core),
    ram_needed=SourceValue(20 * u.MB_ram),
    data_transferred=SourceValue(5 * u.kB),
    data_stored=SourceValue(0 * u.kB),
)

write_query = Job(
    "INSERT", server=db_server,
    request_duration=SourceValue(40 * u.ms),
    compute_needed=SourceValue(0.2 * u.cpu_core),
    ram_needed=SourceValue(50 * u.MB_ram),
    data_transferred=SourceValue(2 * u.kB),
    data_stored=SourceValue(0.5 * u.kB),
)

The values above sketch a modest PostgreSQL deployment under steady load: a couple of gigabytes of shared buffers, continuous background workers (checkpointer, autovacuum), and queries in the millisecond range. They are illustrative, not measured — profile your own setup or cite a credible source in a Source.

Pitfalls

  • Averaging reads and writes into a single job. They differ by orders of magnitude on data_stored and often on compute_needed. Model the mix via multiple Job instances.
  • Confusing transfer and storage. Job.data_transferred is the per-call network payload; Job.data_stored is the per-call contribution to durable storage. Both can be non-zero on the same job.
  • Forgetting the engine baseline. The defaults for Server.base_ram_consumption and Server.base_compute_consumption are zero. A database with no queries still consumes resources.

See server_to_server_interaction for how to wire these jobs into a usage journey alongside an upstream web server.

Load this scenario in the e-footprint interface: Database modeling