Scale and Storage Management
During the rollout of Release 2, a deliberate deployment decision was made: all metadata would live in a single MongoDB store, while each organization received its own dedicated InfluxDB instance for time-series data. This provided strong data separation between organizations.
In practice, organizations varied widely in size and data volume, and InfluxDB’s memory consumption grows with the number of unique time series (cardinality). The separation made it possible to run smaller, independent InfluxDB instances, which are easier to manage and scale on a Kubernetes cluster than a few very large ones.
While the Release 2 API handled this multi-instance setup effectively behind the scenes, it created a usability issue for end users, who were occasionally exposed to infrastructure details such as “which server” their data lived on. Users, however, simply work with the tables produced by their data loggers. They should not have to think about storage servers, sharding decisions, or capacity planning.
Separating Concerns
Release 3 addresses this usability issue by separating user-facing tables from storage layout. Users interact with logical tables that mirror logger output. Where and how data is stored for a given time period is determined automatically by the system and is not exposed to users.
That configuration lives in api.toml. It defines shardspaces that map organizations (or groups of organizations) to storage locations and time ranges, plus table storage rulesets that describe physical database and table patterns for reads (match.*) and writes (store.*).
A shardspace holds one or more time-bounded shards. Each shard names a location; at deploy time, that location is wired to a physical timeseries database.
---
config:
  flowchart:
    nodeSpacing: 40
    rankSpacing: 60
    padding: 16
---
flowchart TB
  subgraph OrgSS["Shardspace · organization"]
    direction TB
    Shard1["Shard 1<br/>2020 – 2023"]
    Shard2["Shard 2<br/>2023 – present"]
  end
  Loc1["location · historical"]
  Loc2["location · current"]
  DB1[(Any DB<br/>dedicated instance)]
  DB2[(InfluxDB<br/>shared instance)]
  Shard1 --> Loc1
  Shard2 --> Loc2
  Loc1 --> DB1
  Loc2 --> DB2
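The lookup implied by the diagram can be sketched in a few lines of Python. This is only an illustration, not the actual Release 3 implementation: the `Shard` class, the `find_shard` helper, and the sample dates and ruleset names are hypothetical, and the half-open interval semantics are an assumption inferred from the field names `begins_at` and `ends_before`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Sequence


@dataclass(frozen=True)
class Shard:
    """Hypothetical in-memory form of one shard entry."""
    begins_at: datetime    # assumed inclusive lower bound
    ends_before: datetime  # assumed exclusive upper bound
    location: str
    table_storage_ruleset: str


def find_shard(shards: Sequence[Shard], ts: datetime) -> Optional[Shard]:
    """Return the first shard whose range [begins_at, ends_before) contains ts."""
    for shard in shards:
        if shard.begins_at <= ts < shard.ends_before:
            return shard
    return None


def utc(year: int, month: int = 1, day: int = 1) -> datetime:
    """Build a timezone-aware UTC timestamp."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical shardspace mirroring the diagram: one historical shard,
# one open-ended current shard.
org_shards = [
    Shard(utc(2020), utc(2023), "historical", "dedicated"),
    Shard(utc(2023), datetime.max.replace(tzinfo=timezone.utc), "current", "common"),
]

print(find_shard(org_shards, utc(2021, 6, 15)).location)  # historical
print(find_shard(org_shards, utc(2024, 2, 1)).location)   # current
```

A timestamp that falls outside every shard returns `None` in this sketch; how the real system treats that case is not stated here.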
Here is a simplified excerpt showing shared and dedicated storage side by side:
# Table storage: physical database/table patterns
#
# match.*: read/discovery (regex); store.*: write (template tokens)
# "common": organizations share one InfluxDB; databases are namespaced per org
[table_storage_rulesets.common.organization]
match.database = "org_{organization.id}"
match.table = "tab_(?P<id>\\w+)"
store.database = "org_{organization.id}"
store.table = "tab_{params.table_id}"

[table_storage_rulesets.common.station]
match.database = "org_{organization.id}"
match.table = "sta_{station.id}_tab_(?P<id>\\w+)"
store.database = "org_{organization.id}"
store.table = "sta_{station.id}_tab_{params.table_id}"

# "dedicated": organization has its own InfluxDB instance
[table_storage_rulesets.dedicated.organization]
match.database = "org"
match.table = "tab_(?P<id>\\w+)"
store.database = "org"
store.table = "tab_{params.table_id}"

[table_storage_rulesets.dedicated.station]
match.database = "sta_{station.id}"
match.table = "tab_(?P<id>\\w+)"
store.database = "sta_{station.id}"
store.table = "tab_{params.table_id}"

# Shardspaces: which storage location serves data for a given time range

# Co-located organizations on shared storage
[[shardspaces.common.shard]]
begins_at = "0001-01-01T00:00:00Z"
ends_before = "9999-12-31T23:59:59.999Z"
location = "experimental"
table_storage_ruleset = "common"

# Organization with a dedicated Release 2 instance: data stays in place
[[shardspaces.organization.6092b070492ae15e05876ed8.shard]]
begins_at = "0001-01-01T00:00:00Z"
ends_before = "9999-12-31T23:59:59.999Z"
location = "cdfw"
table_storage_ruleset = "dedicated"

Configuring Storage for Release 3
Utilizing the shardspace configuration:
- Organizations on dedicated Release 2 instances can move to Release 3 with shard rules that point at an existing location and the dedicated ruleset; no bulk copy of historical data is required.
- Organizations on shared storage can use the common ruleset, which namespaces databases per organization inside one InfluxDB instance.
- Operators can add a timeseries location and shard rules in api.toml without changing how users refer to tables.
Storage topology stays in configuration, and users continue to work with tables and datastreams.
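As a rough illustration of how the store.* templates and match.* patterns in the ruleset excerpt might be applied, the Python sketch below expands `{object.attribute}` tokens and then uses the expanded match pattern as a regular expression to recover logical table ids. The helper names `expand` and `table_ids`, the sample station id, and the table names are hypothetical. Expanding tokens before compiling the regex, and matching against the whole table name, are assumptions rather than documented behavior.

```python
import re
from types import SimpleNamespace

# Matches template tokens such as {station.id} or {params.table_id}.
TOKEN = re.compile(r"\{(\w+)\.(\w+)\}")


def expand(template, context, escape=False):
    """Replace {object.attribute} tokens with values taken from context."""
    def substitute(m):
        value = str(getattr(context[m.group(1)], m.group(2)))
        # Escape values when the result will be compiled as a regex.
        return re.escape(value) if escape else value
    return TOKEN.sub(substitute, template)


def table_ids(match_pattern, context, physical_tables):
    """Expand tokens, then match whole table names to collect the 'id' group."""
    regex = re.compile(expand(match_pattern, context, escape=True))
    found = []
    for name in physical_tables:
        m = regex.fullmatch(name)
        if m:
            found.append(m.group("id"))
    return found


# Hypothetical request context.
context = {
    "station": SimpleNamespace(id="42"),
    "params": SimpleNamespace(table_id="hourly"),
}

# Patterns as they look after TOML parsing ("\\w" in the file becomes \w).
store_table = "sta_{station.id}_tab_{params.table_id}"
match_table = r"sta_{station.id}_tab_(?P<id>\w+)"

print(expand(store_table, context))  # sta_42_tab_hourly
print(table_ids(match_table, context,
                ["sta_42_tab_hourly", "sta_42_tab_daily", "sta_7_tab_hourly"]))
# ['hourly', 'daily']
```

In this sketch the write path produces one concrete physical name, while the read path discovers every physical table that fits the pattern and maps it back to a logical id.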