Background

In 2016, a discussion on social media about time-series databases kicked off a major evolution of the Berkeley Sensor Database. The result was Dendra, now in its third major iteration (Release 3).

Release 1 focused on modern dashboards, a migration to InfluxDB, and a rebuild of the existing Campbell Scientific LoggerNet data ingestion pipeline. It was built on the Observations Data Model (ODM) 1.1 with one key innovation: the introduction of the datastream concept.

To allow the new dashboards to work with both the existing MySQL data and the new InfluxDB store without requiring a large-scale migration of historical data, Release 1 introduced an API abstraction layer. This layer stored metadata in a NoSQL database and dynamically queried raw datapoints from both backends, so a single datastream could stitch together time-series data from multiple stores in real time.

Furthermore, the abstraction layer made a gradual, low-risk migration possible. The new InfluxDB ingestion pipeline could be introduced alongside the legacy MySQL system, allowing MySQL to be phased out over time while historical data was backfilled at a comfortable pace.

From this foundation emerged the Datapoints Config, a dynamic view definition scoped to a specific timeframe. Each datastream can have multiple configs, and each config specifies where the data for its timeframe resides (MySQL or InfluxDB).

flowchart TB
    DS["Datastream<br/>one measurement at one site"]

    subgraph ViewDef["View definition (metadata)"]
        direction TB

        subgraph Configs["Datapoints Configs — one per timeframe"]
            direction LR
            C1["Config A"]
            C2["Config B"]
            C3["Config C"]
        end

        subgraph Stores["Raw datapoint stores"]
            direction LR
            MySQL[(MySQL)]
            Influx[(InfluxDB)]
        end

        C1 -. "defines source" .-> MySQL
        C2 -. "defines source" .-> Influx
        C3 -. "defines source" .-> Influx
    end

    subgraph API["At query time"]
        Meta[("Metadata<br/>NoSQL")]
        Query["Query & stitch<br/>in real time"]
    end

    DS --> Configs
    Meta --> Query
    Configs -->|"selects store<br/>per timeframe"| Query
    Query -->|"queries"| MySQL
    Query -->|"queries"| Influx
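
The per-timeframe selection and stitching shown in the diagram can be sketched roughly as follows. This is a minimal illustration, not Dendra's actual implementation: the `DatapointsConfig` fields, the `stitch` function, the in-memory stores, and all dates and values are hypothetical and were chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Point = Tuple[datetime, float]
# A fetcher returns the points held by one store within [start, end).
Fetcher = Callable[[datetime, datetime], Iterable[Point]]


@dataclass(frozen=True)
class DatapointsConfig:
    """Hypothetical config: which store serves which timeframe."""
    store: str                              # e.g. "mysql" or "influxdb"
    begins_at: datetime                     # inclusive
    ends_before: Optional[datetime] = None  # exclusive; None = open-ended


def stitch(configs: List[DatapointsConfig],
           fetchers: Dict[str, Fetcher],
           start: datetime,
           end: datetime) -> List[Point]:
    """Query each config's store for the part of [start, end) it covers."""
    points: List[Point] = []
    for cfg in sorted(configs, key=lambda c: c.begins_at):
        lo = max(start, cfg.begins_at)
        hi = end if cfg.ends_before is None else min(end, cfg.ends_before)
        if lo >= hi:
            continue  # this config does not overlap the requested range
        points.extend(fetchers[cfg.store](lo, hi))
    return points


def make_store(rows: Dict[datetime, float]) -> Fetcher:
    """In-memory stand-in for a real backend (assumption for the demo)."""
    def fetch(lo: datetime, hi: datetime) -> List[Point]:
        return [(t, v) for t, v in sorted(rows.items()) if lo <= t < hi]
    return fetch


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up data: the legacy store still holds a stale 2017 row that the
# configs must keep out of the result.
fetchers = {
    "mysql": make_store({utc(2016, 1, 1): 1.0, utc(2016, 6, 1): 2.0,
                         utc(2017, 1, 1): 99.0}),
    "influxdb": make_store({utc(2017, 1, 1): 3.0, utc(2017, 6, 1): 4.0}),
}
configs = [
    DatapointsConfig("mysql", utc(2010, 1, 1), utc(2017, 1, 1)),
    DatapointsConfig("influxdb", utc(2017, 1, 1)),
]

result = stitch(configs, fetchers, utc(2016, 3, 1), utc(2017, 3, 1))
for ts, value in result:
    print(ts.isoformat(), value)
```

Because the configs, rather than the stores, decide which backend answers for a given timeframe, a cutover in this sketch only requires editing metadata: moving the boundary between the two configs changes what a query returns without touching any stored datapoint.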

During the development of Release 1, four key design principles emerged that continue to guide Dendra’s architecture:

  1. Preserve the original data structure — Represent the source data layout and field names in the database as much as possible. This makes data series easy to locate and simplifies troubleshooting.

  2. Never modify raw values — Store original measurements unchanged. This ensures the system can always return the exact original data to users when requested.

  3. Unify around a single timestamp mechanism — Store every datapoint using a UTC timestamp as the primary key. This meets time-series database conventions, provides strong query performance, and supports loading or reloading data in arbitrary order without conflicts.

  4. Transform on read, not on write — Any conversions, calculations, or corrections are applied at query time through the API. Mathematical operations are computationally cheap, while database writes are significantly more expensive. This approach also avoids the need to version or reprocess large volumes of historical data when adjustments are required.
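
As a rough illustration of principles 2 through 4 working together, the sketch below keeps raw values untouched in a store keyed by UTC timestamp and applies a conversion only when reading. The `RawStore` class, the Fahrenheit-to-Celsius conversion, and the sample readings are assumptions made for this example; they are not taken from Dendra's API.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterator, Optional, Tuple


class RawStore:
    """In-memory stand-in for a time-series store keyed by UTC timestamp."""

    def __init__(self) -> None:
        self._rows: Dict[datetime, float] = {}

    def write(self, ts: datetime, value: float) -> None:
        if ts.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")
        # Normalising to UTC means the same instant always maps to the
        # same key, so reloading a point overwrites instead of duplicating.
        self._rows[ts.astimezone(timezone.utc)] = value

    def read(
        self, transform: Optional[Callable[[float], float]] = None
    ) -> Iterator[Tuple[datetime, float]]:
        # The stored value is never changed; the transform is applied
        # only to the copy handed back to the caller.
        for ts in sorted(self._rows):
            raw = self._rows[ts]
            yield ts, (raw if transform is None else transform(raw))


def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32.0) * 5.0 / 9.0


store = RawStore()
t1 = datetime(2020, 1, 1, 0, 0, tzinfo=timezone.utc)
t2 = datetime(2020, 1, 1, 0, 10, tzinfo=timezone.utc)

# Load out of order and reload one point: no conflicts, no duplicates.
store.write(t2, 50.0)
store.write(t1, 32.0)
store.write(t2, 50.0)

raw = list(store.read())
celsius = list(store.read(fahrenheit_to_celsius))
print(raw)
print(celsius)
```

If the conversion later turns out to be wrong, only the function passed to `read` needs to change in this sketch; the stored measurements stay exactly as they were written, so the original data can still be returned on request.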