Integration Orchestration

Automated ingestion has always been a core part of Dendra. What changed in Release 3 is who can set it up and how the platform keeps it running.

In Release 2, integration settings and transformation logic lived in GitHub (dendra-worker-state and related worker task packages). Dedicated worker processes watched for repository changes and reconfigured themselves. Spinning up a new Campbell LDMP connection, a HOBOlink pull, or a webhook endpoint typically required a backend engineer to edit state documents, deploy workers, and wire NATS subjects by hand.

Release 3 moves integration configuration into the Platform API and the Management interface. Users create integration configs through Configure Integrations. A long-running orchestrator job per config then listens for changes, reconciles the desired state against what is running, and stands up the ETL (extract/transform/load) pipeline automatically.

For background job mechanics, see Job Scheduler and Workers.

Integration Types and Configs

Every integration has a type — the kind of vendor connection — and one or more configs — your organization’s actual setup for that type.

|  | Integration type | Integration config |
| --- | --- | --- |
| What it is | Template for a vendor integration | A configured instance of that type |
| Defined by | Deploy config (`api.toml` in api-services) | User or API client via Configure Integrations |
| Direction & scope | Inbound or outbound; organization or station | Inherits from the type |
| Settings | Required variables and their types | Actual values: strings, secrets, references to other configs |
| Runtime | Declares job types the orchestrator runs when enabled | Enabled flag and operation mode (`test` or `live`) for pipeline routing |
| Example | Campbell Scientific Station | One LoggerNet station linked to an LDMP org config, with table names |

See the Integrations guide for supported types.

Organization-scoped configs (e.g. LDMP server credentials) are often referenced by station-scoped configs (e.g. a specific LoggerNet station and table list). The orchestrator treats dependencies and dependents as part of the operational picture — a change to a parent LDMP config can require child station integrations to reconcile.
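
The relationship between types, configs, and references can be pictured as a small data model. The sketch below is illustrative only and is not Dendra's actual schema: the class names, the `parent_id` field, the `dependents_of` helper, and the example type names and settings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class IntegrationType:
    """Template for a vendor integration (hypothetical shape)."""
    name: str
    scope: str                 # "organization" or "station"
    required_settings: tuple   # setting names a config must supply
    job_types: tuple           # job types run when a config is enabled


@dataclass
class IntegrationConfig:
    """A configured instance of an integration type (hypothetical shape)."""
    id: str
    type: IntegrationType
    settings: dict = field(default_factory=dict)
    parent_id: Optional[str] = None   # reference to another config, if any
    enabled: bool = False
    operation_mode: str = "test"      # "test" or "live"

    def missing_settings(self) -> list:
        """Return required setting names that this config does not provide."""
        return [k for k in self.type.required_settings if k not in self.settings]


def dependents_of(parent_id: str, configs: list) -> list:
    """Configs that reference the given parent and may need to reconcile."""
    return [c for c in configs if c.parent_id == parent_id]


# Example values are invented for illustration.
ldmp_type = IntegrationType("campbell-ldmp", "organization", ("host", "port"), ())
station_type = IntegrationType(
    "campbell-station", "station", ("station_name", "tables"), ("extract", "transform")
)

ldmp = IntegrationConfig("cfg-ldmp", ldmp_type, {"host": "ldmp.example.org", "port": 1024})
station = IntegrationConfig(
    "cfg-station-1",
    station_type,
    {"station_name": "Ridge", "tables": ["TenMin"]},
    parent_id="cfg-ldmp",
)
```

With this shape, `dependents_of("cfg-ldmp", [ldmp, station])` returns the station config, which is the set an orchestrator would need to revisit after the parent LDMP config changes.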

The Orchestrator

Each enabled integration config gets a continuous job of type OrchestrateIntegration. The API server creates this job when the config is first saved, and the job scheduler keeps it running.

The orchestration job:

- Subscribes to change events for integration configs in the organization.
- Reconciles at startup and on each change signal, loading the config and its related configs from the API.
- Compares an operational fingerprint (a hash of state and settings) to the last applied fingerprint stored in integration checkpoints.
- When the fingerprint changes or on first enable: provisions integration infrastructure and child jobs (extract, transform, etc.).
- When the config is disabled or deleted: tears down child jobs and infrastructure, then retires the orchestrator job.

This follows a familiar control-loop pattern: desired state in the API, observed state in running jobs, reconcile until they match. Cosmetic fields (e.g. name, description) do not trigger redeploys.
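
The fingerprint comparison at the heart of this loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the orchestrator's implementation: the dict-shaped config, the exact set of cosmetic fields, the use of SHA-256 over canonical JSON, and the action names are all hypothetical.

```python
import hashlib
import json

# Assumed split between cosmetic and operational fields; the real list may differ.
COSMETIC_FIELDS = {"name", "description"}


def operational_fingerprint(config: dict) -> str:
    """Hash only the fields that affect running jobs."""
    operational = {k: v for k, v in config.items() if k not in COSMETIC_FIELDS}
    # Sorted keys and fixed separators give the same bytes for equal content.
    canonical = json.dumps(operational, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(config, applied_fingerprint):
    """Decide what one reconcile pass should do.

    `config` is None when the config was deleted.
    Returns (action, fingerprint_to_store).
    """
    if config is None or not config.get("enabled", False):
        return ("teardown", None)
    current = operational_fingerprint(config)
    if applied_fingerprint is None or current != applied_fingerprint:
        return ("provision", current)
    return ("noop", applied_fingerprint)


cfg = {
    "name": "Ridge station",
    "enabled": True,
    "operation_mode": "test",
    "settings": {"tables": ["TenMin"]},
}
first_action, applied = reconcile(cfg, None)                        # first enable
rename_action, _ = reconcile(dict(cfg, name="Ridge (north)"), applied)
mode_action, _ = reconcile(dict(cfg, operation_mode="live"), applied)
disable_action, _ = reconcile(dict(cfg, enabled=False), applied)
```

In this sketch the rename produces no action because `name` is excluded from the hash, while switching `operation_mode` changes the fingerprint and triggers provisioning again.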

The data pipeline itself — the shared JetStream stream that carries messages between stages — is system infrastructure. The backend worker ensures that stream exists when it starts (see below). The integration orchestrator focuses on vendor-specific child jobs for each config.

---
config:
  flowchart:
    nodeSpacing: 40
    rankSpacing: 60
    padding: 16
---
flowchart TB
    User["User or API client"]
    Manage["Management interface"]
    API["Integration service<br/>(Platform API)"]
    Events["Change events<br/>(NATS)"]
    Sched["Job scheduler"]
    Orch["OrchestrateIntegration<br/>(continuous job)"]
    Child["Child jobs<br/>(extract · transform · …)"]
    Pipeline["Data pipeline<br/>(JetStream subjects)"]
    Load["Load · archive · preview"]

    User --> Manage
    User --> API
    Manage --> API
    API -->|"create config + orchestrator job"| Sched
    API --> Events
    Events --> Orch
    Sched --> Orch
    Orch -->|"read config · checkpoints"| API
    Orch -->|"create/cancel child jobs"| Sched
    Sched --> Child
    Child --> Pipeline
    Pipeline --> Load
    Load --> API

Child Jobs and the Data Pipeline

After the orchestrator job starts, child jobs take over the ETL work — talking to the vendor system, reshaping records, and passing them along. Each integration type declares which job types it needs (extract, transform, and so on).

Records move through the data pipeline in stages. Jobs hand off messages by publishing to JetStream subjects — named channels scoped to the integration, its operation mode, and the processing stage:

| Stage | What it is | Where it goes next |
| --- | --- | --- |
| raw | Data as pulled from the vendor | Archived; then on to transform |
| decode | Vendor format decoded (when the type requires it) | Prepare |
| prep | Standardized and UTC-normalized, ready to use | Test: preview via API. Live: location-scoped prep for loading (see below) |

The last segment of each subject name marks whether that stage succeeded or failed:

| Ending | Meaning |
| --- | --- |
| `.ok` | Stage finished successfully; downstream jobs can consume the message |
| `.err` | Stage failed; logged for alerting or retry |
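
To make the naming scheme concrete, here is a hedged sketch of how such subjects could be built and inspected. The `integration` prefix and the token order are assumptions for illustration; the source only states that subjects are scoped to the integration, operation mode, and stage, and that the final segment is `ok` or `err`. The token check reflects the general NATS rule that a subject token cannot contain `.`, whitespace, or the wildcards `*` and `>`.

```python
import re

STAGES = ("raw", "decode", "prep")
MODES = ("test", "live")
OUTCOMES = ("ok", "err")

# A single NATS subject token: no '.', no whitespace, no '*' or '>' wildcards.
_TOKEN = re.compile(r"[^.*>\s]+")


def stage_subject(config_id: str, mode: str, stage: str, outcome: str) -> str:
    """Build a subject for one stage of one integration config.

    The 'integration' prefix and the token order are hypothetical.
    """
    if not _TOKEN.fullmatch(config_id):
        raise ValueError(f"config id is not a valid subject token: {config_id!r}")
    if mode not in MODES:
        raise ValueError(f"unknown operation mode: {mode!r}")
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return f"integration.{config_id}.{mode}.{stage}.{outcome}"


def succeeded(subject: str) -> bool:
    """True when the final token marks the stage as successful."""
    return subject.rsplit(".", 1)[-1] == "ok"


raw_ok = stage_subject("cfg-station-1", "test", "raw", "ok")
prep_err = stage_subject("cfg-station-1", "live", "prep", "err")
```

Under this assumed layout, a consumer interested in every successful live prep message could subscribe with a single-token wildcard such as `integration.*.live.prep.ok`.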

Test and live modes use separate branches of the pipeline, so sample data can be inspected before it reaches production tables.

Not every integration uses every stage — a simpler pipeline may go straight from raw to prep. Release 2 followed the same staged-stream pattern over NATS. Release 3 wires the subjects and jobs automatically when you save an integration config.
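
Stage skipping can be expressed as a lookup over the stages a given integration type actually uses. The helper below is a sketch under assumptions: the function name and the idea that each type carries a set of stage names are hypothetical, and the stage order is taken from the table above.

```python
PIPELINE_ORDER = ("raw", "decode", "prep")


def next_stage(current: str, stages_used: set):
    """Return the stage that follows `current` for this type, or None at the end."""
    ordered = [s for s in PIPELINE_ORDER if s in stages_used]
    idx = ordered.index(current)  # raises ValueError if the type skips `current`
    return ordered[idx + 1] if idx + 1 < len(ordered) else None


simple = {"raw", "prep"}              # a type that needs no decode step
full = {"raw", "decode", "prep"}

simple_next = next_stage("raw", simple)
full_next = next_stage("raw", full)
last = next_stage("prep", full)
```

Here the simpler type routes raw messages directly to prep, while the full type passes through decode first.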

Runtime Status and Monitoring

The orchestrator job reports runtime phase and log entries back through the Integration service (ReportIntegrationRuntime). Phases include pending, starting, running, idle (disabled), draining, error, and retired. Checkpoints store operational progress — including the applied fingerprint and extract cursors — so jobs can resume without reprocessing entire histories.
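
The role of checkpoints in resuming work can be illustrated with a small sketch. Everything here is an assumption for the example rather than the actual API: the `Checkpoint` fields, the use of a record number as the extract cursor, and the `extract_new` helper. The phase names come from the list above, which may not be exhaustive.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(str, Enum):
    PENDING = "pending"
    STARTING = "starting"
    RUNNING = "running"
    IDLE = "idle"          # config is disabled
    DRAINING = "draining"
    ERROR = "error"
    RETIRED = "retired"


@dataclass
class Checkpoint:
    applied_fingerprint: Optional[str] = None
    extract_cursor: Optional[int] = None  # assumed: last record number extracted


def extract_new(records: list, checkpoint: Checkpoint) -> list:
    """Return records past the cursor and advance the cursor to the newest one."""
    cursor = checkpoint.extract_cursor
    new = [r for r in records if cursor is None or r["record"] > cursor]
    if new:
        checkpoint.extract_cursor = max(r["record"] for r in new)
    return new


# Invented sample data.
vendor_records = [{"record": 1, "temp_c": 4.1}, {"record": 2, "temp_c": 4.3}]
checkpoint = Checkpoint()

first_pass = extract_new(vendor_records, checkpoint)    # both records are new
second_pass = extract_new(vendor_records, checkpoint)   # nothing past the cursor

vendor_records.append({"record": 3, "temp_c": 4.6})
third_pass = extract_new(vendor_records, checkpoint)    # only the appended record
```

Because the cursor persists between passes, a restarted job in this sketch picks up at the first unseen record instead of reprocessing the full history.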

In the Management interface, the Monitor Integrations page (Tools > Data Setup > Monitor Integrations) is where users monitor and troubleshoot data integrations and their pipelines.

Manual Setup vs. Orchestration

Here is a quick side-by-side of how integration setup changed from Release 2 to Release 3:

| Concern | Release 2 | Release 3 |
| --- | --- | --- |
| Where settings live | GitHub state repo + worker env | Integration configs via Platform API |
| Who can add an integration | Backend engineer (typically) | Org admins/curators through the Management interface or API |
| How workers learn config | State repo watchers, manual deploy | Orchestrator job reconcile loop on API change events |
| Pipeline setup | Ops scripts, worker-specific wiring | Orchestrator creates integration child jobs |
| Observability | Logs, NATS tooling | Integration runtime API + Monitor Integrations page |
| Test vs. live data | Varied per integration | First-class `operation_mode` on each integration config |