Job Scheduler and Workers

Much of what Dendra does happens outside interactive API requests. Image variants are generated after a photo upload. Vendor integrations pull data on a schedule. Import a file or generate a large download. Notifications and orchestration steps fire in the background.

Release 3 needed one coherent way to plan, run, and observe this work. Release 2 spread it across several patterns — standalone Node worker processes, Moleculer service nodes, Bull queues, and ad hoc scripts — each with its own lifecycle and operational quirks. The Dendra Job Scheduler and Dendra workers are Release 3’s unified approach: a small set of Go executables, shared protobuf contracts, and predictable scheduling modes tailored to how Dendra actually runs background tasks.

For where these components sit in the wider platform, see System Design. For RPC message types, see the Job API.

Why a Dedicated Job Scheduler

Background work in Dendra falls into a few patterns:

Pattern	Examples
Fire-and-forget	Generate image variants after a file upload, enqueue a one-time sync
Long-running loop	Keep an integration orchestrator alive, subscribe to a change feed
On a timer	Poll a vendor API every few minutes, run a recurring extract
Serialized queue	Run jobs one at a time per queue so they do not race, discover tables

A single generic “run this function later” queue is not enough. Continuous jobs need restart semantics and graceful stop. Recurring jobs need interval timing separate from per-run timeouts. Queued jobs need FIFO ordering within a named queue. Release 3 encodes these as first-class scheduling modes on the scheduler rather than bolting behavior onto each worker.

Workers stay focused on doing the work. The scheduler owns when and how often a job runs, tracks run state, and coordinates with workers over NATS.

Core Concepts

Concept	Role
Job	A scheduled unit of work with an id, a job type, a spec (parameters), and a scheduling mode
Job type	Named handler code registered in a worker (e.g. `GeneratePhotoVariantsToMinio`, `OrchestrateIntegration`)
Job scheduler	Standalone executable in `dendra-api-services` that registers jobs, dispatches runs, and exposes Connect-RPC for management
Worker process	Standalone executable in `dendra-job-workers` that connects to NATS, advertises its job types, and executes runs when told
Worker type	Namespace prefix for a worker binary (`backend`, `extract`, `load`). The scheduler learns which worker type handles each job type from worker status events
Job provider	Source that supplies job definitions to the scheduler at startup (and on reconciliation)

A worker process is a host. A job type is the actual task implementation. Many job types can live in one worker binary when they share dependencies and scaling needs; a new binary is added when isolation helps (native libraries, memory, deploy cadence, or pipeline usage).

Scheduling Modes

Each job selects exactly one mode through its options:

Mode	Behavior	Typical use
One-shot	Runs once, then completes	Image variant generation triggered by an upload
Continuous	Runs repeatedly on an interval until stopped; honors per-run timeouts and restart	Integration orchestration, change-feed subscription
Recurring	Runs on a fixed interval (`recur_every`) with a per-run timeout	Scheduled vendor extracts (e.g. HOBO time-frame pulls)
Queued	One-shot jobs executed one at a time per named queue	Work that must not overlap within a queue

The scheduler enforces timeouts, reschedules according to the mode, and records run history. Job-type code should match the mode it is deployed with: one-shots finish promptly; continuous jobs must respond to stop signals; queued jobs assume no concurrent runs in the same queue.

How a Job Runs

At a high level, work flows from configuration through the scheduler to workers and back:

---
config:
  flowchart:
    nodeSpacing: 40
    rankSpacing: 60
    padding: 16
---
flowchart TB
    subgraph Sources["Job sources"]
        TOML["jobs.toml<br/>(TOML provider)"]
        Integ["Integration registrations<br/>(Platform API)"]
        RPC["Scheduler RPC<br/>(create / manage jobs)"]
    end

    subgraph Scheduler["Job scheduler"]
        Providers["Job providers"]
        Controllers["Job controllers<br/>(one per job)"]
        RPCSvc["Connect-RPC service"]
    end

    subgraph Messaging["Worker messaging (NATS)"]
        Bus["StartJob · StopJob · status events"]
    end

    subgraph Workers["Worker processes"]
        Backend["backend worker"]
        Extract["extract worker"]
        Load["load worker"]
    end

    subgraph Work["Job types"]
        Img["Image variants"]
        Orch["Integration orchestration"]
        Ext["Vendor extracts · prepare"]
        LoadJ["Influx load"]
        Sync["Metadata sync"]
    end

    subgraph Platform["Platform & external"]
        API["API Server"]
        Stores["Object stores · databases · mail · vendors"]
    end

    TOML --> Providers
    Integ --> Providers
    RPC --> RPCSvc
    Providers --> Controllers
    RPCSvc --> Controllers

    Controllers <-->|"dispatch & status"| Bus
    Bus <-->|"queue group"| Backend
    Bus <-->|"queue group"| Extract
    Bus <-->|"queue group"| Load

    Backend --> Img
    Backend --> Orch
    Backend --> Sync
    Extract --> Ext
    Load --> LoadJ

    Img --> API
    Orch --> API
    Sync --> API
    Ext --> API
    Backend --> Stores
    Extract --> Stores
    Load --> Stores

Registration — Job providers load job definitions (from jobs.toml, integration registrations, or RPC). Each job gets a controller in the scheduler.
Discovery — Workers connect to NATS, publish Online/Offline events, and announce which job types they support. The scheduler maps job types to worker types.
Dispatch — When a job is due, the scheduler sends StartJob over NATS. Workers in a queue group compete fairly for work.
Execution — The matching job-type handler runs, calling the API server and external services as needed. Workers publish job status heartbeats.
Completion or reschedule — On finish, timeout, or stop, the scheduler updates state and, depending on the mode, schedules the next run or dequeues the next queued job.

Where Jobs Come From

The scheduler does not hard-code every job. Job providers supply definitions:

Provider	Source	Purpose
TOML file	`jobs.toml` in api-services	Preconfigured recurring and continuous system jobs — including change-feed subscribers, location-scoped data-pipeline loaders, and archive jobs
Integration service	Platform API `ListIntegrationJobRegistrations`	Jobs derived from live integration configs — when a user configures an integration, the corresponding orchestration and extract jobs appear automatically
Job API	API clients	Create, inspect, and manage jobs at runtime

On startup (and after a reconciliation window), the scheduler loads from all providers and starts a controller per job.

Current Workers and Job Types

Release 3 ships three worker binaries today. More job types and additional worker binaries are expected as the platform grows.

Backend Worker (`dendra-backend-worker`)

General platform tasks — image processing, integration orchestration, table discovery, metadata reverse sync, and platform side effects.

Job type	Area
`GenerateLogoAvatarVariantsToMinio`	Image variants
`GeneratePhotoVariantsToMinio`	Image variants
`DiscoverAndSyncTables`	Integration orchestration
`OrchestrateIntegration`	Integration orchestration
`SideEffectsSubscribeToChanges`	Platform side effects
`HandleOrganizationTeardown`	Platform side effects
`ReverseSyncSubscribeToChanges`	Migration sync
`ReverseSyncVocabularyMetadata`	Migration sync
`ReverseSyncOrganizationMetadata`	Migration sync
`ReverseSyncSiteMetadata`	Migration sync
`ReverseSyncStationMetadata`	Migration sync
`ReverseSyncThingTypeMetadata`	Migration sync
`ReverseSyncDatastreamMetadata`	Migration sync

Extract Worker (`dendra-extract-worker`)

Vendor data pulls and pipeline prepare jobs.

Job type	Area
`ExtractCampbellScientificLdmp`	Campbell Scientific LDMP
`PrepareCampbellScientificLdmp`	Campbell Scientific LDMP
`ExtractHoboTimeFrame`	LI-COR HOBOlink

Load Worker (`dendra-load-worker`)

Location-scoped pipeline loading into timeseries databases.

Job type	Area
`LiveLoadInfluxV1`	Data loading

Pipeline jobs from `jobs.toml`

The TOML job provider also registers continuous jobs for the integration data pipeline. These are declared in jobs.toml rather than per organization in the API:

Job type	How it is configured	Role
`LiveLoadInfluxV1`	One entry per location in `[jobs.live_load_influx_v1]`	Consumes location-scoped live prep subjects; batch-writes to Influx 1.x mirrors
`PostToArchive`	Typically a single system job (e.g. `main`)	Archives raw pipeline messages across all integrations

See Integration Orchestration for how pipeline subjects and locations fit together.

When adding a job type, implement it under jobtypes/ in an existing worker (or a new package), register it in that worker’s registry, and ensure the scheduler knows the job spec and mode. When adding a worker binary, create a new package with its own main.go and Makefile targets — the shared lib/go/jobworker library handles NATS registration and run lifecycle.

Releases Compared

Release 2	Release 3
Moleculer nodes + Bull queues + standalone scripts	Job scheduler + worker executables
Multiple Node worker flavors per integration	Job types grouped into worker binaries by dependency and scale
Ad hoc scheduling in each service	Four explicit scheduling modes on the scheduler
Mixed RPC and internal conventions	Shared Job API protobufs on the Buf Schema Registry

The shift keeps the number of deployable binaries small (see Evolving the Technology Stack) while making background work observable and manageable through one RPC surface.