
ADR: service process directive for long-running auxiliary tasks #7147

Open
edmundmiller wants to merge 3 commits into master from claude/nextflow-service-process-76K0l

@edmundmiller

Summary

This ADR documents the design and rationale for introducing a `service` process directive to Nextflow. The directive lets long-running auxiliary tasks (such as inference servers, embedded databases, or local HTTP fixtures) expose resources to downstream consumer processes while the task is still running.

Overview

The `service true` directive marks a process as a long-running auxiliary task. Its declared outputs are emitted to downstream channels as soon as they appear in the task work directory, rather than after task completion. The service is terminated automatically with SIGTERM once all consumer processes have finished.
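The termination rule described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not Nextflow code: the class `ServiceLifecycle`, its method names, and the `terminate` callback are hypothetical. The only behavior taken from the description is that a service is signalled once every one of its consumers has finished.

```python
from typing import Callable, Dict, Iterable, Set


class ServiceLifecycle:
    """Hypothetical tracker: terminate a service once all its consumers finish."""

    def __init__(self, terminate: Callable[[str], None]) -> None:
        # `terminate` stands in for whatever actually delivers SIGTERM.
        self._terminate = terminate
        self._pending: Dict[str, Set[str]] = {}

    def register(self, service: str, consumers: Iterable[str]) -> None:
        # Record which downstream processes still depend on the service.
        self._pending[service] = set(consumers)

    def on_process_complete(self, process: str) -> None:
        # Drop the finished process from every service's consumer set and
        # terminate any service that no longer has pending consumers.
        for service in list(self._pending):
            consumers = self._pending[service]
            consumers.discard(process)
            if not consumers:
                del self._pending[service]
                self._terminate(service)


# Usage: one service with two consumers (all names are made up).
terminated = []
lifecycle = ServiceLifecycle(terminate=terminated.append)
lifecycle.register("inference_server", ["embed", "classify"])

lifecycle.on_process_complete("embed")
after_first = list(terminated)      # service still has one consumer left

lifecycle.on_process_complete("classify")
```

In this sketch the service is terminated only after the second consumer completes. How the real implementation derives the consumer set from the workflow DAG is not shown here.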

Key Design Decisions

  • Process-level directive: Uses a simple `service true` directive that fits Nextflow's existing directive vocabulary, rather than per-output markers
  • Output binding while running: Implements a background watcher that polls the work directory and emits outputs as soon as declared paths appear
  • Automatic lifecycle management: Tracks consumer processes via the workflow DAG and terminates the service when all consumers finish
  • Local executor only (MVP): Initial implementation is restricted to the local executor to ensure a shared filesystem and SIGTERM reachability
  • Cache disabled: Service tasks force `cache false` because their outputs are transient resources (sockets, ports, ramdisks)
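As a rough illustration of the "output binding while running" decision, the sketch below polls a directory and emits each declared path the first time it appears. It is a Python stand-in rather than the Nextflow implementation: `poll_once`, `watch_outputs`, the `emit` callback, and the interval and timeout values are all assumptions made for this example.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, Iterable, List, Set


def poll_once(work_dir: Path, pending: Set[str]) -> Set[str]:
    """Return the still-pending declared paths that now exist in work_dir."""
    return {name for name in pending if (work_dir / name).exists()}


def watch_outputs(
    work_dir: Path,
    declared: Iterable[str],
    emit: Callable[[Path], None],
    poll_interval: float = 0.1,
    timeout: float = 5.0,
) -> Set[str]:
    """Emit each declared output once, as soon as it appears.

    Returns the names that never appeared before the timeout elapsed.
    """
    pending = set(declared)
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for name in sorted(poll_once(work_dir, pending)):
            emit(work_dir / name)      # bind the output downstream right away
            pending.discard(name)      # never emit the same path twice
        if pending:
            time.sleep(poll_interval)
    return pending


# Usage: one declared output already exists, the other never appears.
with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)
    (work_dir / "server.sock").touch()
    emitted: List[str] = []
    still_missing = watch_outputs(
        work_dir,
        ["server.sock", "port.txt"],
        emit=lambda p: emitted.append(p.name),
        poll_interval=0.01,
        timeout=0.2,
    )
```

Polling keeps the sketch portable and simple. A real watcher would also need to stop when the service task itself exits, which is omitted here.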

Implementation Approach

  • Reuses existing Nextflow infrastructure: `TaskHandler.kill()`, `TraceObserverV2`, and the `collectOutputs()` machinery
  • Introduces `ServiceLifecycleObserver` to track service-consumer relationships and trigger termination
  • Adds `TaskProcessor.startServiceOutputWatcher()` for polling-based output detection
  • Validates at startup that `service true` is only used with the local executor
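The startup validation and the forced `cache false` behavior can be pictured with the following sketch. It is a Python illustration under stated assumptions, not the Nextflow code path: `ProcessConfig`, `ServiceConfigError`, and `check_service_config` are hypothetical names, and the error message wording is invented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProcessConfig:
    """Hypothetical, simplified view of a process's resolved directives."""
    name: str
    executor: str = "local"
    service: bool = False
    cache: bool = True


class ServiceConfigError(ValueError):
    """Raised when a service process uses an unsupported executor."""


def check_service_config(cfg: ProcessConfig) -> ProcessConfig:
    # Non-service processes pass through unchanged.
    if not cfg.service:
        return cfg
    # MVP rule from the description: services run on the local executor only.
    if cfg.executor != "local":
        raise ServiceConfigError(
            f"process '{cfg.name}': service true requires the local executor "
            f"(got '{cfg.executor}')"
        )
    # Service outputs are transient, so caching is forced off.
    return replace(cfg, cache=False)


# Usage: a local service is accepted, a Slurm service is rejected.
accepted = check_service_config(ProcessConfig("scratch_db", service=True))

try:
    check_service_config(ProcessConfig("scratch_db", executor="slurm", service=True))
    rejected = False
except ServiceConfigError:
    rejected = True
```

Failing at startup rather than at task submission means a misconfigured pipeline stops before any service has been launched.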

Use Cases Addressed

  • GPU inference microservices (NVIDIA NIM, vLLM, Ollama)
  • Embedded analytical databases run as servers (DuckDB with the `quack` extension)
  • Traditional databases for pipeline scratch state (PostgreSQL, Redis, ChromaDB)
  • Named pipes, ramdisks, and socket-based message buses
  • Local HTTP fixtures (localstack, minio, mock APIs)

Future Work

The ADR sketches extensibility paths for:

  • Grid schedulers (Slurm, LSF, SGE, PBS) with cluster-network or sub-allocation strategies
  • Kubernetes with native Service resources and readiness probes
  • Cloud batch platforms with object-storage endpoint propagation
  • Cross-cutting enhancements: a `service.colocate` directive, group-local services, readiness probes, multi-instance services, and graceful drain

Non-Goals (MVP)

  • Cluster-wide or cross-node co-location
  • Group-local service instances
  • Networked service discovery
  • Per-output `service(...)` markers

https://claude.ai/code/session_01CVy7Bt2VZBZ2gXm7d8psVZ

claude added 3 commits May 14, 2026 20:19
Proposes a process-level `service true` directive that marks a process as a
long-running auxiliary task whose declared outputs are bound to downstream
channels as soon as they appear in the work directory, and which is
SIGTERM'd by a TraceObserverV2 once all consumer processes terminate.

MVP scope is the local executor only; cluster/cloud variants are out of
scope and called out as non-goals.

Signed-off-by: Claude <noreply@anthropic.com>

Adds concrete use cases that motivate the directive:
- NVIDIA NIM inference server (amortize model load across consumer tasks)
- DuckDB `quack` extension (client-server access to an embedded DB)

Also expands the Problem Statement bullet list with the GPU-inference
and embedded-database-as-server cases.

Signed-off-by: Claude <noreply@anthropic.com>

Sketches how `service true` would extend to grid schedulers, Kubernetes,
and cloud batch executors, along with cross-cutting follow-ups
(colocate strategy, group-local services, readiness probes,
multi-instance services, graceful drain).

Reclassifies the cluster/cloud non-goals as MVP-only rather than
permanent.

Signed-off-by: Claude <noreply@anthropic.com>
@netlify

netlify Bot commented May 14, 2026

Deploy Preview for nextflow-docs-staging canceled.

🔨 Latest commit: bd359d0
🔍 Latest deploy log: https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a0632d265d6110008b01b6f
