Controller never deletes Temporal-side Worker Deployment version records during normal rollout (only on CRD deletion) #377

Description

@noamyehudai

Summary

During a normal rollout, when a Worker Deployment Version drains, the controller deletes that version's Kubernetes Deployment (and the HPA rendered from the WorkerResourceTemplate), but it never deletes the Temporal-side Worker Deployment Version record. The only call to DeleteWorkerDeploymentVersion is in the CRD-deletion finalizer (added in #240), which runs only when the whole WorkerDeployment custom resource is deleted, not when an individual version drains during a rollout.

As a result, the number of Temporal-side version records grows by one per deploy, and the controller relies entirely on server-side GC to keep the count under matching.maxVersionsInDeployment.

Why this is a problem

In practice, the server's at-cap reclamation does not keep up. We see deployments accumulate 100+ drained versions that are never reclaimed (companion issue: temporalio/temporal#10737). Once a deployment reaches the cap, a new build cannot register as a poller and the rollout silently wedges: a merged change looks shipped, but the fleet is still running the old version.

Note on existing issues

#270 ("version retention policy: keep last N after drain") was closed pointing at the Kubernetes sunset config (scaledownDelay / deleteDelay). But sunset only deletes the Kubernetes Deployment, not the Temporal-side version record, so it does not bound server-side version growth. The two cleanups are independent.

Request

An opt-in, controller-side prune of drained versions (for example a minVersionsToKeep retention policy that deletes drained, non-current versions beyond the newest N), or at minimum documentation that operators must externally prune drained versions to avoid hitting maxVersionsInDeployment. Happy to contribute a PR if there is interest.
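To make the proposed retention policy concrete, here is a minimal sketch of the selection step only, not controller code. The `versionRecord` type, the `versionsToPrune` function, and the sample data are hypothetical names invented for illustration, and the sketch assumes that "newest N" counts drained, non-current versions ordered by creation time. The controller would still need to call DeleteWorkerDeploymentVersion for each returned build ID.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// versionRecord is a hypothetical, simplified view of a Temporal-side
// Worker Deployment Version. It is not a type from the controller or the SDK.
type versionRecord struct {
	BuildID   string
	Drained   bool      // true once the version has fully drained
	Current   bool      // true for the version currently receiving traffic
	CreatedAt time.Time // assumed ordering key for "newest"
}

// versionsToPrune returns the build IDs of drained, non-current versions
// beyond the newest minVersionsToKeep drained ones (newest pruned first).
// Versions that are current or not yet drained are never returned.
func versionsToPrune(versions []versionRecord, minVersionsToKeep int) []string {
	if minVersionsToKeep < 0 {
		minVersionsToKeep = 0
	}
	var drained []versionRecord
	for _, v := range versions {
		if v.Drained && !v.Current {
			drained = append(drained, v)
		}
	}
	// Sort newest first so the head of the slice is what we keep.
	sort.SliceStable(drained, func(i, j int) bool {
		return drained[i].CreatedAt.After(drained[j].CreatedAt)
	})
	if len(drained) <= minVersionsToKeep {
		return nil
	}
	var prune []string
	for _, v := range drained[minVersionsToKeep:] {
		prune = append(prune, v.BuildID)
	}
	return prune
}

// sampleVersions builds illustrative data: three drained versions,
// one still draining, and one current.
func sampleVersions() []versionRecord {
	base := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	return []versionRecord{
		{BuildID: "v1", Drained: true, CreatedAt: base},
		{BuildID: "v2", Drained: true, CreatedAt: base.Add(1 * time.Hour)},
		{BuildID: "v3", Drained: true, CreatedAt: base.Add(2 * time.Hour)},
		{BuildID: "v4", Drained: false, CreatedAt: base.Add(3 * time.Hour)},
		{BuildID: "v5", Current: true, CreatedAt: base.Add(4 * time.Hour)},
	}
}

func main() {
	// With minVersionsToKeep = 1, v3 is retained and the two older
	// drained versions are selected for deletion.
	fmt.Println(versionsToPrune(sampleVersions(), 1)) // prints "[v2 v1]"
}
```

Keeping the selection pure (no API calls) would make the policy easy to unit test, and leaving it opt-in means existing deployments keep today's behavior unless they set a retention value.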

Version

v1.7.0.
