[deckhouse-cli] Plugins / d8 self-update with requirements check #386

Draft
Glitchy-Sheep wants to merge 31 commits into main from feat/cli-and-plugins-autoupdate

@Glitchy-Sheep

Contributor

Summary

Adds two ways to keep d8 up to date:

  • a d8 cli command tree that updates the d8 binary itself;
  • a reworked d8 plugins command tree that installs, updates, and runs plugins.

Both reach the cluster's registry through the in-cluster registry-packages-proxy, authenticated by the user's kubeconfig identity - no registry credentials are handed out.

Plugin operations validate the plugin's declared requirements before anything is downloaded or switched.

New commands

d8 cli - update the d8 binary:

  • check - report whether a newer version than the current one is published.
  • versions (alias list) - list published versions newest-first; the current one is starred, locally installed ones are marked.
  • update [--version X] - install a version into the store and point the binary at it.
  • use <version> - switch to a version: an installed one is an instant offline symlink repoint, a missing one is downloaded first.

d8 plugins - manage plugins (standalone binaries that run as native-looking subcommands):

  • install <name> [--version X] [--use-major N] [--force] - install or switch a plugin version.
  • update <name> / update all - move to the newest cluster-compatible version within the installed major.
  • versions <name> - list a plugin's published versions.
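  • list - list installed plugins, with a hint to install others by name (the --installed / --available flags were dropped in a later commit of this PR, because the proxy has no catalog endpoint).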
  • contract <name> - show a plugin's contract.
  • remove <name> - remove an installed plugin.
  • d8 <plugin> ... (wrapper, with DECKHOUSE_PLUGINS_ENABLED=true) - run an installed plugin, auto-installing it on first use.

Requirements validation

A plugin declares its requirements in a contract, and d8 enforces them:

  • Cluster-side - Kubernetes, Deckhouse, and module versions (mandatory / conditional / anyOf) are checked against a one-shot cluster snapshot. The cluster is only queried when the plugin actually declares such a requirement.
  • Plugin-to-plugin - conflicts with already installed plugins; --resolve-plugins-conflicts tries to pull in missing dependencies.
  • When - validated before any download or symlink switch (install, update, and the fast path that only repoints to an already installed version), and again before every plugin run (in case the Kubernetes or module versions have changed since installation).
  • Escape hatch - --skip-cluster-checks / D8_PLUGINS_SKIP_CLUSTER_CHECKS=1, intended for testing only.
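
A minimal sketch of the "query the cluster only when a requirement is declared, and only once" behaviour described above. The names ClusterState, lazySnapshot, minKubernetes, and validate are hypothetical and are not taken from this PR; the real checks in internal/plugins/requirements also cover Deckhouse and module versions.

```go
package main

import (
	"fmt"
	"sync"
)

// ClusterState is a hypothetical one-shot snapshot of the cluster.
type ClusterState struct {
	Kubernetes string // for example "v1.24.3"
}

// lazySnapshot fetches the cluster state at most once, on first use.
type lazySnapshot struct {
	once  sync.Once
	fetch func() (ClusterState, error)
	state ClusterState
	err   error
}

func (s *lazySnapshot) get() (ClusterState, error) {
	s.once.Do(func() { s.state, s.err = s.fetch() })
	return s.state, s.err
}

// check is one requirement evaluated against the snapshot.
type check func(ClusterState) error

// minKubernetes builds a check for a ">=major.minor" constraint.
func minKubernetes(major, minor int) check {
	return func(cs ClusterState) error {
		var gotMajor, gotMinor int
		if _, err := fmt.Sscanf(cs.Kubernetes, "v%d.%d", &gotMajor, &gotMinor); err != nil {
			return fmt.Errorf("parse cluster version %q: %w", cs.Kubernetes, err)
		}
		if gotMajor < major || (gotMajor == major && gotMinor < minor) {
			return fmt.Errorf("requires Kubernetes >=%d.%d, but the cluster runs %s",
				major, minor, cs.Kubernetes)
		}
		return nil
	}
}

// validate never touches the cluster when no cluster-side check is declared.
func validate(snap *lazySnapshot, checks []check) error {
	if len(checks) == 0 {
		return nil
	}
	state, err := snap.get()
	if err != nil {
		return fmt.Errorf("read cluster state: %w", err)
	}
	for _, c := range checks {
		if err := c(state); err != nil {
			return err
		}
	}
	return nil
}

// demo reports how often the cluster was queried without and with checks.
func demo() (idleFetches, busyFetches int, unmet bool) {
	fetches := 0
	newSnap := func() *lazySnapshot {
		return &lazySnapshot{fetch: func() (ClusterState, error) {
			fetches++
			return ClusterState{Kubernetes: "v1.24.3"}, nil
		}}
	}
	_ = validate(newSnap(), nil)
	idleFetches = fetches

	snap := newSnap()
	okErr := validate(snap, []check{minKubernetes(1, 20)})
	badErr := validate(snap, []check{minKubernetes(1, 27)})
	return idleFetches, fetches - idleFetches, okErr == nil && badErr != nil
}

func main() {
	fmt.Println(demo())
}
```

Sharing one snapshot per command run keeps repeated checks cheap and consistent with each other.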

Version selection picks the newest stable version whose cluster-side requirements are satisfied, stays within the installed major unless --use-major is given, and never downgrades implicitly.
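
The selection rule can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: selectVersion, parseStable, and ErrUnmet are hypothetical names, tags are assumed to look like vX.Y.Z, and any tag with a pre-release or build suffix is treated as not stable.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// ErrUnmet marks a requirement the cluster genuinely does not satisfy.
var ErrUnmet = errors.New("requirement not met")

type version struct {
	major, minor, patch int
	tag                 string
}

// parseStable parses "vX.Y.Z" and rejects pre-release or build-suffixed tags.
func parseStable(tag string) (version, bool) {
	v := version{tag: tag}
	if strings.ContainsAny(tag, "-+") {
		return v, false
	}
	n, err := fmt.Sscanf(tag, "v%d.%d.%d", &v.major, &v.minor, &v.patch)
	return v, err == nil && n == 3
}

func newer(a, b version) bool {
	if a.major != b.major {
		return a.major > b.major
	}
	if a.minor != b.minor {
		return a.minor > b.minor
	}
	return a.patch > b.patch
}

// selectVersion walks stable candidates newest-first. An unmet requirement
// demotes to the next older candidate; any other error stops the walk.
// With an installed version it stays in that major and never goes backwards.
func selectVersion(tags []string, installed string, check func(tag string) error) (string, error) {
	cur, hasCur := parseStable(installed)
	var candidates []version
	for _, t := range tags {
		v, ok := parseStable(t)
		if !ok {
			continue
		}
		if hasCur && (v.major != cur.major || !newer(v, cur)) {
			continue
		}
		candidates = append(candidates, v)
	}
	sort.Slice(candidates, func(i, j int) bool { return newer(candidates[i], candidates[j]) })
	for _, v := range candidates {
		err := check(v.tag)
		if err == nil {
			return v.tag, nil
		}
		if !errors.Is(err, ErrUnmet) {
			return "", err // operational error: propagate, do not retry older
		}
	}
	if hasCur {
		return installed, nil // nothing newer qualifies: stay put
	}
	return "", fmt.Errorf("no compatible stable version: %w", ErrUnmet)
}

func demo() (picked, kept string, hardStop bool) {
	tags := []string{"v1.3.0", "v2.0.0", "v1.4.0", "v1.5.0-rc1", "v1.2.0"}
	picked, _ = selectVersion(tags, "v1.3.0", func(string) error { return nil })
	kept, _ = selectVersion(tags, "v1.3.0", func(string) error { return ErrUnmet })
	_, err := selectVersion(tags, "v1.3.0", func(string) error {
		return errors.New("cluster unreachable")
	})
	return picked, kept, err != nil
}

func main() {
	fmt.Println(demo())
}
```

Separating "unmet" from every other error is what lets the walk fall back to an older version without hiding an unreachable cluster or a broken contract.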

How updates are applied

Both d8 cli and d8 plugins share the same atomic scheme:

  1. Download into a staged file while the live binary keeps working.
  2. Smoke-test it: a corrupt or wrong-platform artifact is rejected before it replaces anything.
  3. Back up the previous binary, then atomically swap and repoint the current symlink.

A failure at any step leaves the previous version in place.
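
The scheme can be sketched on a POSIX filesystem as below. The layout (a store directory with one entry per tag plus a current symlink) follows the description in this PR, but install, repoint, and the nonEmpty smoke test are hypothetical stand-ins; the real smoke test presumably does more than a size check. In this sketch the previous store entry is simply left in place, which plays the role of the backup.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// repoint swaps store/current to tag by renaming a temporary symlink over it.
// rename(2) replaces the old link in one step, so readers never see a gap.
func repoint(store, tag string) error {
	tmp := filepath.Join(store, "current.tmp")
	_ = os.Remove(tmp)
	if err := os.Symlink(tag, tmp); err != nil {
		return err
	}
	if err := os.Rename(tmp, filepath.Join(store, "current")); err != nil {
		_ = os.Remove(tmp)
		return err
	}
	return nil
}

// install stages the payload, smoke-tests it, and only then publishes it.
func install(store, tag string, payload []byte, smoke func(path string) error) error {
	staged := filepath.Join(store, tag+".staged")
	if err := os.WriteFile(staged, payload, 0o755); err != nil {
		return err
	}
	if err := smoke(staged); err != nil {
		_ = os.Remove(staged)
		return fmt.Errorf("smoke test failed, previous version kept: %w", err)
	}
	if err := os.Rename(staged, filepath.Join(store, tag)); err != nil {
		_ = os.Remove(staged)
		return err
	}
	return repoint(store, tag)
}

// nonEmpty is a placeholder smoke test: it only rejects an empty artifact.
func nonEmpty(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if info.Size() == 0 {
		return errors.New("empty artifact")
	}
	return nil
}

func demo() (afterFailure, afterSuccess string, err error) {
	store, err := os.MkdirTemp("", "d8-store")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(store)

	if err := install(store, "v1.59.0", []byte("old"), nonEmpty); err != nil {
		return "", "", err
	}
	// A broken download must not move the current symlink.
	_ = install(store, "v1.60.0", nil, nonEmpty)
	afterFailure, _ = os.Readlink(filepath.Join(store, "current"))

	if err := install(store, "v1.60.0", []byte("new"), nonEmpty); err != nil {
		return "", "", err
	}
	afterSuccess, _ = os.Readlink(filepath.Join(store, "current"))
	return afterFailure, afterSuccess, nil
}

func main() {
	fmt.Println(demo())
}
```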

Example usage

Self-update (d8 cli):

# What is published, and what am I on?
$ d8 cli versions
  v1.60.0  newer
* v1.59.0  current
  v1.58.2

# Is there anything newer?
$ d8 cli check
A newer deckhouse-cli is available: v1.60.0 (current: v1.59.0). Run 'd8 cli update' to upgrade.

# Update to the latest stable
$ d8 cli update
Updating deckhouse-cli to v1.60.0...
deckhouse-cli updated to v1.60.0.
The d8 binary in PATH is now a symlink into the version store; the previous binary is kept with a ".old" suffix.

# Roll back instantly and offline - the previous version is still in the store
$ d8 cli use v1.59.0
Switched deckhouse-cli to v1.59.0 (installed locally).
Previous version v1.60.0 remains installed - switch back with 'd8 cli use v1.60.0'.

Plugins (d8 plugins):

# Inspect before installing
$ d8 plugins versions stronghold
  v1.4.0  newer
  v1.3.0

$ d8 plugins contract stronghold
name: stronghold
version: v1.4.0
description: Vault-fork integration for DKP
requirements:
  kubernetes:
    constraint: '>=1.27'

# Install - the cluster requirements are checked first, then the binary is pulled
$ d8 plugins install stronghold
Installing plugin: stronghold
Tag: v1.4.0
Downloading and extracting plugin...
✓ Plugin 'stronghold' successfully installed!

# Run it as if it were a native subcommand (DECKHOUSE_PLUGINS_ENABLED=true)
$ d8 stronghold status
...plugin output...

# Already current - update is a no-op
$ d8 plugins update stronghold
Updating plugin: stronghold
Plugin 'stronghold' is already at v1.4.0, nothing to do (use --force to reinstall).

$ d8 plugins remove stronghold
✓ Plugin 'stronghold' successfully removed!

A plugin whose requirements the cluster does not meet is rejected before anything is downloaded or switched:

$ d8 plugins install stronghold
Installing plugin: stronghold
Error executing command: failed to validate requirements: kubernetes requirement: \
  plugin stronghold requires Kubernetes >=1.27, but the cluster runs v1.24.3
# exit code 1, nothing installed, current symlink untouched

Refactor

  • Dropped the direct-registry plugin service (pkg/registry/service/plugin_service.go and its test, ~1280 lines) so the registry proxy is the single plugin source. The contract decoder it carried is kept as pkg/registry/service/contract.go for reuse.

New packages

  • internal/rpp - HTTP client for the registry proxy: kubeconfig-bearer transport (no redirects, so the token never leaves the proxy host), endpoint discovery (public Ingress, with pod-IP fallback), and tar.gz extraction.
  • internal/lockfile - an exclusive file lock shared by plugin installs and self-update so two switches cannot run at once.
  • internal/selfupdate - the self-update machinery and the per-user version store with its current symlink.
  • internal/plugins/requirements - the one-shot cluster snapshot (k8s / Deckhouse / modules) and the named checks run against it.
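
As an illustration of the "bearer transport, no redirects" idea described for internal/rpp, here is a small sketch built only on net/http. The type bearerTransport and the function newProxyClient are hypothetical names, and the token is assumed to have been read from the kubeconfig already.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// bearerTransport adds the token to every request it sends.
type bearerTransport struct {
	token string
	base  http.RoundTripper
}

func (t bearerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	clone := req.Clone(req.Context()) // RoundTrip must not modify the caller's request
	clone.Header.Set("Authorization", "Bearer "+t.token)
	return t.base.RoundTrip(clone)
}

// newProxyClient refuses to follow redirects. Because the transport attaches
// the token to every request, following a redirect would send it elsewhere.
func newProxyClient(token string) *http.Client {
	return &http.Client{
		Transport: bearerTransport{token: token, base: http.DefaultTransport},
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// demo hits a test server that answers with a redirect and reports what
// the server saw.
func demo() (status int, auth string, followed bool, err error) {
	var mu sync.Mutex
	var seenAuth string
	var seenOther bool
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		if r.URL.Path == "/redirect" {
			seenAuth = r.Header.Get("Authorization")
			http.Redirect(w, r, "/elsewhere", http.StatusFound)
			return
		}
		seenOther = true
	}))
	defer srv.Close()

	resp, err := newProxyClient("test-token").Get(srv.URL + "/redirect")
	if err != nil {
		return 0, "", false, err
	}
	defer resp.Body.Close()

	mu.Lock()
	defer mu.Unlock()
	return resp.StatusCode, seenAuth, seenOther, nil
}

func main() {
	fmt.Println(demo())
}
```

Returning http.ErrUseLastResponse hands the 3xx response back to the caller instead of failing, so the caller can still report where the proxy tried to send it.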

Tests

  • internal/rpp - transport, endpoint discovery, image pull, and tar.gz extraction (transport_test.go, endpoint_test.go, image_test.go, extract_test.go).
  • internal/selfupdate - version store and symlink switching, the updater, the RPP source, and an end-to-end update against a fake proxy (store_test.go, update_test.go, switch_test.go, integration_test.go).
  • internal/plugins - install pipeline, version selection, requirement checks and cluster snapshot, the run wrapper, and an RPP end-to-end path (install_test.go, select_test.go, requirements/checks_test.go, requirements/clusterstate_test.go, run_test.go, rpp_e2e_test.go).
  • internal/lockfile - exclusive locking and stale-lock reclaim (lockfile_test.go).

- exclusive O_EXCL lock with identity-checked stale reclaim
- shared lock dialect for plugin install and self-update
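
A rough sketch of an O_EXCL lock file with stale reclaim. Assumption, stated loudly: staleness is judged here by file age alone, whereas the commit mentions an identity check whose details are not shown on this page. The names acquire and ErrLocked are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// ErrLocked means another process holds a lock that is not yet stale.
var ErrLocked = errors.New("lock is held")

// acquire creates path exclusively. A lock older than staleAfter is treated
// as abandoned and reclaimed once. Age-only reclaim is a simplification.
func acquire(path string, staleAfter time.Duration) (release func(), err error) {
	for attempt := 0; attempt < 2; attempt++ {
		f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
		if err == nil {
			fmt.Fprintf(f, "%d\n", os.Getpid()) // record the owner for diagnostics
			_ = f.Close()
			return func() { _ = os.Remove(path) }, nil
		}
		if !errors.Is(err, os.ErrExist) {
			return nil, err
		}
		info, statErr := os.Stat(path)
		if statErr != nil {
			continue // lock vanished between the two calls: retry
		}
		if time.Since(info.ModTime()) < staleAfter {
			return nil, ErrLocked
		}
		_ = os.Remove(path) // stale: reclaim and retry
	}
	return nil, ErrLocked
}

func demo() (secondBlocked, reacquired, staleReclaimed bool) {
	dir, err := os.MkdirTemp("", "d8-lock")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "switch.lock")

	release, err := acquire(path, time.Hour)
	if err != nil {
		return
	}
	_, err = acquire(path, time.Hour)
	secondBlocked = errors.Is(err, ErrLocked)
	release()

	_, err = acquire(path, time.Hour)
	reacquired = err == nil

	// Simulate a crashed holder by making the lock two hours old.
	old := time.Now().Add(-2 * time.Hour)
	_ = os.Chtimes(path, old, old)
	_, err = acquire(path, time.Hour)
	staleReclaimed = err == nil
	return
}

func main() {
	fmt.Println(demo())
}
```

Age-only reclaim can race when two processes reclaim at the same moment, which is a plausible reason for the identity check the commit refers to.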

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
…, TLS)

- HTTP client over the proxy /v1/images routes with kubeconfig bearer identity
- TLS hardening: CA bundle / insecure flag, redirect ban, size limits, timeouts

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- discover the proxy endpoint from the cluster (public Ingress preferred, pod-IP fallback)
- cluster client wiring and safe gzip/tar extraction with decompression-bomb limits
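
One way to bound tar.gz extraction, shown as a sketch rather than the PR's implementation: cap the total number of decompressed bytes and reject entries whose path escapes the destination. extract, ErrTooLarge, and makeArchive are hypothetical names; filepath.IsLocal requires Go 1.20 or newer.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// ErrTooLarge is returned when the decompressed data exceeds the limit.
var ErrTooLarge = errors.New("archive exceeds the extraction size limit")

// extract unpacks regular files from a tar.gz stream into dest, writing at
// most maxBytes in total and refusing paths that leave dest.
func extract(r io.Reader, dest string, maxBytes int64) error {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return err
	}
	defer gz.Close()
	tr := tar.NewReader(gz)
	var written int64
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // skip directories, symlinks, devices
		}
		if !filepath.IsLocal(hdr.Name) {
			return fmt.Errorf("unsafe path in archive: %q", hdr.Name)
		}
		target := filepath.Join(dest, hdr.Name)
		if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
			return err
		}
		f, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o755)
		if err != nil {
			return err
		}
		// Read one byte past the remaining budget so an overflow is detectable.
		n, copyErr := io.Copy(f, io.LimitReader(tr, maxBytes-written+1))
		closeErr := f.Close()
		if copyErr != nil {
			return copyErr
		}
		if closeErr != nil {
			return closeErr
		}
		written += n
		if written > maxBytes {
			_ = os.Remove(target)
			return ErrTooLarge
		}
	}
}

// makeArchive builds a one-file tar.gz in memory for the demo.
func makeArchive(name, content string) []byte {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	_ = tw.WriteHeader(&tar.Header{Name: name, Mode: 0o755, Size: int64(len(content)), Typeflag: tar.TypeReg})
	_, _ = tw.Write([]byte(content))
	_ = tw.Close()
	_ = gz.Close()
	return buf.Bytes()
}

func demo() (ok, tooLarge, unsafe bool) {
	dest, err := os.MkdirTemp("", "d8-extract")
	if err != nil {
		return
	}
	defer os.RemoveAll(dest)
	good := makeArchive("d8", "binary")
	if extract(bytes.NewReader(good), dest, 1024) == nil {
		data, _ := os.ReadFile(filepath.Join(dest, "d8"))
		ok = string(data) == "binary"
	}
	tooLarge = errors.Is(extract(bytes.NewReader(good), dest, 3), ErrTooLarge)
	unsafe = extract(bytes.NewReader(makeArchive("../evil", "x")), dest, 1024) != nil
	return
}

func main() {
	fmt.Println(demo())
}
```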

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- table-driven tests for client / transport / endpoint discovery / extraction over an httptest TLS proxy
- drop a stale package-doc line about an HTTP HEAD stat the client no longer does

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- export UnmarshalContract so the rpp plugin source decodes contracts with the same user-actionable errors
- correct the module-requirements doc: mandatory/conditional check "enabled", not merely "in the cluster"

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
…tract decoder

- remove PluginService and its OCI methods (contract-from-annotation, image extract, tag/catalog listing); the registry-packages-proxy is the only plugin source
- keep the contract decoder and DTO<->domain converters in contract.go, reused by the rpp source, validators and install

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- on-disk layout helpers, the registry-packages-proxy flag set, the PluginSource interface and the Manager struct
- foundation the install/update/run/list machinery is built on; rpp is the only plugin source

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- one-shot ClusterState snapshot (Kubernetes/Deckhouse/module versions) and the ordered named checks against it
- distinguishes a genuinely unmet requirement (selection may retry older) from an operational error that must propagate

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
…tion

- validate a plugin contract (plugin-to-plugin deps + cluster-side checks) before install/run; read the cached contract
- resolve missing mandatory plugin deps when asked, with the cluster snapshot cached per command run

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- pick the newest STABLE version whose cluster-side requirements hold, walking newest-to-oldest with a contract cache
- a genuinely incompatible version demotes to an older one; an unreachable cluster or broken contract hard-stops

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- lock, staged download, smoke-test and atomic swap; validate requirements before any switch
- relink-only path for an already-installed version; downgrade guard for the implicit/background update

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- adapts the registry-packages-proxy client to PluginSource: list tags, read the YAML contract, extract the binary
- tolerates contract-less images; in-process HTTPS e2e covering list -> contract -> extract

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- RunInstalled: lazy install, contract requirement gate before run, env injection, SIGTERM grace on cancel
- local help/version/completion args bypass the cluster gate so a plugin stays usable offline
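
The "SIGTERM grace on cancel" behaviour can be approximated with os/exec as in the sketch below (Go 1.20+, POSIX only). runPlugin is a hypothetical name, and the environment variable shown is a placeholder; the variables the wrapper actually injects are not listed on this page.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// runPlugin runs a child process. When ctx is cancelled the child first gets
// SIGTERM; if it has not exited after grace, os/exec kills it.
func runPlugin(ctx context.Context, grace time.Duration, extraEnv []string, name string, args ...string) error {
	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Env = append(os.Environ(), extraEnv...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.Cancel = func() error { return cmd.Process.Signal(syscall.SIGTERM) }
	cmd.WaitDelay = grace
	return cmd.Run()
}

// demo cancels a long-running child after 100ms and reports whether it
// stopped promptly with an error.
func demo() bool {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	start := time.Now()
	// EXAMPLE_PLUGIN_NAME is a placeholder, not a variable defined by d8.
	err := runPlugin(ctx, time.Second, []string{"EXAMPLE_PLUGIN_NAME=demo"}, "sleep", "5")
	return err != nil && time.Since(start) < 3*time.Second
}

func main() {
	fmt.Println(demo())
}
```

Without cmd.Cancel, exec.CommandContext kills the child immediately on cancellation, which gives a plugin no chance to clean up.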

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
…emove

- InitPluginServices builds the registry-packages-proxy client from the kubeconfig identity - the only plugin source
- list installed plugins, published versions, update-all within major (home-fallback aware), remove / remove-all

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- detached, throttled (6h TTL) spawn of the visible `d8 plugins update all`; never blocks the command, fails closed on marker write
- skipped when disabled by env, on Windows, within the TTL, or with nothing installed; home-fallback aware

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- move the machinery into internal/plugins; cmd/ keeps only cobra wiring (list/versions/contract/install/update/remove + per-plugin wrapper)
- drop the old in-cmd impl (validators, init, layout, flags); add versions command and cmd-level tests

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- per-user store (~/.deckhouse-cli/cli/versions/<tag>/d8) selected by a `current` symlink; switching is an atomic repoint, no sudo
- staged install with smoke-test before an entry becomes visible; immutable entries; nil-safe best-effort reads

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- Updater: list/select latest stable, stage+smoke into the store, atomic current-symlink switch under a stale-reclaiming lock
- migrate a plain-file install to the symlink layout (backup as <exe>.old, rollback on failure); rpp source normalizes per-platform tags

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- print a one-line "newer version available" notice from a per-user cache; refreshed synchronously by the root hook, at most once per 24h TTL
- best-effort and silent (missing cache, non-semver dev build, disabled via env); correct the package doc to describe the refresh as synchronous

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- check/update/versions(list)/use over the registry-packages-proxy; use switches to a stored version offline, downloads otherwise
- cron prints a copy-pasteable crontab line (d8 never edits crontab itself); RefreshNoticeCache feeds the root-hook notice

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- register `d8 plugins` and `d8 cli`; ExecuteC resolves the run command so the post-command hook gates by resolved name, not os.Args
- recursion-safe background hook: synchronous self-update notice + detached plugin auto-update, skipped for cli/plugins/help/completion

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- plugins.md / self-update.md user guides and package READMEs: rpp-only sourcing, the store/symlink layout, requirements, auto-update
- accurate TTLs (plugins 6h, notice 24h), the synchronous notice refresh, and the independent per-mechanism disable switches

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- platform service streams images_digests.json from the registry instead of pulling the installer to an OCI layout; installer/security/modules stage blob-less scaffolding
- refresh the expected output, artifact table, call tree and test references to match

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
…onfig in the root hook

- the flag-less root hook passed an empty kubeconfig path, but SetupK8sClientSet does not fall back to $KUBECONFIG / ~/.kube/config on "", so the notice refresh always failed with "no updater" and the notice never appeared
- resolve the default path ($KUBECONFIG, else ~/.kube/config) in newDefaultUpdater; explicit `d8 cli ...` commands were unaffected (they pass the flag value)

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
- d8 no longer does anything in the background after a command: no plugin auto-update, no self-update notice, no detached cache refresh.
- Automation will be redesigned in a separate PR; only the manual commands stay for now.
- `d8 cli` keeps `check`/`update`/`use`/`versions`, `d8 plugins` its install/update/remove set - all run only when invoked.
- Drop the `d8 cli cron` helper, the `internal/bgproc` spawner, and the root-command gate that only guarded the background work.
- Strip the `D8_DISABLE_*` switches and the auto-update sections from docs and READMEs.

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
@Glitchy-Sheep self-assigned this on Jun 15, 2026
@Glitchy-Sheep added the enhancement (New feature or request) label on Jun 15, 2026
'd8 plugins list --available' listed the registry catalog through the direct-registry
plugin service. That service is gone (RPP is the only source now), and the proxy has no
catalog endpoint, so the listing could never return anything.

Drop the dead path: ListPlugins, fetchAvailablePlugins, the --available/--installed flags.
'd8 plugins list' now shows installed plugins plus a hint to install by name.

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
PullImage returned a Descriptor (manifest digest + size) that every caller
discarded and nothing ever read - idempotency is version-based, not digest-based,
and there is no digest-verification path. Return just the body stream and re-add
the digest when artifact verification actually lands.

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
The comments and README bullets promised a future per-artifact digest check, but a
digest served by the same proxy proves nothing - real authenticity needs a publisher
signature. Remove the notes; the trust model (TLS + kubeconfig identity, smoke test)
stands without them.

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>
…dater

The plugin install/update and self-update comments justified their guards by an
'unattended background update' that was removed in e9446bd. Reword them to the
manual reality (implicit update / Ctrl-C'd install); no code change.

Signed-off-by: Roman Berezkin <roman.berezkin@flant.com>