[v2.6.0] Sync release-v2.6.0 to main #657
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Plumb a per-NIM podAnnotations field from values.yaml through to
NIMService.spec.podAnnotations so users can attach pod-level
annotations to NIM workloads. Default is {} (omits the field), so
existing deployments render identically.
Primary motivator is Runai fractional GPU saving-mode, which requires
both gpu-fraction-style annotations on the pod AND fractional GPU
resources, e.g.:
    nimOperator:
      nim-llm:
        podAnnotations:
          gpu-fraction: "0.25"
          gpu-fraction-num-devices: "1"
        resources:
          limits: { runai.com/gpu: 1 }
          requests: { runai.com/gpu: 1 }
Templates touched: llm-nim, embedding-nim, reranking-nim, vlm-nim,
vlm-captioning-nim, vlm-embed-nim, vlm-reranker-nim. Each gains the
podAnnotations: {} default and a usage comment in values.yaml.
(cherry picked from commit ab4cddf)
Signed-off-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
Co-authored-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
(cherry picked from commit b1ea5e8)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 51d5caf)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

* docs: refresh v2.6 support guidance
  Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
* docs: tighten reasoning and mig guidance
  Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
---------
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 1075cb3)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit e5602db)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

* chore: update blueprint container registry paths
  Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
* ci: tag publish images for staging registry
  Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
---------
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 8b6f492)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Caution: Review failed. Failed to post review comments.

Walkthrough

This PR adds skill-eval automation, updates CI and release workflows, shifts deployment defaults toward Elasticsearch and SeaweedFS, introduces frontend agentic-mode and reasoning-stream support, adds an OpenClaw plugin package, refreshes 2.6.0 documentation and examples, and includes small example, SPDX, and configuration updates.

Changes

- CI and release automation
- Deployment defaults and runtime configuration
- Frontend agentic request and reasoning UI
- OpenClaw plugin package
- Documentation and release narrative
- Examples and small repository updates

Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Harbor Eval —
| Platform | Step | Query | Result | Reward | Duration | Turns |
|---|---|---|---|---|---|---|
| cpu | step-1 | Deploy NVIDIA RAG Blueprint using Docker Compose in NVIDIA-hosted mode… | | 0.833 | 16m 5s | 63 |
| cpu | step-2 | Verify the deployed RAG stack is healthy and the API is reachable… | | 0.800 | 1m 9s | 13 |
Failing checks
- cpu / step-1: `docker ps --format '{{.Names}}' | grep -E '^(rag-server|ingestor-server|milvus-standalone|milvus-etcd|milvus-minio)$' | wc -l` expected ≥ 5 but returned only 2. The agent deployed the RAG stack with Elasticsearch instead of Milvus, so the `milvus-standalone`, `milvus-etcd`, and `milvus-minio` containers were absent.
- cpu / step-2: `docker ps --format '{{.Names}}\t{{.Status}}' | grep -E '(milvus-standalone|rag-server|ingestor-server)' | grep -v 'Up' | wc -l` expected 0 (all core containers Up), but `milvus-standalone` was not running (absent). This follows from the step-1 Elasticsearch substitution.
Generated by the RAG skills-eval agent. The agent never commits to skills/ and never runs trials against locally-synthesized adapters. Trial results in workflow artifact skills-eval-results-pr-657-26777005947.tar.gz.
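The cpu / step-1 result above can be reproduced without a running Docker daemon. The sketch below is only an illustration: `count_core_containers` is a hypothetical helper, it uses `grep -c` in place of the check's `grep | wc -l`, and the container name `elasticsearch` is an assumed stand-in for whatever the Elasticsearch-backed deployment actually starts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many of the five expected core containers
# appear in a list of names read from stdin (one name per line).
# grep -c counts matching lines, which is what `grep ... | wc -l` reports.
count_core_containers() {
  # `|| true` keeps the exit status 0 when nothing matches (grep exits 1).
  grep -cE '^(rag-server|ingestor-server|milvus-standalone|milvus-etcd|milvus-minio)$' || true
}

# Assumed name list for an Elasticsearch-backed deployment (illustrative only).
printf '%s\n' rag-server ingestor-server elasticsearch | count_core_containers
# prints 2
```

Against a live stack, the same function could be fed from `docker ps --format '{{.Names}}'`. Because the pattern hard-codes the Milvus container names, any deployment that uses a different vector store can match at most two names and falls short of the expected 5.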
Harbor Eval —
| Platform | Step | Query | Result | Reward | Duration | Turns |
|---|---|---|---|---|---|---|
| H100_x2 | step-1 | Deploy NVIDIA RAG Blueprint in self-hosted mode using Docker Compose… | | 0.800 | 38m 21s | 96 |
| H100_x2 | step-2 | Verify the self-hosted RAG stack is fully operational. Check that rag-server, ingestor-server, and local NIM endpoints are all healthy… | ✅ 1.0 (4/4) | 1.000 | 5m 27s | 34 |
Failing checks
- H100_x2 / step-1: "The agent's trajectory shows it read the rag-blueprint SKILL.md before taking action". The skill's SKILL.md content was injected into the agent prompt context rather than read via an explicit `Read` tool call. The agent did not invoke `Read` on a SKILL.md file; the document was available in conversation context (via the task dataset). All other deployment checks (containers up, NIMs running, no containers in bad state) passed.
Summary
This draft PR prepares `release-v2.6.0` for merge into `main` while avoiding a regular merge-conflict-heavy integration path.

The branch was created from the current `origin/main`, then the `origin/release-v2.6.0` tree was applied. After that, I reconciled changes that existed only on `main` and restored the ones that were still relevant and not superseded by release work.

The PR branch has now been refreshed with the latest `origin/release-v2.6.0` at `8b6f492`.

Current Branch State
- `codex/release-v2.6.0-to-main-20260530`
- `main`
- `origin/main` at `6fd878a`
- `origin/release-v2.6.0` at `8b6f492`
- `8e03c68`

Sync Strategy
- `origin/main`.
- `origin/release-v2.6.0` release tree onto the branch.
- `main`-only commits and file contents against `release-v2.6.0`.
- `main`-only changes that were still valid.
- `origin/release-v2.6.0`.
- `release-v2.6.0` moved deployment images to public registry paths.

Commits In This PR
- `e32abe9` - chore: prepare release-v2.6.0 sync to main
- `8a6d690` - chore: keep release image paths staged
  - `8e03c68`, which brings in the public deployment paths from `origin/release-v2.6.0`.
- `20c877d` - Helm: expose podAnnotations on all NIMService templates (#658)
  - `b1ea5e8`.
- `bcc33e4` - fix: move vlm reranker host port (#656)
  - `51d5caf`.
- `579dc51` - [codex] Refresh v2.6 documentation support guidance (#659)
  - `1075cb3`.
  - `docs/index.md` conflict by keeping both the preserved `perf-benchmarks.md` entry and the release-updated `performance-benchmarking.md` label.
- `4332dbe` - chore: update Vite lockfile to 6.4.2 (#660)
  - `e5602db`.
- `8e03c68` - chore: update blueprint container registry paths (#661)
  - `8b6f492`.
  - `nvcr.io/nvidia/blueprint/...`.
  - `nvcr.io/nvstaging/blueprint/...`.

Main-Only Changes Preserved
CI and automation
- `.github/workflows/request-nvskills-ci.yml`.
- `.github/workflows/cve-create-pr.yml`.

Examples
- `examples/google-cloud-netapp-volumes-data-ingestor/`.
- `examples/README.md`.

Documentation and release history
- `docs/perf-benchmarks.md`
- `docs/assets/perf-benchmarks/*.png`
- `docs/index.md`.
- `docs/scripts/build_multiversion_docs.*`
- `docs/scripts/verify_doc_version_manifest.py`
- `docs/versions1.json` while keeping `2.6.0` as the current preferred version.
- `2.5.1` release note section in `docs/release-notes.md` while keeping the `2.6.0` release notes at the top.
- `Vidore-V3` naming in accuracy benchmark docs.

Deployment helpers
- `main`:
  - `deploy/compose/nemotron3-super.env`
  - `deploy/compose/nemotron3-super-cloud.env`
  - `deploy/compose/nemotron3-super-prompt.yaml`
  - `deploy/helm/nvidia-blueprint-rag/nemotron3-super-values.yaml`
  - `deploy/helm/nvidia-blueprint-rag/nemotron3-super-rtx6000-values.yaml`

Image Path Decision
The latest `release-v2.6.0` branch now uses public deployment image paths, and this PR follows that release state.

Deployment/runtime references now use:
- `nvcr.io/nvidia/blueprint/ingestor-server`
- `nvcr.io/nvidia/blueprint/rag-server`
- `nvcr.io/nvidia/blueprint/rag-frontend`

The publish workflow also tags and pushes staging copies under:
- `nvcr.io/nvstaging/blueprint/ingestor-server`
- `nvcr.io/nvstaging/blueprint/rag-server`
- `nvcr.io/nvstaging/blueprint/rag-frontend`

Files checked for this decision:
- `.github/workflows/publish-artifacts.yml`
- `deploy/compose/docker-compose-ingestor-server.yaml`
- `deploy/compose/docker-compose-rag-server.yaml`
- `deploy/workbench/compose.yaml`
- `deploy/helm/nvidia-blueprint-rag/values.yaml`

Main-Only Changes Not Restored
These were reviewed and left out because release-v2.6.0 appears to replace them with newer implementations:
- `docs/vlm-embed.md`
  - `docs/multimodal-retriever.md` as the replacement documentation path.
- `src/nvidia_rag/utils/minio_operator.py`
  - `src/nvidia_rag/utils/object_store.py` and the newer SeaweedFS/object-store configuration.

Validation Performed
- `git diff --check origin/main..HEAD`
- `python3 docs/scripts/verify_doc_version_manifest.py`
  - `2.6.0`.
- `rg -n "^(<<<<<<<|>>>>>>>)"`.
- `origin/release-v2.6.0`.

Reviewer Notes
Please pay particular attention to:
- `main`-only CI workflows should remain in `main` after the release sync.
- `docs/vlm-embed.md`
- `src/nvidia_rag/utils/minio_operator.py`

Operational Note
This PR is intentionally draft. Copy-pr-bot reported that auto-sync is disabled for draft PRs in this repository, so workflows may need to be run manually.
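For reviewers who want to repeat the conflict-marker scan listed under Validation Performed, here is a minimal sketch. It swaps the PR's `rg` command for `grep -rnE` with the same pattern, in case ripgrep is not installed. The helper name `has_no_conflict_markers` and the temporary files are hypothetical and exist only to make the example self-contained.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) only when no line under the given
# directory starts with a merge conflict marker. Matching lines are printed
# with file name and line number, as with `rg -n`.
has_no_conflict_markers() {
  ! grep -rnE '^(<<<<<<<|>>>>>>>)' "$1"
}

# Illustrative scratch directory, not part of the repository.
workdir=$(mktemp -d)

printf 'clean line\n' > "$workdir/clean.txt"
has_no_conflict_markers "$workdir" && echo "no conflict markers"

printf '<<<<<<< HEAD\nours\n>>>>>>> theirs\n' > "$workdir/conflicted.txt"
has_no_conflict_markers "$workdir" || echo "conflict markers found"
```

In a checkout of the PR branch the helper would be pointed at the repository root instead of a scratch directory. It complements `git diff --check`, which is listed as a separate validation step.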
Summary by CodeRabbit
Release Notes
New Features
Default Changes
Documentation