
[v2.6.0] Sync release-v2.6.0 to main #657

Open
shubhadeepd wants to merge 8 commits into main from codex/release-v2.6.0-to-main-20260530

@shubhadeepd
Collaborator

@shubhadeepd shubhadeepd commented May 29, 2026

Summary

This draft PR prepares release-v2.6.0 for merge into main without going through a regular merge, which would be heavy on conflicts.

The branch was created from the current origin/main, then the origin/release-v2.6.0 tree was applied. After that, I reconciled changes that existed only on main and restored the ones that were still relevant and not superseded by release work.

The PR branch has now been refreshed with the latest origin/release-v2.6.0 at 8b6f492.

Current Branch State

  • PR branch: codex/release-v2.6.0-to-main-20260530
  • Target branch: main
  • Main baseline: origin/main at 6fd878a
  • Latest release baseline included: origin/release-v2.6.0 at 8b6f492
  • Current PR head: 8e03c68

Sync Strategy

  1. Started from latest origin/main.
  2. Applied the origin/release-v2.6.0 release tree onto the branch.
  3. Compared main-only commits and file contents against release-v2.6.0.
  4. Restored verified main-only changes that were still valid.
  5. Left out only changes that appeared superseded by release-v2.6.0 replacements.
  6. Periodically refreshed the branch with the newest release commits from origin/release-v2.6.0.
  7. Let the latest release branch image-path decision win once release-v2.6.0 moved deployment images to public registry paths.
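
Steps 3 to 5 start from a path-level comparison of the two trees. A minimal sketch in Python, assuming a local clone in which both refs have been fetched; the helper names `tree_paths` and `main_only` are hypothetical and are not part of this repository or of the process actually used for this PR:

```python
import subprocess


def tree_paths(ref: str) -> set[str]:
    """Return every file path tracked in `ref` (needs a local git clone)."""
    out = subprocess.run(
        # -z gives NUL-separated, unquoted paths, which is safer to parse.
        ["git", "ls-tree", "-r", "--name-only", "-z", ref],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return {path for path in out.split("\0") if path}


def main_only(main_paths: set[str], release_paths: set[str]) -> list[str]:
    """Paths that exist on main but are absent from the release tree."""
    return sorted(main_paths - release_paths)
```

Calling `main_only(tree_paths("origin/main"), tree_paths("origin/release-v2.6.0"))` would list the candidates for review. This only finds files that exist solely on main; files present on both branches with different content (docs/index.md, for example) still need a content diff, and every candidate needs a manual decision on whether the release branch superseded it.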

Commits In This PR

  • e32abe9 - chore: prepare release-v2.6.0 sync to main
    • Applies the release-v2.6.0 tree onto main.
    • Restores selected main-only changes after content review.
  • 8a6d690 - chore: keep release image paths staged
    • Earlier follow-up that kept deployment image paths on staging while release still used staging paths.
    • Superseded by the later commit 8e03c68, which brings in the public deployment paths from origin/release-v2.6.0 (release commit 8b6f492).
  • 20c877d - Helm: expose podAnnotations on all NIMService templates (#658)
    • Refreshes the PR branch with release commit b1ea5e8.
  • bcc33e4 - fix: move vlm reranker host port (#656)
    • Refreshes the PR branch with release commit 51d5caf.
  • 579dc51 - [codex] Refresh v2.6 documentation support guidance (#659)
    • Refreshes the PR branch with release commit 1075cb3.
    • Resolved one docs/index.md conflict by keeping both the preserved perf-benchmarks.md entry and the release-updated performance-benchmarking.md label.
  • 4332dbe - chore: update Vite lockfile to 6.4.2 (#660)
    • Refreshes the PR branch with release commit e5602db.
  • 8e03c68 - chore: update blueprint container registry paths (#661)
    • Refreshes the PR branch with release commit 8b6f492.
    • Moves deployment image references to nvcr.io/nvidia/blueprint/....
    • Adds publish workflow tagging/pushes for nvcr.io/nvstaging/blueprint/....

Main-Only Changes Preserved

CI and automation

  • Preserved .github/workflows/request-nvskills-ci.yml.
  • Preserved the CVE workflow rolling compare marker behavior in .github/workflows/cve-create-pr.yml.

Examples

  • Preserved the Google Cloud NetApp Volumes data ingestor example under examples/google-cloud-netapp-volumes-data-ingestor/.
  • Restored the corresponding entry in examples/README.md.

Documentation and release history

  • Preserved performance benchmark result documentation and assets:
    • docs/perf-benchmarks.md
    • docs/assets/perf-benchmarks/*.png
  • Restored the performance benchmark link and toctree entry in docs/index.md.
  • Preserved docs multiversion support scripts:
    • docs/scripts/build_multiversion_docs.*
    • docs/scripts/verify_doc_version_manifest.py
  • Preserved version history in docs/versions1.json while keeping 2.6.0 as the current preferred version.
  • Preserved the 2.5.1 release note section in docs/release-notes.md while keeping the 2.6.0 release notes at the top.
  • Preserved small documentation corrections:
    • Vidore-V3 naming in accuracy benchmark docs.
    • Brev URL correction in notebook docs.

Deployment helpers

  • Preserved standalone Nemotron 3 Super helper files from main:
    • deploy/compose/nemotron3-super.env
    • deploy/compose/nemotron3-super-cloud.env
    • deploy/compose/nemotron3-super-prompt.yaml
    • deploy/helm/nvidia-blueprint-rag/nemotron3-super-values.yaml
    • deploy/helm/nvidia-blueprint-rag/nemotron3-super-rtx6000-values.yaml

Image Path Decision

The latest release-v2.6.0 branch now uses public deployment image paths, and this PR follows that release state.

Deployment/runtime references now use:

  • nvcr.io/nvidia/blueprint/ingestor-server
  • nvcr.io/nvidia/blueprint/rag-server
  • nvcr.io/nvidia/blueprint/rag-frontend

The publish workflow also tags and pushes staging copies under:

  • nvcr.io/nvstaging/blueprint/ingestor-server
  • nvcr.io/nvstaging/blueprint/rag-server
  • nvcr.io/nvstaging/blueprint/rag-frontend

Files checked for this decision:

  • .github/workflows/publish-artifacts.yml
  • deploy/compose/docker-compose-ingestor-server.yaml
  • deploy/compose/docker-compose-rag-server.yaml
  • deploy/workbench/compose.yaml
  • deploy/helm/nvidia-blueprint-rag/values.yaml
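
A check of this kind can be scripted. The sketch below, in Python, flags any reference to one of the three blueprint images that does not sit under the public nvcr.io/nvidia/blueprint/ prefix. The function name and the regular expression are illustrative assumptions, not the scan that was actually run for this PR.

```python
import re

PUBLIC_PREFIX = "nvcr.io/nvidia/blueprint/"
IMAGES = ("ingestor-server", "rag-server", "rag-frontend")

# Matches nvcr.io/<org>/blueprint/<image> for the three blueprint images.
_REF = re.compile(r"nvcr\.io/[\w.-]+/blueprint/(?:%s)\b" % "|".join(IMAGES))


def non_public_refs(text: str) -> list[str]:
    """Return blueprint image references that do not use the public prefix."""
    return [
        match.group(0)
        for match in _REF.finditer(text)
        if not match.group(0).startswith(PUBLIC_PREFIX)
    ]
```

Under the decision described above, the compose, workbench, and Helm files would be expected to return an empty list, while .github/workflows/publish-artifacts.yml would be expected to return the nvcr.io/nvstaging/blueprint/... tags.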

Main-Only Changes Not Restored

These were reviewed and left out because release-v2.6.0 appears to replace them with newer implementations:

  • docs/vlm-embed.md
    • Not restored because release-v2.6.0 introduces docs/multimodal-retriever.md as the replacement documentation path.
  • src/nvidia_rag/utils/minio_operator.py
    • Not restored because release-v2.6.0 moves object storage handling to src/nvidia_rag/utils/object_store.py and the newer SeaweedFS/object-store configuration.

Validation Performed

  • git diff --check origin/main..HEAD
    • Passed.
  • python3 docs/scripts/verify_doc_version_manifest.py
    • Passed.
    • Confirmed docs project/version metadata for 2.6.0.
  • Conflict marker scan with rg -n "^(<<<<<<<|>>>>>>>)".
    • No unresolved conflict markers found.
  • Image path scan across deployment and publish files.
    • Confirmed public deployment image paths are present.
    • Confirmed staging publish tags are present where expected.
  • Diff check against latest origin/release-v2.6.0.
    • Remaining differences are the intentional preserved main-only overlays.
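
For reviewers without ripgrep installed, the conflict marker scan can be approximated in Python. This is a sketch under assumptions: it reuses the same pattern as the rg command above, but `conflict_marker_lines` and `scan_tree` are hypothetical helpers, and unlike rg the tree walk does not honor .gitignore.

```python
import re
from pathlib import Path

# Same pattern as the rg scan: a line that starts with an opening or
# closing conflict marker. "=======" is not matched by this pattern.
MARKER = re.compile(r"^(<<<<<<<|>>>>>>>)")


def conflict_marker_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that start with a conflict marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MARKER.match(line)
    ]


def scan_tree(root: str) -> dict[str, list[int]]:
    """Scan UTF-8 text files under root, skipping .git and unreadable files."""
    hits: dict[str, list[int]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            lines = conflict_marker_lines(path.read_text(encoding="utf-8"))
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        if lines:
            hits[str(path)] = lines
    return hits
```

An empty dictionary from `scan_tree(".")` corresponds to the "no unresolved conflict markers" result reported above.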

Reviewer Notes

Please pay particular attention to:

  • Whether the preserved main-only CI workflows should remain in main after the release sync.
  • Whether the Google Cloud NetApp Volumes example should ship with the final main state.
  • Whether the preserved Nemotron 3 Super helper files are still desired alongside release-v2.6.0 deployment docs.
  • Whether the two intentionally omitted files are correctly superseded:
    • docs/vlm-embed.md
    • src/nvidia_rag/utils/minio_operator.py
  • Whether the public deployment image paths plus staging publish tags match the intended release policy.

Operational Note

This PR is intentionally a draft. Copy-pr-bot reported that auto-sync is disabled for draft PRs in this repository, so workflows may need to be run manually.

Summary by CodeRabbit

Release Notes

  • New Features

    • Agentic RAG pipeline with plan-and-execute flow and streaming reasoning traces in UI
    • OpenShift Helm deployment support
    • VLM Reranker and dedicated VLM Captioning services
    • Evaluation performance benchmarking and skill evaluation tools
  • Default Changes

    • Elasticsearch is now the default vector database (Milvus optional)
    • SeaweedFS replaces MinIO as the default object store
    • Nemotron 3 Super 120B is the default LLM
    • VLM embedding model for multimodal ingestion
  • Documentation

    • New guides for Agentic RAG, OpenShift deployment, and performance benchmarking
    • Updated multimodal retriever documentation

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented May 29, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
@shubhadeepd shubhadeepd self-assigned this May 29, 2026
@shubhadeepd shubhadeepd added the documentation, enhancement, tests, and Security labels May 29, 2026
@shubhadeepd shubhadeepd changed the title from "[codex] Sync release-v2.6.0 to main" to "[v2.6.0] Sync release-v2.6.0 to main" May 29, 2026
shubhadeepd and others added 6 commits May 30, 2026 01:28
Plumb a per-NIM podAnnotations field from values.yaml through to
NIMService.spec.podAnnotations so users can attach pod-level
annotations to NIM workloads. Default is {} (omits the field), so
existing deployments render identically.

Primary motivator is Runai fractional GPU saving-mode, which requires
both gpu-fraction-style annotations on the pod AND fractional GPU
resources, e.g.:

  nimOperator:
    nim-llm:
      podAnnotations:
        gpu-fraction: "0.25"
        gpu-fraction-num-devices: "1"
      resources:
        limits:   { runai.com/gpu: 1 }
        requests: { runai.com/gpu: 1 }

Templates touched: llm-nim, embedding-nim, reranking-nim, vlm-nim,
vlm-captioning-nim, vlm-embed-nim, vlm-reranker-nim. Each gains the
podAnnotations: {} default and a usage comment in values.yaml.

(cherry picked from commit ab4cddf)

Signed-off-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
Co-authored-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
(cherry picked from commit b1ea5e8)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 51d5caf)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
* docs: refresh v2.6 support guidance

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

* docs: tighten reasoning and mig guidance

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

---------

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 1075cb3)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit e5602db)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
* chore: update blueprint container registry paths

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

* ci: tag publish images for staging registry

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

---------

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 8b6f492)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Caution

Review failed

Failed to post review comments

Walkthrough

This PR adds skill-eval automation, updates CI and release workflows, shifts deployment defaults toward Elasticsearch and SeaweedFS, introduces frontend agentic-mode and reasoning-stream support, adds an OpenClaw plugin package, refreshes 2.6.0 documentation and examples, and includes small example, SPDX, and configuration updates.

Changes

CI and release automation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Workflow defaults and review config | .coderabbit.yaml, .github/workflows/run-branch-script.yml | Code review automation settings and manual workflow defaults are updated. |
| Skill evaluation pipeline | .github/workflows/skills-eval.yml, .github/skill-eval/*, ci/run_skill_eval*.sh | A workflow, agent prompt, Python runner, and shell harness are added to run skill evaluations on PR, schedule, and manual triggers. |
| Main CI pipeline updates | .github/workflows/ci-pipeline.yml | CI uses newer actions, uv-based unit tests, Elasticsearch-first integration setup, Milvus-specific sequences, and expanded integration result handling. |
| Publishing and CVE automation | .github/workflows/publish-artifacts.yml, ci/post-cve-report.sh, ci/publish_wheel.sh | Publishing jobs add branch gating, retagging, semver normalization, and NGC cleanup, and a new script updates the nightly CVE tracker issue and patch comment. |

Deployment defaults and runtime configuration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Compose and workbench defaults | deploy/compose/*, deploy/workbench/compose.yaml, deploy/workbench/quickstart.ipynb | Compose and workbench stacks switch defaults toward Elasticsearch, SeaweedFS, updated model endpoints, renamed OCR services, VLM services, and named persistent volumes. |
| Helm values and templates | deploy/helm/nvidia-blueprint-rag/* | Helm chart metadata, values, templates, NIM resources, Elasticsearch credential wiring, and OpenShift resources are updated for the 2.6.0 deployment layout. |
| Prompts, dashboards, and MIG layouts | deploy/helm/.../files/prompt.yaml, deploy/config/agentic-rag-metrics-dashboard.json, deploy/helm/mig-slicing/* | Prompt templates are revised, an agentic Grafana dashboard is added, and H100/RTX6000 MIG layouts and placement values are redefined. |
| Runtime support files | deploy/compose/vectordb.yaml, deploy/compose/seaweedfs-config/s3.json, deploy/compose/nemoguardrails/..., deploy/helm/.../files/sitecustomize.py | Vector store infrastructure, SeaweedFS config, guardrails streaming output, and nv-ingest CPU/Ray patching support are added or updated. |

Frontend agentic request and reasoning UI

| Layer | File(s) | Summary |
| --- | --- | --- |
| Agentic request contracts and settings | frontend/src/types/*, frontend/src/store/useSettingsStore.ts, frontend/src/hooks/useCitationUtils.ts, frontend/src/hooks/useMessageSubmit.ts | Frontend types and settings add explicit agentic mode, reasoning and metrics fields, citation stage formatting, and backend-aware filter compilation. |
| Streaming and chat UI | frontend/src/hooks/useChatStream.ts, frontend/src/components/chat/* | Streaming parsing now handles agentic events and reasoning traces, and chat UI adds a pipeline selector plus collapsible reasoning panels. |
| Frontend test coverage and polish | frontend/src/components/.../__tests__/*, frontend/src/hooks/__tests__/*, frontend/src/components/filtering/FilterGenerationToggle.tsx | Tests are added or updated for agentic mode, reasoning panels, streaming parsing, request cleanup, citation stage behavior, and related UI defaults. |

OpenClaw plugin package

| Layer | File(s) | Summary |
| --- | --- | --- |
| Plugin package and entrypoint | .openclaw/index.ts, .openclaw/openclaw.plugin.json, .openclaw/package.json, .openclaw/tsconfig.json, .openclaw/.gitignore | A new OpenClaw plugin package is added with registration logic for workspace setup, optional skill guidance, and gateway systemd patching. |
| Workspace templates and docs | .openclaw/README.md, .openclaw/workspace/* | Plugin setup documentation and workspace bootstrap, identity, agent, soul, and tools templates are added. |

Documentation and release narrative

| Layer | File(s) | Summary |
| --- | --- | --- |
| Release and navigation updates | README.md, docs/index.md, docs/readme.md, docs/release-notes.md, docs/migration_guide.md, docs/documentation.md, AGENTS.md, CLAUDE.md | Top-level docs, navigation, release notes, migration notes, and contributor guidance are updated for 2.6.0. |
| Agentic, API, and reasoning docs | docs/agentic-rag.md, docs/api_reference/*, docs/custom-metadata.md, docs/enable-nemotron-thinking.md, docs/observability.md, docs/performance-benchmarking.md | Documentation adds agentic RAG guidance, updates API schemas, documents Elasticsearch filter generation, revises reasoning behavior, and adds benchmarking guidance. |
| Deployment and operations docs | docs/change-vectordb.md, docs/elasticsearch-configuration.md, docs/deploy-*, docs/milvus-configuration.md, docs/nemoretriever-ocr.md, docs/troubleshooting.md, docs/text_only_ingest.md, docs/retrieval-only-deployment.md | Deployment, vector database, OCR, troubleshooting, and ingestion docs are rewritten around Elasticsearch, SeaweedFS, new service names, and OpenShift support. |
| Model and multimodal docs | docs/change-model.md, docs/multimodal-*, docs/vlm.md, docs/python-client.md, docs/model-profiles.md, docs/image_captioning.md | Model, multimodal retrieval, query, captioning, VLM, and client examples are updated for new defaults and service combinations. |

Examples and small repository updates

| Layer | File(s) | Summary |
| --- | --- | --- |
| Example behavior updates | examples/nvidia_rag_mcp/*, examples/rag_event_ingest/kafka_consumer/models/__init__.py | The MCP example updates streaming parsing and Elasticsearch endpoint examples, and the event-ingest example adds explicit model exports. |
| Header and minor housekeeping changes | examples/rag_event_ingest/..., examples/rag_react_agent/..., frontend/src/components/..., frontend/src/pages/__tests__/Chat.test.tsx | Many files receive SPDX header additions or small non-behavioral adjustments. |

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related issues

  • NVIDIA-AI-Blueprints/rag#617 — adds ci/post-cve-report.sh, which updates the Nightly CVE Scan Tracker issue and related patch comment flow.

Possibly related PRs

Suggested reviewers

  • nv-pranjald

Poem

🐇 I hopped through configs, charts, and streams,
and stitched new thoughts into the beams.
Elasticsearch now hums along,
while agentic traces sing their song.
With docs and claws and tests in tow,
this burrow’s grown for 2.6.0.


@shubhadeepd shubhadeepd marked this pull request as ready for review June 1, 2026 19:28
@github-actions

github-actions Bot commented Jun 1, 2026

Harbor Eval — skill-source/.agents/skills/rag-blueprint/eval/nvidia_hosted.json

Head: 44ccc49 · spec e32abe9
First started: 2026-06-01T19:30:32Z · Last finished: 2026-06-01T19:48:25Z · Total: 17min 52s

| Platform | Step | Query | Result | Reward | Duration | Turns |
| --- | --- | --- | --- | --- | --- | --- |
| cpu | step-1 | Deploy NVIDIA RAG Blueprint using Docker Compose in NVIDIA-hosted mode… | ⚠️ 0.833 (5/6) | 0.833 | 16m 5s | 63 |
| cpu | step-2 | Verify the deployed RAG stack is healthy and the API is reachable… | ⚠️ 0.800 (4/5) | 0.800 | 1m 9s | 13 |

Failing checks

  • cpu / step-1: `docker ps --format '{{.Names}}' | grep -E '^(rag-server|ingestor-server|milvus-standalone|milvus-etcd|milvus-minio)$' | wc -l` expected ≥ 5 but returned only 2. The agent deployed the RAG stack with Elasticsearch instead of Milvus, so the milvus-standalone, milvus-etcd, and milvus-minio containers were absent.

  • cpu / step-2: `docker ps --format '{{.Names}}\t{{.Status}}' | grep -E '(milvus-standalone|rag-server|ingestor-server)' | grep -v 'Up' | wc -l` expected 0 (all core containers Up), but milvus-standalone was not running (absent). This follows from the step-1 Elasticsearch substitution.
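
The step-1 count can be reproduced offline. The sketch below, in Python, applies the same name pattern as the check's `grep -E` to a list of container names. The function name is hypothetical, and the container name `elasticsearch` in the usage note is an assumed stand-in for whatever the Elasticsearch deployment actually names its container.

```python
import re

# Container names the step-1 check greps for, copied from the failing check.
EXPECTED = re.compile(
    r"^(rag-server|ingestor-server|milvus-standalone|milvus-etcd|milvus-minio)$"
)


def matching_containers(names: list[str]) -> int:
    """Count the container names that the check's pattern would match."""
    return sum(1 for name in names if EXPECTED.match(name))
```

With names such as `["rag-server", "ingestor-server", "elasticsearch"]` the function returns 2, below the threshold of 5 that the check requires, which matches the reported result: the check is written for a Milvus deployment.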

Generated by the RAG skills-eval agent. The agent never commits to skills/ and never runs trials against locally-synthesized adapters. Trial results in workflow artifact skills-eval-results-pr-657-26777005947.tar.gz.

@github-actions

github-actions Bot commented Jun 1, 2026

Harbor Eval — skill-source/.agents/skills/rag-blueprint/eval/h100.json

Head: 44ccc49 · spec e32abe9
First started: 2026-06-01T20:06:12Z · Last finished: 2026-06-01T20:50:30Z · Total: 44min 18s

| Platform | Step | Query | Result | Reward | Duration | Turns |
| --- | --- | --- | --- | --- | --- | --- |
| H100_x2 | step-1 | Deploy NVIDIA RAG Blueprint in self-hosted mode using Docker Compose… | ⚠️ 0.800 (4/5) | 0.800 | 38m 21s | 96 |
| H100_x2 | step-2 | Verify the self-hosted RAG stack is fully operational. Check that rag-server, ingestor-server, and local NIM endpoints are all healthy… | ✅ 1.0 (4/4) | 1.000 | 5m 27s | 34 |

Failing checks

  • H100_x2 / step-1: "The agent's trajectory shows it read the rag-blueprint SKILL.md before taking action". The skill's SKILL.md content was injected into the agent prompt context rather than read via an explicit Read tool call. The agent did not invoke Read on a SKILL.md file; the document was available in conversation context (via the task dataset). All other deployment checks (containers up, NIMs running, no containers in bad state) passed.

Generated by the RAG skills-eval agent. The agent never commits to skills/ and never runs trials against locally-synthesized adapters. Trial results in workflow artifact skills-eval-results-pr-657-26777005947.tar.gz.

Collaborator

@kheiss-uwzoo kheiss-uwzoo left a comment


LGTM

