[v2.6.0] Sync release-v2.6.0 to main by shubhadeepd · Pull Request #657 · NVIDIA-AI-Blueprints/rag

shubhadeepd · 2026-05-29T18:53:38Z

Summary

This draft PR prepares release-v2.6.0 for merge into main without going through a regular merge, which would be heavy on conflicts.

The branch was created from the current origin/main, then the origin/release-v2.6.0 tree was applied. After that, I reconciled changes that existed only on main and restored the ones that were still relevant and not superseded by release work.

The PR branch has now been refreshed with the latest origin/release-v2.6.0 at 8b6f492.

Current Branch State

PR branch: codex/release-v2.6.0-to-main-20260530
Target branch: main
Main baseline: origin/main at 6fd878a
Latest release baseline included: origin/release-v2.6.0 at 8b6f492
Current PR head: 8e03c68

Sync Strategy

Started from latest origin/main.
Applied the origin/release-v2.6.0 release tree onto the branch.
Compared main-only commits and file contents against release-v2.6.0.
Restored verified main-only changes that were still valid.
Left out only changes that appeared superseded by release-v2.6.0 replacements.
Periodically refreshed the branch with the newest release commits from origin/release-v2.6.0.
Let the latest release branch image-path decision win once release-v2.6.0 moved deployment images to public registry paths.
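
The comparison step above can be sketched as a set operation over file paths. This is only an illustration, not the tooling used for this PR: `classify_main_only` and the `superseded_by` mapping are hypothetical names, the sample paths are taken from the lists later in this description, and the sketch compares path presence only, whereas the actual review also compared file contents.

```python
def classify_main_only(main_paths, release_paths, superseded_by):
    """Split paths that exist only on main into 'restore' and 'skip' lists.

    superseded_by maps a main-only path to the release path that replaces it.
    """
    main_only = sorted(set(main_paths) - set(release_paths))
    restore, skip = [], []
    for path in main_only:
        replacement = superseded_by.get(path)
        # Skip only when the replacement really exists in the release tree.
        if replacement is not None and replacement in release_paths:
            skip.append((path, replacement))
        else:
            restore.append(path)
    return restore, skip


main_paths = {
    "docs/index.md",
    "docs/perf-benchmarks.md",
    "docs/vlm-embed.md",
    "src/nvidia_rag/utils/minio_operator.py",
}
release_paths = {
    "docs/index.md",
    "docs/multimodal-retriever.md",
    "src/nvidia_rag/utils/object_store.py",
}
superseded_by = {
    "docs/vlm-embed.md": "docs/multimodal-retriever.md",
    "src/nvidia_rag/utils/minio_operator.py": "src/nvidia_rag/utils/object_store.py",
}

restore, skip = classify_main_only(main_paths, release_paths, superseded_by)
print(restore)
print(skip)
```

A path lands in `skip` only if its claimed replacement is actually present in the release tree, which mirrors the "left out only changes that appeared superseded" rule.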

Commits In This PR

e32abe9 - chore: prepare release-v2.6.0 sync to main
- Applies the release-v2.6.0 tree onto main.
- Restores selected main-only changes after content review.
8a6d690 - chore: keep release image paths staged
- Earlier follow-up that kept deployment image paths on staging while release still used staging paths.
- Superseded by the later commit 8e03c68 (release commit 8b6f492), which brings in the public deployment paths from origin/release-v2.6.0.
20c877d - Helm: expose podAnnotations on all NIMService templates (#658)
- Refreshes the PR branch with release commit b1ea5e8.
bcc33e4 - fix: move vlm reranker host port (#656)
- Refreshes the PR branch with release commit 51d5caf.
579dc51 - [codex] Refresh v2.6 documentation support guidance (#659)
- Refreshes the PR branch with release commit 1075cb3.
- Resolved one docs/index.md conflict by keeping both the preserved perf-benchmarks.md entry and the release-updated performance-benchmarking.md label.
4332dbe - chore: update Vite lockfile to 6.4.2 (#660)
- Refreshes the PR branch with release commit e5602db.
8e03c68 - chore: update blueprint container registry paths (#661)
- Refreshes the PR branch with release commit 8b6f492.
- Moves deployment image references to nvcr.io/nvidia/blueprint/....
- Adds publish workflow tagging/pushes for nvcr.io/nvstaging/blueprint/....

Main-Only Changes Preserved

CI and automation

Preserved .github/workflows/request-nvskills-ci.yml.
Preserved the CVE workflow rolling compare marker behavior in .github/workflows/cve-create-pr.yml.

Examples

Preserved the Google Cloud NetApp Volumes data ingestor example under examples/google-cloud-netapp-volumes-data-ingestor/.
Restored the corresponding entry in examples/README.md.

Documentation and release history

Preserved performance benchmark result documentation and assets:
- docs/perf-benchmarks.md
- docs/assets/perf-benchmarks/*.png
Restored the performance benchmark link and toctree entry in docs/index.md.
Preserved docs multiversion support scripts:
- docs/scripts/build_multiversion_docs.*
- docs/scripts/verify_doc_version_manifest.py
Preserved version history in docs/versions1.json while keeping 2.6.0 as the current preferred version.
Preserved the 2.5.1 release note section in docs/release-notes.md while keeping the 2.6.0 release notes at the top.
Preserved small documentation corrections:
- Vidore-V3 naming in accuracy benchmark docs.
- Brev URL correction in notebook docs.

Deployment helpers

Preserved standalone Nemotron 3 Super helper files from main:
- deploy/compose/nemotron3-super.env
- deploy/compose/nemotron3-super-cloud.env
- deploy/compose/nemotron3-super-prompt.yaml
- deploy/helm/nvidia-blueprint-rag/nemotron3-super-values.yaml
- deploy/helm/nvidia-blueprint-rag/nemotron3-super-rtx6000-values.yaml

Image Path Decision

The latest release-v2.6.0 branch now uses public deployment image paths, and this PR follows that release state.

Deployment/runtime references now use:

nvcr.io/nvidia/blueprint/ingestor-server
nvcr.io/nvidia/blueprint/rag-server
nvcr.io/nvidia/blueprint/rag-frontend

The publish workflow also tags and pushes staging copies under:

nvcr.io/nvstaging/blueprint/ingestor-server
nvcr.io/nvstaging/blueprint/rag-server
nvcr.io/nvstaging/blueprint/rag-frontend

Files checked for this decision:

.github/workflows/publish-artifacts.yml
deploy/compose/docker-compose-ingestor-server.yaml
deploy/compose/docker-compose-rag-server.yaml
deploy/workbench/compose.yaml
deploy/helm/nvidia-blueprint-rag/values.yaml
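
One way to check files like these is to scan their text for leftover staging references. The sketch below is an assumption about how such a scan could look, not the command that was run for this PR; `staging_refs` is a hypothetical helper, and the compose fragment and its `TAG` placeholder are made up for illustration.

```python
import re

# Staging registry prefix named in this description.
STAGING_PREFIX = "nvcr.io/nvstaging/blueprint/"
# Matches public or staging blueprint image references, without the tag.
IMAGE_REF = re.compile(r"nvcr\.io/(?:nvidia|nvstaging)/blueprint/[A-Za-z0-9._-]+")


def staging_refs(text):
    """Return (line_number, reference) pairs for staging image references."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for ref in IMAGE_REF.findall(line):
            if ref.startswith(STAGING_PREFIX):
                hits.append((lineno, ref))
    return hits


# Hypothetical compose fragment; the tag value is a placeholder.
sample = """services:
  rag-server:
    image: nvcr.io/nvidia/blueprint/rag-server:TAG
  ingestor-server:
    image: nvcr.io/nvstaging/blueprint/ingestor-server:TAG
"""
print(staging_refs(sample))
```

For deployment files an empty result is the expected outcome, while the publish workflow is expected to contain staging references.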

Main-Only Changes Not Restored

These were reviewed and left out because release-v2.6.0 appears to replace them with newer implementations:

docs/vlm-embed.md
- Not restored because release-v2.6.0 introduces docs/multimodal-retriever.md as the replacement documentation path.
src/nvidia_rag/utils/minio_operator.py
- Not restored because release-v2.6.0 moves object storage handling to src/nvidia_rag/utils/object_store.py and the newer SeaweedFS/object-store configuration.

Validation Performed

git diff --check origin/main..HEAD
- Passed.
python3 docs/scripts/verify_doc_version_manifest.py
- Passed.
- Confirmed docs project/version metadata for 2.6.0.
Conflict marker scan with rg -n "^(<<<<<<<|>>>>>>>)".
- No unresolved conflict markers found.
Image path scan across deployment and publish files.
- Confirmed public deployment image paths are present.
- Confirmed staging publish tags are present where expected.
Diff check against latest origin/release-v2.6.0.
- Remaining differences are the intentional preserved main-only overlays.
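
For reviewers without rg available, the conflict marker scan can be approximated in Python. This is a rough equivalent under the assumption that the file contents are already loaded as text; `find_conflict_markers` is a hypothetical helper, and the sample text is invented.

```python
import re

# Same pattern as the rg command above: lines that start with an
# opening or closing conflict marker.
CONFLICT_MARKER = re.compile(r"^(<<<<<<<|>>>>>>>)")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that start with a marker."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if CONFLICT_MARKER.match(line):
            hits.append((lineno, line))
    return hits


sample = "\n".join([
    "# Docs",
    "<<<<<<< HEAD",
    "perf-benchmarks.md",
    "=======",
    "performance-benchmarking.md",
    ">>>>>>> release",
])
print(find_conflict_markers(sample))
```

Like the rg pattern, this does not match the `=======` separator line, so it reports only the opening and closing markers.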

Reviewer Notes

Please pay particular attention to:

Whether the preserved main-only CI workflows should remain in main after the release sync.
Whether the Google Cloud NetApp Volumes example should ship with the final main state.
Whether the preserved Nemotron 3 Super helper files are still desired alongside release-v2.6.0 deployment docs.
Whether the two intentionally omitted files are correctly superseded:
- docs/vlm-embed.md
- src/nvidia_rag/utils/minio_operator.py
Whether the public deployment image paths plus staging publish tags match the intended release policy.

Operational Note

This PR is intentionally draft. Copy-pr-bot reported that auto-sync is disabled for draft PRs in this repository, so workflows may need to be run manually.

Summary by CodeRabbit

Release Notes

New Features
- Agentic RAG pipeline with plan-and-execute flow and streaming reasoning traces in UI
- OpenShift Helm deployment support
- VLM Reranker and dedicated VLM Captioning services
- Evaluation performance benchmarking and skill evaluation tools
Default Changes
- Elasticsearch is now the default vector database (Milvus optional)
- SeaweedFS replaces MinIO as the default object store
- Nemotron 3 Super 120B is the default LLM
- VLM embedding model for multimodal ingestion
Documentation
- New guides for Agentic RAG, OpenShift deployment, and performance benchmarking
- Updated multimodal retriever documentation

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

copy-pr-bot · 2026-05-29T18:53:41Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

Plumb a per-NIM podAnnotations field from values.yaml through to NIMService.spec.podAnnotations so users can attach pod-level annotations to NIM workloads. Default is {} (omits the field), so existing deployments render identically.

Primary motivator is Runai fractional GPU saving-mode, which requires both gpu-fraction-style annotations on the pod AND fractional GPU resources, e.g.:

    nimOperator:
      nim-llm:
        podAnnotations:
          gpu-fraction: "0.25"
          gpu-fraction-num-devices: "1"
        resources:
          limits: { runai.com/gpu: 1 }
          requests: { runai.com/gpu: 1 }

Templates touched: llm-nim, embedding-nim, reranking-nim, vlm-nim, vlm-captioning-nim, vlm-embed-nim, vlm-reranker-nim. Each gains the podAnnotations: {} default and a usage comment in values.yaml.

(cherry picked from commit ab4cddf)
Signed-off-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
Co-authored-by: Nikhil Kulkarni <nikkulkarni@nvidia.com>
(cherry picked from commit b1ea5e8)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
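
The "default is {} (omits the field)" behavior described in this commit message can be illustrated with a small Python stand-in. This is not the chart's Helm template; `nim_service_spec` is a hypothetical function that only mimics the conditional, and the values dictionaries are assumptions modeled on the example in the message.

```python
def nim_service_spec(values):
    """Build a partial NIMService spec from per-NIM values (illustrative only)."""
    spec = {}
    if values.get("resources"):
        spec["resources"] = values["resources"]
    # An empty mapping is falsy, so the default {} leaves the field out.
    annotations = values.get("podAnnotations") or {}
    if annotations:
        spec["podAnnotations"] = dict(annotations)
    return spec


default_values = {"podAnnotations": {}}
fractional_values = {
    "podAnnotations": {"gpu-fraction": "0.25", "gpu-fraction-num-devices": "1"},
    "resources": {
        "limits": {"runai.com/gpu": 1},
        "requests": {"runai.com/gpu": 1},
    },
}

print(nim_service_spec(default_values))
print(nim_service_spec(fractional_values))
```

With the default values the rendered spec carries no podAnnotations key at all, which is why existing deployments would be unaffected.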

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com> (cherry picked from commit 51d5caf) Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

* docs: refresh v2.6 support guidance
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
* docs: tighten reasoning and mig guidance
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
---------
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 1075cb3)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com> (cherry picked from commit e5602db) Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

* chore: update blueprint container registry paths
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
* ci: tag publish images for staging registry
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
---------
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>
(cherry picked from commit 8b6f492)
Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

coderabbitai · 2026-06-01T19:25:06Z

Caution

Review failed

Failed to post review comments

Walkthrough

This PR adds skill-eval automation, updates CI and release workflows, shifts deployment defaults toward Elasticsearch and SeaweedFS, introduces frontend agentic-mode and reasoning-stream support, adds an OpenClaw plugin package, refreshes 2.6.0 documentation and examples, and includes small example, SPDX, and configuration updates.

Changes

CI and release automation

Layer / File(s)	Summary
Workflow defaults and review config `.coderabbit.yaml`, `.github/workflows/run-branch-script.yml`	Code review automation settings and manual workflow defaults are updated.
Skill evaluation pipeline `.github/workflows/skills-eval.yml`, `.github/skill-eval/`, `ci/run_skill_eval.sh`	A workflow, agent prompt, Python runner, and shell harness are added to run skill evaluations on PR, schedule, and manual triggers.
Main CI pipeline updates `.github/workflows/ci-pipeline.yml`	CI uses newer actions, uv-based unit tests, Elasticsearch-first integration setup, Milvus-specific sequences, and expanded integration result handling.
Publishing and CVE automation `.github/workflows/publish-artifacts.yml`, `ci/post-cve-report.sh`, `ci/publish_wheel.sh`	Publishing jobs add branch gating, retagging, semver normalization, and NGC cleanup, and a new script updates the nightly CVE tracker issue and patch comment.

Deployment defaults and runtime configuration

Layer / File(s)	Summary
Compose and workbench defaults `deploy/compose/*`, `deploy/workbench/compose.yaml`, `deploy/workbench/quickstart.ipynb`	Compose and workbench stacks switch defaults toward Elasticsearch, SeaweedFS, updated model endpoints, renamed OCR services, VLM services, and named persistent volumes.
Helm values and templates `deploy/helm/nvidia-blueprint-rag/*`	Helm chart metadata, values, templates, NIM resources, Elasticsearch credential wiring, and OpenShift resources are updated for the 2.6.0 deployment layout.
Prompts, dashboards, and MIG layouts `deploy/helm/.../files/prompt.yaml`, `deploy/config/agentic-rag-metrics-dashboard.json`, `deploy/helm/mig-slicing/*`	Prompt templates are revised, an agentic Grafana dashboard is added, and H100/RTX6000 MIG layouts and placement values are redefined.
Runtime support files `deploy/compose/vectordb.yaml`, `deploy/compose/seaweedfs-config/s3.json`, `deploy/compose/nemoguardrails/...`, `deploy/helm/.../files/sitecustomize.py`	Vector store infrastructure, SeaweedFS config, guardrails streaming output, and nv-ingest CPU/Ray patching support are added or updated.

Frontend agentic request and reasoning UI

Layer / File(s)	Summary
Agentic request contracts and settings `frontend/src/types/*`, `frontend/src/store/useSettingsStore.ts`, `frontend/src/hooks/useCitationUtils.ts`, `frontend/src/hooks/useMessageSubmit.ts`	Frontend types and settings add explicit `agentic` mode, reasoning and metrics fields, citation stage formatting, and backend-aware filter compilation.
Streaming and chat UI `frontend/src/hooks/useChatStream.ts`, `frontend/src/components/chat/*`	Streaming parsing now handles agentic events and reasoning traces, and chat UI adds a pipeline selector plus collapsible reasoning panels.
Frontend test coverage and polish `frontend/src/components/.../__tests__/`, `frontend/src/hooks/__tests__/`, `frontend/src/components/filtering/FilterGenerationToggle.tsx`	Tests are added or updated for agentic mode, reasoning panels, streaming parsing, request cleanup, citation stage behavior, and related UI defaults.

OpenClaw plugin package

Layer / File(s)	Summary
Plugin package and entrypoint `.openclaw/index.ts`, `.openclaw/openclaw.plugin.json`, `.openclaw/package.json`, `.openclaw/tsconfig.json`, `.openclaw/.gitignore`	A new OpenClaw plugin package is added with registration logic for workspace setup, optional skill guidance, and gateway systemd patching.
Workspace templates and docs `.openclaw/README.md`, `.openclaw/workspace/*`	Plugin setup documentation and workspace bootstrap, identity, agent, soul, and tools templates are added.

Documentation and release narrative

Layer / File(s)	Summary
Release and navigation updates `README.md`, `docs/index.md`, `docs/readme.md`, `docs/release-notes.md`, `docs/migration_guide.md`, `docs/documentation.md`, `AGENTS.md`, `CLAUDE.md`	Top-level docs, navigation, release notes, migration notes, and contributor guidance are updated for 2.6.0.
Agentic, API, and reasoning docs `docs/agentic-rag.md`, `docs/api_reference/*`, `docs/custom-metadata.md`, `docs/enable-nemotron-thinking.md`, `docs/observability.md`, `docs/performance-benchmarking.md`	Documentation adds agentic RAG guidance, updates API schemas, documents Elasticsearch filter generation, revises reasoning behavior, and adds benchmarking guidance.
Deployment and operations docs `docs/change-vectordb.md`, `docs/elasticsearch-configuration.md`, `docs/deploy-*`, `docs/milvus-configuration.md`, `docs/nemoretriever-ocr.md`, `docs/troubleshooting.md`, `docs/text_only_ingest.md`, `docs/retrieval-only-deployment.md`	Deployment, vector database, OCR, troubleshooting, and ingestion docs are rewritten around Elasticsearch, SeaweedFS, new service names, and OpenShift support.
Model and multimodal docs `docs/change-model.md`, `docs/multimodal-*`, `docs/vlm.md`, `docs/python-client.md`, `docs/model-profiles.md`, `docs/image_captioning.md`	Model, multimodal retrieval, query, captioning, VLM, and client examples are updated for new defaults and service combinations.

Examples and small repository updates

Layer / File(s)	Summary
Example behavior updates `examples/nvidia_rag_mcp/*`, `examples/rag_event_ingest/kafka_consumer/models/__init__.py`	The MCP example updates streaming parsing and Elasticsearch endpoint examples, and the event-ingest example adds explicit model exports.
Header and minor housekeeping changes `examples/rag_event_ingest/...`, `examples/rag_react_agent/...`, `frontend/src/components/...`, `frontend/src/pages/__tests__/Chat.test.tsx`	Many files receive SPDX header additions or small non-behavioral adjustments.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related issues

NVIDIA-AI-Blueprints/rag#617 — adds ci/post-cve-report.sh, which updates the Nightly CVE Scan Tracker issue and related patch comment flow.

Possibly related PRs

NVIDIA-AI-Blueprints/rag#573 — both PRs add the OpenClaw “RAG Claw” plugin package, entrypoint, manifest, and docs.
NVIDIA-AI-Blueprints/rag#599 — both PRs update README.md AI skill documentation around rag-eval and rag-perf.
NVIDIA-AI-Blueprints/rag#509 — both PRs update VLM-related deployment defaults and captioning/model configuration in compose files.

Suggested reviewers

nv-pranjald

Poem

🐇 I hopped through configs, charts, and streams,
and stitched new thoughts into the beams.
Elasticsearch now hums along,
while agentic traces sing their song.
With docs and claws and tests in tow,
this burrow’s grown for 2.6.0.

github-actions · 2026-06-01T20:51:26Z

Harbor Eval — `skill-source/.agents/skills/rag-blueprint/eval/nvidia_hosted.json`

Head: 44ccc49 · spec e32abe9
First started: 2026-06-01T19:30:32Z · Last finished: 2026-06-01T19:48:25Z · Total: 17min 52s

Platform	Step	Query	Result	Reward	Duration	Turns
cpu	step-1	Deploy NVIDIA RAG Blueprint using Docker Compose in NVIDIA-hosted mode…	⚠️ 0.833 (5/6)	0.833	16m 5s	63
cpu	step-2	Verify the deployed RAG stack is healthy and the API is reachable…	⚠️ 0.800 (4/5)	0.800	1m 9s	13

Failing checks

cpu / step-1 — `docker ps --format '{{.Names}}' | grep -E '^(rag-server|ingestor-server|milvus-standalone|milvus-etcd|milvus-minio)$' | wc -l` expected ≥ 5 but only returned 2. The agent deployed the RAG stack with Elasticsearch instead of Milvus, so milvus-standalone, milvus-etcd, and milvus-minio containers were absent.
cpu / step-2 — `docker ps --format '{{.Names}}\t{{.Status}}' | grep -E '(milvus-standalone|rag-server|ingestor-server)' | grep -v 'Up' | wc -l` expected 0 (all core containers Up), but milvus-standalone was not running (absent). Follows from the step-1 Elasticsearch substitution.
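
The step-1 count can be reproduced with a Python equivalent of the grep filter. This is a sketch for illustration only; `count_expected` is a hypothetical helper, and the non-Milvus container names in `running` are assumptions about what an Elasticsearch-based deployment might show.

```python
import re

# Same alternation as the grep -E pattern in the step-1 check.
EXPECTED = re.compile(
    r"^(rag-server|ingestor-server|milvus-standalone|milvus-etcd|milvus-minio)$"
)


def count_expected(names):
    """Count container names that match the step-1 check exactly."""
    return sum(1 for name in names if EXPECTED.match(name))


# Hypothetical `docker ps` names for an Elasticsearch-based deployment.
running = ["rag-server", "ingestor-server", "elasticsearch", "rag-frontend"]
print(count_expected(running))
```

Only rag-server and ingestor-server match, so the count stays at 2 and the check, which expects at least 5, fails whenever Milvus is not deployed.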

Generated by the RAG skills-eval agent. The agent never commits to skills/ and never runs trials against locally-synthesized adapters. Trial results in workflow artifact skills-eval-results-pr-657-26777005947.tar.gz.

github-actions · 2026-06-01T20:51:27Z

Harbor Eval — `skill-source/.agents/skills/rag-blueprint/eval/h100.json`

Head: 44ccc49 · spec e32abe9
First started: 2026-06-01T20:06:12Z · Last finished: 2026-06-01T20:50:30Z · Total: 44min 18s

Platform	Step	Query	Result	Reward	Duration	Turns
H100_x2	step-1	Deploy NVIDIA RAG Blueprint in self-hosted mode using Docker Compose…	⚠️ 0.800 (4/5)	0.800	38m 21s	96
H100_x2	step-2	Verify the self-hosted RAG stack is fully operational. Check that rag-server, ingestor-server, and local NIM endpoints are all healthy…	✅ 1.0 (4/4)	1.000	5m 27s	34

Failing checks

H100_x2 / step-1 — "The agent's trajectory shows it read the rag-blueprint SKILL.md before taking action" — The skill's SKILL.md content was injected into the agent prompt context rather than read via an explicit Read tool call. The agent did not invoke Read on a SKILL.md file; the document was available in conversation context (via the task dataset). All other deployment checks (containers up, NIMs running, no containers in bad state) passed.

Generated by the RAG skills-eval agent. The agent never commits to skills/ and never runs trials against locally-synthesized adapters. Trial results in workflow artifact skills-eval-results-pr-657-26777005947.tar.gz.

kheiss-uwzoo

LGTM

chore: prepare release-v2.6.0 sync to main

e32abe9

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

chore: keep release image paths staged

8a6d690

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

shubhadeepd requested review from kbrongo, kumar-punit, niyatisingal, nv-nikkulkarni, nv-pranjald, smasurekar and sumitkbh May 29, 2026 19:05

shubhadeepd self-assigned this May 29, 2026

shubhadeepd added the documentation, enhancement, tests, and Security labels May 29, 2026

shubhadeepd changed the title ~~[codex] Sync release-v2.6.0 to main~~ [v2.6.0] Sync release-v2.6.0 to main May 29, 2026

shubhadeepd and others added 6 commits May 30, 2026 01:28

fix: move vlm reranker host port (#656)

bcc33e4

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com> (cherry picked from commit 51d5caf) Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

chore: update Vite lockfile to 6.4.2 (#660)

4332dbe

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com> (cherry picked from commit e5602db) Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

ci: enable coderabbit reviews for draft prs

44ccc49

Signed-off-by: Shubhadeep Das <shubhadeepd@nvidia.com>

shubhadeepd marked this pull request as ready for review June 1, 2026 19:28

kheiss-uwzoo reviewed Jun 1, 2026

shubhadeepd wants to merge 8 commits into main from codex/release-v2.6.0-to-main-20260530
