test: add vLLM HTTP logprobs contract test for NeMo-Gym capture by ananthsub · Pull Request #2845 · NVIDIA-NeMo/RL

ananthsub · 2026-06-16T14:47:22Z

Summary

Adds a GPU contract test that pins the vLLM OpenAI HTTP logprobs behavior NeMo-Gym capture depends on, complementing the upstream Gym fix for top_logprobs=null: NVIDIA-NeMo/Gym#1612

With that change^, NeMo Gym's vllm_model forwards logprobs=True + return_tokens_as_token_ids=True and pins top_logprobs=0 to extract per-token ids/logprobs for training. Currnet vLLM versions compute logprobs = top_logprobs if logprobs else None, so omitting top_logprobs (default 0) or sending 0 returns logprobs, while an explicit null returns none — silently emptying captured token ids and freezing training (zero loss mask / grad_norm≈0).

The bug lives in vLLM's request→SamplingParams translation, which is only exercised over the HTTP path (the offline LLM/SamplingParams API used by the other vllm tests does not have this issue). This test drives the real server via the existing nemo_gym_vllm_generation fixture so a vLLM bump that changes the contract fails here instead of silently freezing training.

Changes

tests/unit/environments/test_nemo_gym.py: new @pytest.mark.nemo_gym test test_vllm_http_logprobs_contract, exercising top_logprobs ∈ {omitted, 0, null} against dp_openai_server_base_urls. Omitted/0 must return per-token logprobs whose "token_id:<int>" tokens parse to ints; null must return no logprobs (the divergence motivating the Gym fix — if a future vLLM makes null behave like 0, this fails and signals the workaround can be relaxed).

NeMo-Gym's vllm_model forwards logprobs=True and return_tokens_as_token_ids=True and pins top_logprobs=0 to extract per-token ids and logprobs for training. vLLM computes `logprobs = top_logprobs if logprobs else None`, so omitting top_logprobs (default 0) or sending 0 returns logprobs, while an explicit null returns none and silently empties the captured token ids (freezing training). Add a test that drives the real vLLM OpenAI HTTP server (via the existing nemo_gym_vllm_generation fixture) over top_logprobs in {omitted, 0, null}, where the logprobs/top_logprobs -> sampling-params translation actually lives. The offline LLM API does not exercise that path. This pins the contract against the supported vLLM so a version bump that changes it fails here instead of silently emptying training token ids. Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>

copy-pr-bot · 2026-06-16T14:47:26Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

ananthsub · 2026-06-16T16:40:35Z

/ok to test 7b508e5

terrykong

Thanks for adding this — it's a valuable contract test that pins the vLLM HTTP request→SamplingParams behavior (the offline LLM API tests don't exercise this path) and is designed to fail loudly on a vLLM bump that changes top_logprobs semantics, right where the silent training-freeze lives.

Verified the core reasoning against source:

vLLM v0.20.0 logprobs = self.top_logprobs if self.logprobs else None (protocol.py#L549), top_logprobs default 0; an explicit null skips validation and returns 200 with logprobs=None (validator L625-637).
The RL HTTP wrapper does assert temperature/top_p equality (vllm_worker_async.py#L718-L719), so passing them is required.

Note: I couldn't get a green local run — the shared nemo_gym_vllm_generation fixture failed in Ray worker bootstrap (_get_node_ip_and_free_port) during setup in my multi-worktree box, before the test body ran. That's environmental (affects all nemo-gym vLLM tests), not specific to this test — leaving the green run to CI/L0.

One non-blocking suggestion inline.

Generated by Claude Code

terrykong · 2026-06-16T19:51:27Z

+    null_resp = _chat(None)
+    if null_resp.status_code == 200:
+        assert null_resp.json()["choices"][0].get("logprobs") is None


tests/unit/environments/test_nemo_gym.py:362

Null case can pass vacuously on a non-200. The branch only asserts the contract when status_code == 200; for any non-200 the test passes with zero assertions. The docstring allows "rejects the request outright" as acceptable, but as written a future vLLM 5xx (or an unrelated server error) would also pass silently — the mirror image of the 200-path check this test is built around. At vLLM v0.20.0 explicit null returns 200 with logprobs=None (protocol.py#L549), so the assertion does fire today; this just hardens the rejection branch for future bumps.

Suggested change

null_resp = _chat(None)

if null_resp.status_code == 200:

assert null_resp.json()["choices"][0].get("logprobs") is None

null_resp = _chat(None)

if null_resp.status_code == 200:

assert null_resp.json()["choices"][0].get("logprobs") is None

else:

# A rejection must be a client-side validation error, not an unrelated server failure.

assert 400 <= null_resp.status_code < 500, (

f"expected null top_logprobs accepted-with-None or rejected as 4xx, "

f"got {null_resp.status_code}: {null_resp.text}"

)

Optional nit (wording only): the docstring (L303) and the L353 comment say NeMo-Gym "pins top_logprobs=0". At the pinned Gym submodule SHA f82b601 Gym actually omits top_logprobs on the capture path (vllm_model/app.py#L320-L330) — the =0 pin is what Gym PR #1612 adds. Functionally identical (omit → default 0), so purely a future-reader clarity nit; e.g. "Gym sets logprobs=True/return_tokens_as_token_ids=True; Gym PR #1612 additionally pins top_logprobs=0."

ananthsub requested a review from a team as a code owner June 16, 2026 14:47

ananthsub added the CI:Lfast Runs a fast test suite and re-use nightly `main` container (but sync dependencies to PRs version) label Jun 16, 2026

copy-pr-bot Bot temporarily deployed to public June 16, 2026 16:41 Inactive

copy-pr-bot Bot temporarily deployed to test June 16, 2026 16:43 Inactive

copy-pr-bot Bot temporarily deployed to public June 16, 2026 16:44 Inactive

copy-pr-bot Bot temporarily deployed to public June 16, 2026 16:48 Inactive

terrykong reviewed Jun 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: add vLLM HTTP logprobs contract test for NeMo-Gym capture#2845

test: add vLLM HTTP logprobs contract test for NeMo-Gym capture#2845
ananthsub wants to merge 1 commit into
NVIDIA-NeMo:mainfrom
ananthsub:ansubramania/vllm-logprobs-contract-test

ananthsub commented Jun 16, 2026 •

edited

Loading

Uh oh!

copy-pr-bot Bot commented Jun 16, 2026

Uh oh!

ananthsub commented Jun 16, 2026

Uh oh!

terrykong left a comment

Uh oh!

terrykong Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ananthsub commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Uh oh!

copy-pr-bot Bot commented Jun 16, 2026

Uh oh!

ananthsub commented Jun 16, 2026

Uh oh!

terrykong left a comment

Choose a reason for hiding this comment

Uh oh!

terrykong Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ananthsub commented Jun 16, 2026 •

edited

Loading