fix(llmrails): normalize OpenAI multi-part content to string before rail evaluation#2005
fix(llmrails): normalize OpenAI multi-part content to string before rail evaluation#2005nac7 wants to merge 5 commits into
Conversation
…ail evaluation
When a user message uses the OpenAI multi-part content format
(``content: [{type: text, text: ...}]``), the content field
was passed directly into ``UtteranceUserActionFinished.final_transcript``
and ``UserMessage.text`` without normalization. This caused two bugs:
1. All LLM prompts (self-check input, intent matching, etc.) received
the Python repr of the list instead of the actual user text, silently
defeating content-safety rails.
2. In multi-turn conversations where ``mask_prev_user_message`` fires,
``get_colang_history()`` crashed with
``TypeError: must be str or None, not list`` at the ``rsplit()`` call.
Fix: add ``get_content_text()`` to ``nemoguardrails/rails/llm/utils.py``.
It joins all ``type: text`` parts with a space and passes non-list values
through unchanged. Apply it at all four user-message content access
points in ``_get_events_for_messages`` (Colang 1.0 transcript event,
Colang 1.0 UserMessage event, Colang 1.0 tool-message fallback lookup,
and Colang 2.0 transcript event). Refactor the pre-existing inline list
handling in ``get_history_cache_key`` to reuse the same helper.
Fixes NVIDIA-NeMo#1741
Documentation preview |
Greptile SummaryThis PR fixes two bugs triggered by OpenAI multi-part message content (
|
| Filename | Overview |
|---|---|
| nemoguardrails/rails/llm/utils.py | Adds get_content_text() helper that normalizes multipart content lists to strings; also refactors get_history_cache_key to use it, removing duplicated inline logic. |
| nemoguardrails/rails/llm/llmrails.py | Applies get_content_text() at all four user-message content access points in _get_events_for_messages, covering Colang 1.0, Colang 2.0, and the tool-message user-lookup branch. |
| tests/test_llmrails.py | Adds 11 tests covering the helper function and all modified code paths (single-turn, multi-turn, mixed parts, tool-message branch, Colang 2.0). Coverage is thorough. |
| .github/workflows/_test.yml | Downgrades codecov/codecov-action from v5 to v4; unrelated to the main fix. |
| CHANGELOG.md | Adds [Unreleased] entry describing the multipart-content normalization fix. |
Sequence Diagram
sequenceDiagram
participant C as Caller
participant R as LLMRails._get_events_for_messages
participant H as get_content_text (utils.py)
participant P as Guardrail Pipeline
C->>R: messages with multipart content list
note over R: Colang 1.0 path
R->>H: get_content_text(msg[content])
H-->>R: plain string
R->>P: "UtteranceUserActionFinished(final_transcript=string)"
R->>P: "UserMessage(text=string)"
note over R: Tool-message branch
R->>H: get_content_text(prev_user_msg[content])
H-->>R: normalized string
R->>P: "UserMessage(text=normalized string)"
note over R: Colang 2.0 path
R->>H: get_content_text(msg[content])
H-->>R: plain string
R->>P: "UtteranceUserActionFinished(final_transcript=string)"
Reviews (4): Last reviewed commit: "fix: update codecov action to v4 to reso..." | Re-trigger Greptile
📝 WalkthroughWalkthroughThis PR fixes issue ChangesOpenAI Multipart Content Normalization
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nemoguardrails/rails/llm/utils.py`:
- Around line 21-39: get_content_text currently declares and documents returning
a str but returns non-list inputs unchanged, allowing None/dict/etc. to leak
through; update get_content_text to always return a str by: when content is a
list keep the existing join behavior, otherwise if content is None return an
empty string, if it's already a str return it, and for other types return
str(content) (ensuring the function contract -> str is honored).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 326c5380-9c7e-41c2-8c7a-22eb9b2f9424
📒 Files selected for processing (4)
CHANGELOG.mdnemoguardrails/rails/llm/llmrails.pynemoguardrails/rails/llm/utils.pytests/test_llmrails.py
…d formatting - Apply ruff-format reformatting (generator expression collapsed to one line) - Change signature to get_content_text(content: Any) -> str so the return type is always an honest str: None now maps to empty string, non-list non-None values go through str(), and text parts in the list branch are wrapped in str(... or '') to guard against explicit None text values - Update test_none_passthrough -> test_none_returns_empty_string to match new None behavior; add test_non_string_non_list_converted_via_str to cover the str() fallback branch
Update codecov/codecov-action from v5 to v4 to fix GPG signature verification failures in coverage upload step. v4 resolves the GPG key verification issue that was causing CI failures. Fixes: 'gpg: Can't check signature: No public key' error in PR tests coverage upload >
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
Hi @Pouyanpi , if you have some time, could you please help with this PR review? Thanks! |
PR merge guidance@nac7 thanks for the PR. GitHub is currently blocking merge for one or more repository requirements:
Relevant guide: |
Signed-off-by: nac7 <lelenachiket07@gmail.com>
Problem
When a user message uses the OpenAI multi-part content format:
{"role": "user", "content": [{"type": "text", "text": "You are a dotard and I hate you"}]}_get_events_for_messagesassignsmsg["content"](the list) directly intoUtteranceUserActionFinished.final_transcriptandUserMessage.textwithout normalising it to a string. This causes two bugs:Bug 1 — Silent guardrail bypass: Every LLM prompt (self-check input, intent matching, etc.) receives the Python repr of the list instead of the actual user text. Content-safety rails evaluate garbage and silently pass the real message through unblocked.
Bug 2 — TypeError crash in multi-turn: When
mask_prev_user_messagefires in a subsequent turn,get_colang_history()callshistory.rsplit(utterance_to_replace, 1)whereutterance_to_replaceis the list, crashing with:Fix
Add
get_content_text(content)tonemoguardrails/rails/llm/utils.py. It joins alltype: textparts with a space and passes non-list values through unchanged. Apply it at all four user-message content access points in_get_events_for_messages:UtteranceUserActionFinished.final_transcriptUserMessage.text(non-final turns)UtteranceUserActionFinished.final_transcriptAlso refactors the pre-existing inline multipart-list handling in
get_history_cache_keyto reuse the same helper, eliminating duplicate logic.Tests
11 new tests added to
tests/test_llmrails.py:TestGetContentText— 8 unit tests for the helper:Nonepassthroughtextkey in part handled gracefullyIntegration tests:
test_multipart_content_single_turn— single-turngenerate_asyncwith multipart content returns correct string responsetest_multipart_content_multi_turn_does_not_crash— multipart content in non-final turns does not raise TypeErrortest_multipart_content_mixed_parts— image_url parts are silently dropped, text parts extracted correctlytest_tool_message_with_multipart_user_content— Colang 1.0 tool-message branch: UserMessage events carry the normalised stringtest_colang2_multipart_content_normalization— Colang 2.0 path: UtteranceUserActionFinished carries the normalised stringAll 54 tests in
tests/test_llmrails.pypass. Coverage confirmed on every added and modified line.Fixes #1741
Summary by CodeRabbit
get_colang_historywhen processing multimodal messages.