fix(tools): fail StrReplaceFile when a multi-edit hunk is unmatched#2452
fix(tools): fail StrReplaceFile when a multi-edit hunk is unmatched#2452Osamaali313 wants to merge 2 commits into
Conversation
StrReplaceFile applied every edit with str.replace and then only errored
when the *entire* result was unchanged (`content == original_content`).
So in a multi-edit call where some edits matched and others did not, the
non-matching edits were silently skipped and the tool still reported
success ("Applied N edit(s)"). The model believes all hunks landed when
some never did — a silent loss of requested changes that is easy to hit
when the file has drifted from what the model last read.
Apply edits sequentially and validate that each edit's old string is
present at its turn (a later edit may legitimately target text an earlier
edit introduced). If any edit does not match, fail the whole call without
writing, so the edit batch is atomic and the failure is surfaced with the
offending edit index.
This also corrects the reported replacement count, which was computed
against the original content for every edit and therefore wrong for
sequential dependent edits (e.g. foo->bar then bar->baz reported 1
instead of 2).
Adds tests for the atomic-failure and sequential-dependent-edit cases.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates the string-replace file tool to enforce atomic multi-edit behavior (fail the whole call if any edit doesn’t match) and to treat edits as sequential transformations, with tests covering these behaviors.
Changes:
- Enforce atomic failure when any edit in a batch can’t find its
oldstring at the time it applies. - Apply edits sequentially and compute a sequential “total replacements” count.
- Add tests for atomic multi-edit failure and sequential dependent edits/counts.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/tools/test_str_replace_file.py | Adds tests for atomic multi-edit failure and sequential dependent edits with count messaging. |
| src/kimi_cli/tools/file/replace.py | Implements sequential edit application with per-edit validation and updated replacement counting. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Defensive: an edit whose old string equals its new string matches | ||
| # but changes nothing, so guard the no-op case explicitly. | ||
| if content == original_content: | ||
| return ToolError( | ||
| message="No replacements were made. The old string was not found in the file.", |
| total_replacements = 0 | ||
| for index, edit in enumerate(edits): | ||
| occurrences = content.count(edit.old) | ||
| if occurrences == 0: | ||
| detail = ( | ||
| "The old string was not found in the file." | ||
| if len(edits) == 1 | ||
| else ( | ||
| f"Edit #{index + 1}'s old string was not found " | ||
| "(an earlier edit in this call may have changed that text)." | ||
| ) | ||
| ) | ||
| return ToolError( | ||
| message=( | ||
| f"No replacements were made. {detail} " | ||
| "The file contents may be out of date; " | ||
| "use the Read tool to reload it." | ||
| ), | ||
| brief="No replacements made", | ||
| ) | ||
| content = self._apply_edit(content, edit) | ||
| total_replacements += occurrences if edit.replace_all else 1 |
| if content == original_content: | ||
| return ToolError( | ||
| message="No replacements were made. The old string was not found in the file.", |
| if content == original_content: | ||
| return ToolError( | ||
| message="No replacements were made. The old string was not found in the file.", |
There was a problem hiding this comment.
🟡 Defensive no-op check produces factually incorrect error message
After the new per-edit validation loop confirms that each edit.old IS present in the content (line 145-146), the defensive check at line 168-172 can still trigger when edits produce content identical to the original (e.g., edit.old == edit.new, or edits that cancel each other out). In these cases, the error message "The old string was not found in the file." is factually wrong — the old strings were all found and matched. This misleading message will cause the AI model to re-read the file looking for a string that is actually present, potentially looping without understanding the real issue (that the edit is a no-op).
(Refers to lines 168-172)
Was this helpful? React with 👍 or 👎 to provide feedback.
| occurrences = content.count(edit.old) | ||
| if occurrences == 0: |
There was a problem hiding this comment.
🚩 Empty edit.old string passes validation (pre-existing behavior)
If edit.old is an empty string, content.count("") returns len(content) + 1 (Python's behavior for counting empty substrings), so the occurrences == 0 check will never trigger. Then content.replace("", new_text) inserts new_text between every character. This is pre-existing behavior (the old code also applied edits without validating against empty old), but is worth noting as a potential edge case the model could trigger. Whether to guard against this is a design decision.
Was this helpful? React with 👍 or 👎 to provide feedback.
Address review feedback on the multi-edit validation:
- Empty old_string: `str.count('')` returns `len(content) + 1` (never 0),
so the occurrence check could not catch it and `str.replace('', new)`
would splice `new` between every character. Reject an empty old string
per edit with a clear message.
- No-op edits: when every edit matches but the result equals the original
(old_string == new_string, or edits that cancel out), the final guard
previously reported "old string was not found" — factually wrong and
liable to send the model re-reading the file for a string that is
present. Report that the edit(s) left the file unchanged instead.
Adds tests for both cases.
|
Thanks @devin-ai-integration — both are valid. Addressed in 884c671:
Added tests for both ( |
Problem
StrReplaceFileapplies edits withstr.replaceand then only errors when the entire result is unchanged:So in a multi-edit call where some edits match and others don't, the non-matching edits are silently skipped and the tool still reports success:
The model is told all hunks landed when one never did — a silent loss of requested changes. This is easy to hit when the file has drifted from what the model last read (the same situation behind the frequent
old_string not foundconfusion), because a partially-stale batch looks like it succeeded.A second, smaller bug: the reported replacement count was computed against the original content for every edit, so it's wrong for sequential dependent edits (e.g.
foo→barthenbar→bazreported1instead of2, since"bar"isn't in the original"foo").Fix
Apply edits sequentially and validate each edit's
oldstring is present at its turn (a later edit may legitimately target text an earlier edit introduced). If any edit doesn't match, fail the whole call without writing, so a multi-edit is atomic and the failure is surfaced with the offending edit index. The replacement count is now tallied as edits actually apply, fixing the sequential-edit count too.Behavior preserved:
"No replacements were made").old == new) is still guarded.Tests
Added
test_replace_multiple_edits_one_not_found_is_atomic(asserts the batch fails and the file is left untouched) andtest_replace_sequential_dependent_edits_count(dependent edits apply and the count is correct). Fulltests/tools/test_str_replace_file.pypasses (15), plus the plan-mode/additional-dirs suites (19).ruff check/ruff formatclean.