Skip to content

Adding allow_pickle argument#8875

Open
ericspod wants to merge 4 commits into
Project-MONAI:devfrom
ericspod:GHSA-wg9g-w2j2-8pgr
Open

Adding allow_pickle argument#8875
ericspod wants to merge 4 commits into
Project-MONAI:devfrom
ericspod:GHSA-wg9g-w2j2-8pgr

Conversation

@ericspod
Copy link
Copy Markdown
Member

Addresses GHSA-wg9g-w2j2-8pgr.

Description

This adds allow_pickle=False argument to NumpyReader. This addresses the security concern but allows users to override the setting to load pickled data if needed.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
@ericspod ericspod requested review from KumoLiu and Nic-Ma as code owners May 27, 2026 10:44
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0a954fa8-6a30-4067-b29d-58571b633b21

📥 Commits

Reviewing files that changed from the base of the PR and between 11f9bcc and 4788503.

📒 Files selected for processing (1)
  • monai/data/image_reader.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • monai/data/image_reader.py

📝 Walkthrough

Walkthrough

NumpyReader gains an allow_pickle constructor parameter (default False) stored on the instance and forwarded to np.load() instead of always using True. Tests assert that loading pickled .npy files or invalid mmap_mode raises ValueError without the flag and succeed when allow_pickle and valid mmap_mode are provided.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the main change: adding an allow_pickle argument to NumpyReader.
Description check ✅ Passed Description covers the security advisory reference, implementation details, and relevant checkbox updates matching the changes made.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/data/test_numpy_reader.py (1)

77-107: ⚡ Quick win

Add one pickled .npz test for parity.

NumpyReader.read() now gates pickle loading for both .npy and .npz, but the new assertions only exercise .npy. Add one .npz object-array case for default failure + allow_pickle=True success.

As per coding guidelines, "Ensure new or modified definitions will be covered by existing or new unit tests."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/data/test_numpy_reader.py` around lines 77 - 107, Add a new test
mirroring test_npy_pickle but using a pickled .npz archive: create a dict saved
with np.savez_compressed (or np.savez) containing an object-array, then assert
that NumpyReader.read()/get_data() raises ValueError by default and that
constructing NumpyReader(allow_pickle=True) (and calling read with the same mmap
kwargs used in test_kwargs) successfully returns the original data; reference
the existing test_npy_pickle and test_kwargs patterns and the NumpyReader.read /
NumpyReader.get_data calls to implement the default-fail +
allow_pickle=True-success assertions for a .npz file.
monai/data/image_reader.py (1)

1234-1235: ⚡ Quick win

Document the default failure mode explicitly.

Please add a Raises: note for the default allow_pickle=False path (pickled content load failure), so callers can handle it intentionally.

As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@monai/data/image_reader.py` around lines 1234 - 1235, Update the docstring
for the function that accepts the allow_pickle parameter (the doc in
monai/data/image_reader.py) to explicitly document the default failure mode: add
a "Raises:" section stating that when allow_pickle=False (the default) loading
files that contain pickled/object arrays will raise a ValueError (e.g., "Object
arrays cannot be loaded when allow_pickle=False"), so callers can catch/handle
it; mention allow_pickle in that Raises entry so the behavior is clearly tied to
the parameter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@monai/data/image_reader.py`:
- Around line 1234-1235: Update the docstring for the function that accepts the
allow_pickle parameter (the doc in monai/data/image_reader.py) to explicitly
document the default failure mode: add a "Raises:" section stating that when
allow_pickle=False (the default) loading files that contain pickled/object
arrays will raise a ValueError (e.g., "Object arrays cannot be loaded when
allow_pickle=False"), so callers can catch/handle it; mention allow_pickle in
that Raises entry so the behavior is clearly tied to the parameter.

In `@tests/data/test_numpy_reader.py`:
- Around line 77-107: Add a new test mirroring test_npy_pickle but using a
pickled .npz archive: create a dict saved with np.savez_compressed (or np.savez)
containing an object-array, then assert that NumpyReader.read()/get_data()
raises ValueError by default and that constructing
NumpyReader(allow_pickle=True) (and calling read with the same mmap kwargs used
in test_kwargs) successfully returns the original data; reference the existing
test_npy_pickle and test_kwargs patterns and the NumpyReader.read /
NumpyReader.get_data calls to implement the default-fail +
allow_pickle=True-success assertions for a .npz file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ba124431-25c2-4779-bdb3-a1dbf260e9e2

📥 Commits

Reviewing files that changed from the base of the PR and between 0a8d945 and 11f9bcc.

📒 Files selected for processing (2)
  • monai/data/image_reader.py
  • tests/data/test_numpy_reader.py

Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
@ericspod ericspod mentioned this pull request May 27, 2026
36 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant