Fix IndexError in get_most_frequent when column is all null #824

Open
TheChyeahhh wants to merge 1 commit into mljar:master from TheChyeahhh:fix/get-most-frequent-empty

@TheChyeahhh

Description

Fixes #770

When PreprocessingMissingValues encounters a column containing exclusively null values, x.value_counts() returns an empty Series. The subsequent sorted(...)[0] raises an IndexError: list index out of range.
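The failure can be reproduced outside the library with a few lines of pandas. This is a minimal sketch: the function name get_most_frequent_unguarded is hypothetical, and the expression inside sorted(...) is not shown in this PR, so the one below is an assumption about the pre-fix code rather than a copy of it.

```python
import pandas as pd


def get_most_frequent_unguarded(x):
    """Assumed shape of the helper before the fix (not copied from the repo)."""
    counts = x.value_counts()  # null values are dropped by default
    # Sort (value, count) pairs by descending count and take the first pair.
    first = sorted(counts.items(), key=lambda item: -item[1])[0]
    return first[0]


all_null = pd.Series([None, None, None], dtype="object")

# value_counts() has nothing left to count once the nulls are dropped.
print(len(all_null.value_counts()))

message = None
try:
    get_most_frequent_unguarded(all_null)
except IndexError as exc:
    # Indexing [0] into the empty sorted list is what fails.
    message = str(exc)
    print(message)
```

The error comes from the [0] on an empty list, not from pandas itself, which is why the traceback in #770 shows a plain list IndexError.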

Fix

Added an early return of None when value_counts() is empty, allowing the caller to proceed with a safe fallback fill value.
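A sketch of what the guarded helper and a caller-side fallback might look like is given below. The exact emptiness check used in the PR is not shown here, so counts.empty is an assumption; fill_value_for and its "missing" default are hypothetical and only illustrate how a caller could handle the None result.

```python
import pandas as pd


def get_most_frequent(x):
    """Return the most frequent non-null value of x, or None if there is none."""
    counts = x.value_counts()
    if counts.empty:  # assumed form of the two added lines
        return None
    first = sorted(counts.items(), key=lambda item: -item[1])[0]
    return first[0]


def fill_value_for(column, fallback="missing"):
    """Hypothetical caller: use the fallback when no frequent value exists."""
    value = get_most_frequent(column)
    return fallback if value is None else value


mixed = pd.Series(["a", "b", "a", None], dtype="object")
all_null = pd.Series([None, None], dtype="object")

print(fill_value_for(mixed))     # most frequent non-null value
print(fill_value_for(all_null))  # falls back instead of raising
```

Returning None keeps the helper free of any opinion about what the fill value should be, which leaves that decision with the caller.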

Testing

  • get_most_frequent on a column with non-null values → returns the most frequent value (existing behavior preserved)
  • get_most_frequent on an all-null column → returns None (previously raised IndexError)
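The two cases above could be written as pytest-style checks along the following lines. This is only a sketch: the helper is redefined locally in its assumed form so the snippet runs on its own, and the test names are hypothetical. A numeric NaN column is used here to cover the float dtype as well.

```python
import pandas as pd


def get_most_frequent(x):
    """Local stand-in for the patched helper (assumed form)."""
    counts = x.value_counts()
    if counts.empty:
        return None
    return sorted(counts.items(), key=lambda item: -item[1])[0][0]


def test_non_null_column_returns_most_frequent():
    # 3 appears twice, every other value once.
    assert get_most_frequent(pd.Series([3, 1, 3, 2])) == 3


def test_all_null_column_returns_none():
    # A float column holding only NaN has no countable values.
    assert get_most_frequent(pd.Series([float("nan")] * 4)) is None
```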

Changes

supervised/preprocessing/preprocessing_utils.py — 2 lines added

When PreprocessingMissingValues._fit_na_fill is called on a column
containing exclusively null values, x.value_counts() returns an empty
Series. The subsequent sorted(...)[0] then raises:
  IndexError: list index out of range

This adds an early return of None for the empty case, allowing the
caller to proceed with a safe fallback fill value.

Fixes mljar#770

Development

Successfully merging this pull request may close these issues.

IndexError: list index out of range