fix: handle category dtype for target column in _build_dataframe #823

Open
Apoorv-c wants to merge 1 commit into mljar:master from Apoorv-c:fix/category-dtype-target

@Apoorv-c

When a user converts a numeric-coded column to the pandas category dtype and passes it as the target y, downstream libraries such as LightGBM raise:
ValueError: pandas dtypes must be int, float or bool.
Fields with bad pandas dtypes: target: category

Fix: in _build_dataframe, detect a pd.Series target with CategoricalDtype and convert it to its underlying numeric or object dtype before any other processing. Also fix _save_data_info so that it correctly identifies numeric category targets.
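
For reviewers, here is a minimal sketch of the conversion described above. It is not the code from this PR: `decategorize_target` and `is_numeric_category` are hypothetical helper names, and the fallback to float64 for integer categories with missing values is an assumption made for illustration. Only public pandas calls are used.

```python
import pandas as pd


def is_numeric_category(y):
    """True if y is a categorical Series whose categories are numeric.

    Hypothetical helper, illustrating the kind of check that
    _save_data_info would need for numeric category targets.
    """
    return (
        isinstance(y, pd.Series)
        and isinstance(y.dtype, pd.CategoricalDtype)
        and pd.api.types.is_numeric_dtype(y.cat.categories.dtype)
    )


def decategorize_target(y):
    """Return y with a categorical dtype replaced by its categories' dtype.

    Hypothetical helper; non-categorical inputs are returned unchanged.
    """
    if not (isinstance(y, pd.Series) and isinstance(y.dtype, pd.CategoricalDtype)):
        return y
    cat_dtype = y.cat.categories.dtype
    # Assumption: integer categories with missing values cannot be cast
    # back to int, so fall back to float64 to keep the NaNs.
    if pd.api.types.is_integer_dtype(cat_dtype) and y.isna().any():
        return y.astype("float64")
    # Numeric categories become int/float; string categories become object/str.
    return y.astype(cat_dtype)


if __name__ == "__main__":
    y = pd.Series([0, 1, 1, 0], name="target").astype("category")
    print(y.dtype, "->", decategorize_target(y).dtype)
```

After the conversion the target no longer carries the category dtype, so the dtype check that produces the ValueError quoted above should not be triggered by the target column.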

Fixes #652


Development

Successfully merging this pull request may close these issues.

Failure to properly preprocess categorical data
