ENH: Make categories setitem error more readable #46646

Closed
2 of 3 tasks
galipremsagar opened this issue Apr 5, 2022 · 4 comments · Fixed by #48087
Labels: Categorical (Categorical Data Type), Enhancement, Error Reporting (Incorrect or improved errors from pandas), good first issue

Comments

@galipremsagar

galipremsagar commented Apr 5, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

>>> import pandas as pd
>>> ps = pd.Series([1, 2, 3], dtype="category")

# this is the current behavior
>>> ps[0] = 5
Traceback (most recent call last):
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1085, in __setitem__
    self._set_with_engine(key, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1149, in _set_with_engine
    self._mgr.setitem_inplace(loc, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/base.py", line 190, in setitem_inplace
    arr[indexer] = value
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/_mixins.py", line 249, in __setitem__
    value = self._validate_setitem_value(value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/categorical.py", line 1457, in _validate_setitem_value
    return self._validate_scalar(value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/categorical.py", line 1484, in _validate_scalar
    raise TypeError(
TypeError: Cannot setitem on a Categorical with a new category (5), set the categories first

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1140, in __setitem__
    self._set_with(key, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1167, in _set_with
    self._set_labels(key, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1179, in _set_labels
    self._set_values(indexer, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1185, in _set_values
    self._mgr = self._mgr.setitem(indexer=key, value=value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 337, in setitem
    return self.apply("setitem", indexer=indexer, value=value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 304, in apply
    applied = getattr(b, f)(**kwargs)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 1604, in setitem
    self.values[indexer] = value
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/_mixins.py", line 249, in __setitem__
    value = self._validate_setitem_value(value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/categorical.py", line 1457, in _validate_setitem_value
    return self._validate_scalar(value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/categorical.py", line 1484, in _validate_scalar
    raise TypeError(
TypeError: Cannot setitem on a Categorical with a new category (5), set the categories first

Issue Description

_validate_scalar in categorical.py raises its exception while an earlier exception is still being handled, so the stack trace chains both errors and becomes confusing to read. Currently the code is:

raise TypeError(
    "Cannot setitem on a Categorical with a new "
    f"category ({fill_value}), set the categories first"
)

If it is replaced with:

raise TypeError(
    "Cannot setitem on a Categorical with a new "
    f"category ({fill_value}), set the categories first"
) from None

then the error stack trace becomes cleaner, as follows:

Expected Behavior

# This is much cleaner: the previous exception's stack trace, which is irrelevant at this point, is no longer shown.
>>> ps[0] = 5
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1140, in __setitem__
    self._set_with(key, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1167, in _set_with
    self._set_labels(key, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1179, in _set_labels
    self._set_values(indexer, value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/series.py", line 1185, in _set_values
    self._mgr = self._mgr.setitem(indexer=key, value=value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 337, in setitem
    return self.apply("setitem", indexer=indexer, value=value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 304, in apply
    applied = getattr(b, f)(**kwargs)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 1604, in setitem
    self.values[indexer] = value
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/_mixins.py", line 249, in __setitem__
    value = self._validate_setitem_value(value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/categorical.py", line 1457, in _validate_setitem_value
    return self._validate_scalar(value)
  File "/nvme/0/pgali/envs/cudfdev/lib/python3.8/site-packages/pandas/core/arrays/categorical.py", line 1484, in _validate_scalar
    raise TypeError(
TypeError: Cannot setitem on a Categorical with a new category (5), set the categories first
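
For readers unfamiliar with `raise ... from None`, the effect can be reproduced without pandas. The following is only a minimal sketch: `validate`, `setitem`, and `render` are hypothetical stand-ins made up for this illustration, not pandas functions, and the retry-on-failure shape of `setitem` is an assumption based on the two call paths visible in the traceback above.

```python
import traceback


def validate(value, categories, suppress_context):
    """Hypothetical stand-in for Categorical._validate_scalar."""
    if value in categories:
        return value
    err = TypeError(
        "Cannot setitem on a Categorical with a new "
        f"category ({value}), set the categories first"
    )
    if suppress_context:
        # `from None` hides the exception that was being handled.
        raise err from None
    raise err


def setitem(value, categories, suppress_context):
    """Hypothetical setter that retries on a fallback path (an assumption)."""
    try:
        validate(value, categories, suppress_context)
    except TypeError:
        # The fallback fails the same way while the first error is still
        # being handled, so Python records it as the new error's context.
        validate(value, categories, suppress_context)


def render(suppress_context):
    """Return the formatted traceback of a failing assignment as a string."""
    try:
        setitem(5, [1, 2, 3], suppress_context)
    except TypeError as exc:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return ""


chained = render(suppress_context=False)
clean = render(suppress_context=True)
marker = "During handling of the above exception"
print(marker in chained, marker in clean)  # prints: True False
```

The earlier exception is still attached as `__context__`; `from None` only sets `__suppress_context__`, so the information remains available to anyone who inspects the exception object.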

Installed Versions

Replace this line with the output of pd.show_versions()

@galipremsagar galipremsagar added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 5, 2022
@mroeschke mroeschke added Enhancement Error Reporting Incorrect or improved errors from pandas Categorical Categorical Data Type good first issue and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jul 6, 2022
@mroeschke mroeschke changed the title BUG: Make categories setitem error more readable ENH: Make categories setitem error more readable Jul 6, 2022
@samrao1997
Contributor

Is this still available? I am looking to take on my first issue.

@Nikhil-Mudgal

take

@daspartho
Contributor

daspartho commented Aug 15, 2022

Is it available? The PR above is stale.

@daspartho
Contributor

take

5 participants