ENH: improve output for testing.assert_*_equal with Categoricals #18069

Merged
jreback merged 1 commit into pandas-dev:master from the categorical_repr_in_testing branch on Nov 2, 2017

Conversation

topper-123
Contributor

@topper-123 topper-123 commented Nov 1, 2017

The new CategoricalDtype made the output from failing tests uninformative, see #18056.

The error message is now:

>>> c1 = pd.CategoricalIndex(['a', 'b'])
>>> c2 = pd.CategoricalIndex(['c', 'd'])
>>> s1 = pd.Series([1,2], index=c1)
>>> s2 = pd.Series([1,2], index=c2)
>>> pd.testing.assert_series_equal(s1, s2)
AssertionError: Series.index are different

Attribute "dtype" are different
[left]:  CategoricalDtype(categories=['a', 'b'], ordered=False)
[right]: CategoricalDtype(categories=['c', 'd'], ordered=False)

Which is much better.
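For anyone who wants to check the message locally, here is a minimal sketch that captures the failure text instead of letting the traceback print. The variable name `failure_message` is only illustrative, and the exact wording of the repr may differ between pandas versions.

```python
import pandas as pd

# Two Series with equal values but different CategoricalIndex categories
c1 = pd.CategoricalIndex(["a", "b"])
c2 = pd.CategoricalIndex(["c", "d"])
s1 = pd.Series([1, 2], index=c1)
s2 = pd.Series([1, 2], index=c2)

try:
    pd.testing.assert_series_equal(s1, s2)
except AssertionError as exc:
    # Keep the text so it can be inspected or asserted on
    failure_message = str(exc)
else:
    failure_message = None

print(failure_message)
```

On a pandas build that includes this change, the captured text should name both `CategoricalDtype` objects with their categories rather than a bare dtype name.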

@topper-123 force-pushed the categorical_repr_in_testing branch from cfd531f to 90f2aa4 on November 1, 2017 22:35
@@ -1074,8 +1074,12 @@ def assert_categorical_equal(left, right, check_dtype=True,
def raise_assert_detail(obj, message, left, right, diff=None):
    if isinstance(left, np.ndarray):
        left = pprint_thing(left)
    elif pd.api.types.is_categorical_dtype(left):
Member

@jschendel jschendel Nov 2, 2017

Looks like is_categorical_dtype is already being imported from pandas.core.dtypes.common at the top of the file, so you should be able to just use is_categorical_dtype(left) here.

    if isinstance(right, np.ndarray):
        right = pprint_thing(right)
    elif pd.api.types.is_categorical_dtype(right):
Member

Same here: is_categorical_dtype(right) should work.
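The idea behind the reviewed branch can be sketched as below. This is not the code in pandas/util/testing.py: `describe_operand` is a hypothetical helper, `np.array2string` stands in for the internal `pprint_thing`, an `isinstance` check stands in for `is_categorical_dtype`, and the assumption that the new branch substitutes the dtype's `repr` is inferred from the message in the PR description, since the diff excerpt does not show the branch body.

```python
import numpy as np
import pandas as pd


def describe_operand(value):
    """Hypothetical helper: format one side of a failed comparison."""
    if isinstance(value, np.ndarray):
        # Stand-in for pandas' internal pprint_thing
        return np.array2string(value, separator=", ")
    if isinstance(value, pd.CategoricalDtype):
        # repr() lists categories and orderedness; str() gives only "category"
        return repr(value)
    return value


left = pd.CategoricalIndex(["a", "b"]).dtype
right = pd.CategoricalIndex(["c", "d"]).dtype

# str() makes the two dtypes look identical, repr() tells them apart
uninformative = (str(left), str(right))
message = "[left]:  {}\n[right]: {}".format(
    describe_operand(left), describe_operand(right)
)
print(message)
```

The comparison of `str` and `repr` shows why the earlier output was unhelpful: both dtypes print as the same short name unless the repr is used.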

@@ -117,7 +117,8 @@ Categorical
^^^^^^^^^^^

- Bug in :meth:`DataFrame.astype` where casting to 'category' on an empty ``DataFrame`` causes a segmentation fault (:issue:`18004`)
-
- Error messages in the testing module have been improved when items have
  different CategoricalDtype (:issue:`18069`)
Member

@jschendel jschendel Nov 2, 2017

Can you add backticks to ``CategoricalDtype``? Looks like that's what was typically done in the previous whatsnew.

@topper-123 force-pushed the categorical_repr_in_testing branch from 90f2aa4 to fde514e on November 2, 2017 07:23
@codecov

codecov bot commented Nov 2, 2017

Codecov Report

Merging #18069 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18069      +/-   ##
==========================================
+ Coverage   91.25%   91.26%   +<.01%     
==========================================
  Files         163      163              
  Lines       50120    50120              
==========================================
+ Hits        45737    45740       +3     
+ Misses       4383     4380       -3
| Flag | Coverage Δ | |
|---|---|---|
| #multiple | 89.07% <ø> (+0.02%) | ⬆️ |
| #single | 40.32% <ø> (-0.06%) | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/util/testing.py | 100% <ø> (ø) | ⬆️ |
| pandas/io/gbq.py | 25% <0%> (-58.34%) | ⬇️ |
| pandas/core/frame.py | 97.75% <0%> (-0.1%) | ⬇️ |
| pandas/plotting/_converter.py | 65.2% <0%> (+1.81%) | ⬆️ |

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 15fa4bd...fde514e.

@topper-123
Contributor Author

Alright, I've adjusted according to comments from @jschendel.

@jreback added the Categorical (Categorical Data Type) and Testing (pandas testing functions or related to the test suite) labels Nov 2, 2017
@jreback jreback added this to the 0.21.1 milestone Nov 2, 2017
@jreback jreback merged commit bb4fa65 into pandas-dev:master Nov 2, 2017
@jreback
Contributor

jreback commented Nov 2, 2017

thanks @topper-123

ghost pushed a commit to reef-technologies/pandas that referenced this pull request Nov 3, 2017
1kastner pushed a commit to 1kastner/pandas that referenced this pull request Nov 5, 2017
@topper-123 topper-123 deleted the categorical_repr_in_testing branch November 6, 2017 21:47
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Dec 8, 2017
TomAugspurger pushed a commit that referenced this pull request Dec 11, 2017
Successfully merging this pull request may close these issues.

ENH: assert_* has very superficial description if CategorialIndex are different
4 participants