Skip to content

ENH: Add index to output of assert_series_equal on category and datetime values #33575

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 26, 2020

Conversation

amilbourne
Copy link
Contributor

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

This is really an extension of #31435, which added index output to the messages produced when series (and DataFrames) are compared. I failed to notice that categorical values and datetime values were evaluated through a different code path and didn't add the index output for these data types. Since the original problem (index reordering) could just as easily happen for these types the functionality should be there as well. Also, it makes the output more consistent.

I added a couple of new tests to surface and check the additional output.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment but not a huge blocker. Otherwise lgtm @jreback

)
elif is_extension_array_dtype(left.dtype) and is_extension_array_dtype(right.dtype):
assert_extension_array_equal(left._values, right._values)
assert_extension_array_equal(
left._values, right._values, index_values=np.asarray(left.index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have to use asarray here or could you just pass back the DatetimeArray? Would make for nicer output if the date time values were shown

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point which I hadn't thought of. However, it isn't quite as simple as that because we only know that the data is an extension array at this point - the index may not be. Also, my understanding of extension array is very poor, but I thought it could be used for things other than datetimes?

The main reason though, is that the data is currently dislayed as numeric timestamps, so it seems reasonable to fix the index display at the same time as the data. And that sounds like a seperate PR to me. I am looking at doing it (when I get time), but as mentioned above, I need to figure out extension arrays first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thanks for looking at the PR. Still not sure if I'm supposed to click 'Resolve conversation' now.

@WillAyd WillAyd added the Testing pandas testing functions or related to the test suite label Apr 22, 2020
@jreback jreback added this to the 1.1 milestone Apr 23, 2020
@jreback jreback merged commit c9faed4 into pandas-dev:master Apr 26, 2020
@jreback
Copy link
Contributor

jreback commented Apr 26, 2020

thanks @amilbourne

rhshadrach pushed a commit to rhshadrach/pandas that referenced this pull request May 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants