-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG/API: other object type check in Series/DataFrame.equals #34402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG/API: other object type check in Series/DataFrame.equals #34402
Conversation
I vaguely recall having a similar discussion on this recently, but I don't recall the conclusion. Perhaps it was on the PR changing Checking that the Side-note: we should consider changing |
Yes, we recently had the discussion about whether the dtype should be equal -> #33940 (which also touched on the subclasses) I am fine with exact
That's would actually be fine for me (geopandas is also a good test case ;), we just didn't have any "equals" tests). |
is there a test that triggers this (so we don't regress) |
makes sense |
@jbrockmendel there are multiple options being discussed, so can you be more explicit in what makes sense to you? |
For example, do we want exact type equality, or are subclasses fine to be considered equal? |
I was referring to the edit in the PR. Haven't formed a strong opinion otherwise. |
@jreback since you keep adding back the Reshaping label, what has the |
For now, I went with |
why do u keep taking it off? this doesn’t have any labels and is weird otherwise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs a whatsnew note, yes?
pandas/tests/generic/test_generic.py
Outdated
|
||
class MySeries(pd.Series): | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you put this in tests/series/test_sublcass and similar for frame/test_sublcass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a test about the subclass working correctly (anything related to the subclassing machinery, eg correct use of _constructor
to preserve the subclass in operations etc), but only about our own equals method accepting subclasses, so I think the test rather fits here with the other equals
tests.
But I can certainly use the tm.SubclassedSeries
helper to avoid creating a dummy class here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this belongs with the rest of the tests & needs for DataFrame subclass as well. having 2 places is just very friendly to future folks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having 2 places where equals
is tested is equally unfriendly I think ..
Both ways have some aspect of the tests splitted in two places, and as mentioned above, this test is actually about our equals
implementation, not about the behaviour of a subclassing mechanism, and thus more belongs here IMO.
Will add dataframe to the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls just move this, ALL subclass tests are in the same file. This is breaking the pattern for no reason.
All good here? |
no i asked to move the test quite a while ago |
Fixed the merge conflict. |
thanks @jorisvandenbossche |
xref geopandas/geopandas#1420
First, this PR is fixing the "bug" that we shouldn't rely on
_constructor
being a class that can be used inisinstance
(see #32638 for the general discussion about this, this is the only place in our code where_constructor
is used like this, AFAIK).And even if
_constructor
would be a class, it wouldn't necessarily be the correct class to check with (or not more correct thantype(self)
)But, so this also brings up the API question: what are the "requirements" we put on the type of
other
? Should it be the same type? (as then could also change it toif not type(self) is type(other): ...
)Or is a subclass sufficient (with
isinstance
) ?The problem with an isinstance checks with subclasses is that then the order matters (eg
subdf.equals(df)
would not necessarily give the same answer asdf.equals(subdf)
)