Skip to content

WIP: ensure names are pinned correctly #24429

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 15 commits into from

Conversation

jbrockmendel
Copy link
Member

Discussed in #24397, this defines a metaclass for PandasObject that does two things:

  • at class-definition, iterate over all the methods in the class namespace and assert that the method's __name__ matches the namespace name.
  • define comparisons on the classes so we can simplify the NotImplemented checks to if type(self) < type(other)

The __name__ have a whole bunch that currently fail, will whittle those down.

@pep8speaks
Copy link

pep8speaks commented Dec 26, 2018

Hello @jbrockmendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on January 10, 2019 at 02:41 Hours UTC

@topper-123
Copy link
Contributor

topper-123 commented Dec 26, 2018

Are we sure we want pandas objects to have a metaclass?

Any Python (sub)class may only have one metaclass, so by implementing one in pandas itself we would restrict users of pandas from implementing their own metaclass on subclassed pandas objects, when/if they need it. I think that’s an important issue and makes me leaning on -1 on this.

Can’t these checks not be implemented using other methods, e.g. using simple pytests tests?

@jbrockmendel
Copy link
Member Author

Can’t these checks not be implemented using other methods, e.g. using simple pytests tests?

Yah, I came to the same conclusion and changed the name-checking bit to go into tests.base. The class inequality thing I think is neat, but is really unrelated to the rest of the PR.

@topper-123
Copy link
Contributor

The class inequality thing I think is neat, but is really unrelated to the rest of the PR.

IMO, it’s unconventional, so usage of this pattern will surprise users and require them to dig (quite deep) into the pandas source code to understand what the comparisons mean. I think just using issubclass directly is much better/clearer, and will also leave the metaclass slot open for users to use in their own subclasses.

So, I’m -1 on adding the metaclass to PandasObject...

@jbrockmendel
Copy link
Member Author

So, I’m -1 on adding the metaclass to PandasObject

While I might want to revisit it at some point, it is definitely not relevant to this PR, so just reverted it.

continue
if name in ignore:
continue
if inspect.ismethod(attr) or isinstance(attr, property):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Python3 methods on (non-instantiated) classes are functions, so for those ismethod will fail. You need to add inspect.isfunction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks

if 'Subclassed' in cls.__name__:
# dummy classes defined in tests
return
for name in dir(cls):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This iteration looks heavy. What time does it take to run through all cases defined in the test_pinned_names parametrization? Should test_pinned_names below get a pytest slow marker?

Would it not be better to do specific tests for the cases where methods are generated dynamically instead of this broader test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What time does it take to run through all cases defined in the test_pinned_names parametrization? Should test_pinned_names below get a pytest slow marker?

Locally python -m pytest pandas/tests/test_base.py -k test_pinned_names takes 1.02 seconds.

Would it not be better to do specific tests for the cases where methods are generated dynamically instead of this broader test?

No, that's an invitation to have things fall through the cracks down the road.

@jbrockmendel jbrockmendel changed the title WIP: use metaclass to ensure names are pinned correctly WIP: ensure names are pinned correctly Dec 27, 2018
@codecov
Copy link

codecov bot commented Dec 28, 2018

Codecov Report

Merging #24429 into master will decrease coverage by 49.3%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #24429       +/-   ##
===========================================
- Coverage   92.31%      43%   -49.31%     
===========================================
  Files         163      163               
  Lines       51987    51987               
===========================================
- Hits        47990    22356    -25634     
- Misses       3997    29631    +25634
Flag Coverage Δ
#multiple ?
#single 43% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/core/categorical.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-98.65%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-95.46%) ⬇️
pandas/io/sas/sas7bdat.py 0% <0%> (-91.17%) ⬇️
pandas/io/sas/sas_xport.py 0% <0%> (-90.15%) ⬇️
pandas/core/tools/numeric.py 10.44% <0%> (-89.56%) ⬇️
... and 121 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e64d9c...9420548. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 28, 2018

Codecov Report

Merging #24429 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24429      +/-   ##
==========================================
+ Coverage   92.38%   92.38%   +<.01%     
==========================================
  Files         166      166              
  Lines       52310    52324      +14     
==========================================
+ Hits        48326    48341      +15     
+ Misses       3984     3983       -1
Flag Coverage Δ
#multiple 90.8% <100%> (ø) ⬆️
#single 43.08% <96.42%> (+0.01%) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/sparse.py 92.17% <100%> (ø) ⬆️
pandas/core/indexes/frozen.py 93.33% <100%> (+1.22%) ⬆️
pandas/util/testing.py 88.09% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d56db9d...602cb0e. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Dec 28, 2018

looks ok, but this raises a number of questions about existing behaviors as you have noted.

@gfyoung gfyoung added Bug Internals Related to non-user accessible pandas implementation labels Dec 30, 2018
@jbrockmendel
Copy link
Member Author

Ref #23551 need to see what it would take to handle that here

@jbrockmendel
Copy link
Member Author

Do we want to move forward with something like this? If not let’s close

@jbrockmendel jbrockmendel deleted the meta branch September 17, 2019 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants