Skip to content

replace NotImplementedError with AttributeError #31549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

RichardBruskiewich
Copy link

@RichardBruskiewich RichardBruskiewich commented Feb 1, 2020

Fixes Python inspection of members - bug reported in #31474

Fixes Python inspection of members - bug reported in pandas-dev#31474
@pep8speaks
Copy link

pep8speaks commented Feb 1, 2020

Hello @RichardBruskiewich! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 12:1: W293 blank line contains whitespace
Line 15:1: W293 blank line contains whitespace

Comment last updated at 2020-04-21 01:39:59 UTC

@jbrockmendel
Copy link
Member

i think TypeError would be more appropriate

regardless, needs tests

@WillAyd
Copy link
Member

WillAyd commented Feb 1, 2020

Given the purpose of this method I don’t think it makes sense to change this. What issue is this fixing exactly?

Local DataFrame Exception inherits from multiple Exception targets to satisfy several use case failure modes
@WillAyd
Copy link
Member

WillAyd commented Feb 2, 2020

I am -1 on changing anything here. This looks to just be a bug with Python:

https://bugs.python.org/issue35108

So should be fixed upstream if needed

@RichardBruskiewich
Copy link
Author

RichardBruskiewich commented Feb 2, 2020

The missing function needs to throw an AttributeError to avoid breaking the Python built-in "inspect" self reflection on the class which fails obscurely since the given method is a member of the DataFrame class but previously simply throws an NotImplementedException exception rather than indicating that it is a missing member (function attribute) of the class.

OK... I concede that https://bugs.python.org/issue35108 documents this "bug" in Python. but it has been over a year ago since anyone has moved on it at their end!

I attempted to cover the missing method with a multi-inheritance "DataFrame" Exception to deal with the polymorphism of failure modes here (also inherited from the TypeError in case it is deemed valid in this context).

Not sure what delegating the issue upstream means in practical terms here. Do we mean asking the Python maintainers to get their act together or are we kicking the ball down to all the libraries that attempt to use inspect on Pandas data frames?

This "bug" is rendering the use of a CLI library - Fire (https://github.com/google/python-fire) library useless in my code when I upgrade to Pandas 0.25.1. I guess I could ask them to change their code perhaps? Specifically, the _GetMember(component, args) method in https://github.com/google/python-fire/blob/master/fire/core.py crashes with this "bug".

Regarding tests, @jbrockmendel , I leave it to the Pandas gurus here. The proposed change is a small one which fixes the above unfriendly interaction of DataFrames with inspect.getmembers(). If you need a test to code that, there is a snippet of code at the top of #31474

@RichardBruskiewich
Copy link
Author

RichardBruskiewich commented Feb 2, 2020

I have trouble considering this a Python bug. The previous Pandas 0.24.3 worked fine. 0.25.1 doesn't. Python inspect getmembers and the AttributeError exception have been implemented since Python 2.1.

The unimplemented method was tagged as @Property which means it is pretending to be an attribute. Would it make sense to simply remove this method/attribute completely if it is not really a class (method) "attribute" of DataFrames?

@MarcoGorelli
Copy link
Member

Regarding tests, @jbrockmendel , I leave it to the Pandas gurus here.

Thanks for the info you provided, but in general pull requests aren't accepted without tests (unless they're just fixing docstrings / typos), see the contributing guide

pandas is serious about testing and strongly encourages contributors to embrace test-driven development (TDD). This development process “relies on the repetition of a very short development cycle: first the developer writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test.” So, before actually writing any code, you should write your tests. Often the test can be taken from the original GitHub issue. However, it is always worth considering additional use cases and writing corresponding tests.

Adding tests is one of the most common requests after code is pushed to pandas. Therefore, it is worth getting in the habit of writing tests ahead of time so this is never an issue.

@jorisvandenbossche
Copy link
Member

@RichardBruskiewich would it also fix your usecase if we simply raise a TypeError? (the custom error subclassing from both NotImplementedError, TypeError and AttributeError seems a bit strange)

@TomAugspurger
Copy link
Contributor

I'd also like to see this solved upstream. I'll try to take a bit of time today. Doesn't seem too bad.

@TomAugspurger
Copy link
Contributor

python/cpython#18330 for reference.

@RichardBruskiewich
Copy link
Author

RichardBruskiewich commented Feb 3, 2020

@RichardBruskiewich would it also fix your usecase if we simply raise a TypeError? (the custom error subclassing from both NotImplementedError, TypeError and AttributeError seems a bit strange)

Alas, not it would not suffice to use 'TypeError' since the core behaviour of inspect.getmembers() is to trap AttributeError exceptions while it iterates through attributes. See the code snippet in between 345 and 362 in https://github.com/python/cpython/blob/master/Lib/inspect.py.

If Pandas would simply raise the AttributeError either in lieu of the NotImplementedError or alongside (see my attempt to create a polymorphic Exception class by parental inheritance), then inspect would work as expected.

My problem is that at least one other significant third party library - the Python (Google-hosted) Fire CLI library - applies inspect.getmembers to interrogate the data types flowing through its CLI processing mechanism. See line 635 inhttps://github.com/google/python-fire/blob/master/fire/core.py.

It's an "all or none" process which currently propagates the NotImplementedError exception which means the code fails completely with Pandas 0.25.1 version DataFrames (but not with the earlier 0.24.2 release) because of this flaw in the code, wherever we collectively decide to place the blame.

I know that this feels like a hack and a bandaid solution, but when I came across the idea of a polymorphic Exception definition, it seemed acceptable to "solve" the issue this way. i.e.

class DataFrameError(NotImplementedError, AttributeError, TypeError):
pass
...
def theNaughtyMethod:
raise DataFrameError("I'm being cantankerous!")

Methods throwing more than one class of exception are not unheard of in the world of computing, although of course, each exception is generally considered to convey a unique failure condition; however, in this case, it is simply a matter of getting the code to be a good citizen in the wild west of computing, especially when the sheriff in town happens to be called Python and different Deputy libraries really want to use Python methods to interrogate Pandas.

May I also make one more comment in closing? I'm not too familiar with the full semantics of the @Property decorator, but my simpleminded understanding is that it is a succinct way of adding a getter, setter and deleter to a class variable (not a method) which is an "attribute".

At this moment, I'm actually wondering out loud why the @Property decorator is even required (or appropriate) on the _constructor_expanddim (or even the _constructor) method(s) since such methods don't have getter, setter and deletor character to them? I'm wondering if the inspect.getmembers would simply ignore these class methods since they are not really class attributes in the true sense of the word.

Anyhow, enough said. I hope that somebody in the Python / Pandas / Fire / ?? community can fix this "bug" since it is disabling my own real world software system being built for translational medicine for the US National Institutes of Health (just to put a context on my dilemma...).

@jorisvandenbossche
Copy link
Member

@TomAugspurger Contribution to CPython, cool! :)

Now, personally, I wouldn't mind changing the error in pandas, if that solves some downstream annoyances now (the python fix will only be available in a further future).

In the end, do we really care which exception is being raised? In principle, this should never be called, or the user should never see this error, IIUC?

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Feb 3, 2020

Right, DataFrame._constructor_expanddim should never be called in principle. So if we're going for 100% correctness, an AssertionError might be more appropriate.

That said, since this is never called (in theory) I don't think we care what's returned. I'd be fine with just returning None, or perhaps something like

@property
def _constructor_expanddim(self):
    def constructor(*args, **kwargs):
        raise AssertionError("Not supported for DataFrames!")
    return constructor

to please mypy.

@TomAugspurger
Copy link
Contributor

@RichardBruskiewich for a test, I think something simple like

def test_get_members():
    inspect.getmembers(pd.DataFrame())

will suffice. That can go somewhere like pandas/tests/frame/test_subclass.py. We may need a similar fix for Series and constructor_sliced.

@jorisvandenbossche
Copy link
Member

I think your code example with returning a function that raises an error is actually a good idea, that also maps with the expected return value.

@TomAugspurger
Copy link
Contributor

@RichardBruskiewich do you have time to update this PR with something like #31549 (comment)?

@jbrockmendel
Copy link
Member

medicine for the US National Institutes of Health

While I agree with others that this should be fixed upstream, I find myself motivated to help out with this short-term.

@RichardBruskiewich Is Series._constructor_sliced also an issue, or just DataFrame._constructor_expanddim?

@RichardBruskiewich
Copy link
Author

RichardBruskiewich commented Apr 21, 2020

Hello everyone,
Going back to the original bug report #31474:

Problem description
A "Not supported for DataFrames Exception" is triggered by a simple inspect.getmembers() call on a DataFrame instance. There are some useful libraries like the Python 'fire' library which are disabled by this bug, which was not present in Pandas Release 0.24.2 but emerged in 0.25.0

Whatever solves this problem is fine by me. I need to mention now that the code I wrote that depended on Python Fire ( a Google hosted project?) simply wouldn't work unless I pinned it to 0.24.2.

That same project didn't get funded for further development this spring, so the code base is frozen for posterity, and I've moved onto other funded projects. Thus, my own personal motivation to get the bug fixed has faded a bit. However, it does seem to me that such a major impedance mismatch between Pandas and such a useful Google-hosted CLI library should somehow be mitigated.

Somebody with more intimate knowledge about these code bases: please simply do the needful!

@RichardBruskiewich
Copy link
Author

def test_get_members():
    inspect.getmembers(pd.DataFrame())

Added this one test to the patch-1 pull request submitted

@jreback jreback added this to the 1.1 milestone Apr 21, 2020
@jreback
Copy link
Contributor

jreback commented Apr 21, 2020

closing in favor of #33628

@jreback jreback closed this Apr 21, 2020
@TomAugspurger
Copy link
Contributor

@RichardBruskiewich I think that #33628 has the preferred approach.

That was just merged to master, if you want to try it out. I don't think the changes here should be needed any more.

@RichardBruskiewich
Copy link
Author

RichardBruskiewich commented Apr 22, 2020

@RichardBruskiewich I think that #33628 has the preferred approach.

That was just merged to master, if you want to try it out. I don't think the changes here should be needed any more.

Thank you Tom, for the update.

I updated my code to use the latest master branch and tested things. Seemed to work fine. Thanks! All I need to do now is update to the next official release of Pandas which will include the patch.

Sorry that I lacked the experience with the code base and time to clean up my own recommended patches. I'll see if the latest master works in my Fire infested project.

Sincere regards, R

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Not supported for DataFrames Exception triggered by a simple inspect.getmembers() call on a DataFrame
8 participants