-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
update NDFrame __setattr__ to match behavior of __getattr__ #9004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@jakevdp can you add some tests to validate this (e.g. your new behavior). also a release note (you can use this exact example, with a code-block for the before), in the API section pls (and reference the original issue). I think this is ok and seems to fix the 'hiding' issue. The fundamental issue is that allowing columnar convenience access via attributes collides with 'regular' attribute access (as well as possible method ovewriting). |
It occurs to me that there's another option here: we could make it so that |
the problem with that is people expect to be able to set arbitrary attributes on objects (even though the propagation does not happen) - eg except for certain attributes these are lost when combined I think that would be a nice change - but that would have to wait for 0.16 for consideration going to release 0.15.2 shortly (I think this fix is ok as its really a bug) |
@jakevdp awesome. can you add the example in the release notes as well. We can merge this and discuss whether to change at a later date (e.g. setattr on a DataFrame to create a column rather than an attribute) |
I updated the whatsnew in the most recent commit. Let me know if you want that to be modified in any way. |
This looks fine to me as a fix. I think we can safely assume that users haven't been relying on the broken behavior. In the long term, I do think we should consider just making |
Thanks for cc'ing me. One question I have is will this attribute persist? Jake, what is the return of:
In regard to subclassing Frames, I actually don't do that per-se in my package; however, I know GeoPandas does. We use a composite class, so we're not storing attributes on the DataFrame. Of course, if it ever does come to the point that the DataFrame can store arbitrary attributes, it would require us to rethink the design I'm sure. In fact, I do plan to rework this in the future to subclass like GeoPandas. But in any case, I guess to summarize my thoughts is I'd say if this PR is generally good for Pandas, then +1 and we will work around any changes you guys make. It might be helpful to get someone from GeoPandas to check this out too. |
@hugadams Nope, the attribute is not persisted through arithmetic -- this PR does not change that behavior. |
@shoyer Cool, thanks. |
Hi @hugadams – I don't think this change will affect the example you bring up. Note that, unless there's something I'm overlooking, the only thing this current fix changes is the case where:
I think that unless someone is doing that exact sequence of events, this change will not affect anything. |
|
||
try: | ||
object.__getattribute__(self, name) | ||
return object.__setattr__(self, name, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to return here (not that it really matters, but you did remove the unnecessary return below!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was either this, or this followed by return None
. I decided to go for brevity 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see why it's needed now :)
This addresses the issue raised in #8994.
Here is the behavior before this PR:
Here is the behavior after this PR:
The fix boils down to the fact that when
data.y
is called, Python first callsdata.__getattribute__('y')
before callingdata.__getattr__('y')
, but whendata.y = 5
is called, Python goes directly todata.__setattr__('y', 5)
without any prior search for defined attributes.I don't think there are any unintended side-effects to this change, but I'd appreciate some other folks thinking through whether this is the right solution. Thanks!