-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DF.__setitem__ creates extension column when given extension scalar #34875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ct column when given an extension scalar
Hello @justinessert! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-07-10 22:53:42 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Can you add a release note to 1.1.0.rst under bug fixes?
Checking if extension dtype via built in function instead of manually Co-authored-by: Tom Augspurger <[email protected]>
Thanks @TomAugspurger for all your time in reviewing this PR and the original issue. Everything looks to be passing and I think I have addressed all your comments, please lmk if you would like to see any other changes! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only a brief look
but this patch needs some work
too much special casing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
Is the change in the DataFrame constructor also tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment on the test, but I think the change here look good.
Can someone help me understand what this recent test failure is? I could be wrong, but it doesn't seem to be a unit test or formatting error |
I restarted the job that crashed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small comments, but generally looks good to me now!
@jreback all good? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry thought i left these comments a while back
), | ||
], | ||
) | ||
def test_constructor_period_data(self, data, dtype): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change the name -> test_constructor_scalar_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinessert here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
value, len(self.index), infer_dtype | ||
) | ||
else: | ||
value = cast_scalar_to_array(len(self.index), value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the code and this is easily possible. please try to do as I have described. I don't see any reason not to.
ping on green. |
thanks @justinessert |
Thanks a lot @justinessert ! (and sorry for the difficult back and forth discussion) |
No problem! @jorisvandenbossche @jreback @TomAugspurger Thanks for all your help in getting this PR through! |
This PR is in response to Issue 34832.
These changes follow the suggestions from reviewers on the PR to fix an issue where
df['a']=pd.Period('2020-01')
would create an object column instead of aperiod[M]
column.One potential issue that I see with these changes is that this code requires the
shape
parameter passed intocast_scalar_to_array
to be an int, even though the functions docstring claims that it accepts a tuple.I'm happy to work on resolving that issue, but first wanted to clarify that it is necessary. Based on my perspective, there would be no reason to call this function to create a multi-dimensional array, but I surely don't understand the full interconnectivity of this package.
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff