Skip to content

DF.__setitem__ creates extension column when given extension scalar #34875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 42 commits into from
Jul 11, 2020
Merged

DF.__setitem__ creates extension column when given extension scalar #34875

merged 42 commits into from
Jul 11, 2020

Conversation

justinessert
Copy link
Contributor

@justinessert justinessert commented Jun 19, 2020

This PR is in response to Issue 34832.
These changes follow the suggestions from reviewers on the PR to fix an issue where df['a']=pd.Period('2020-01') would create an object column instead of a period[M] column.

One potential issue that I see with these changes is that this code requires the shape parameter passed into cast_scalar_to_array to be an int, even though the functions docstring claims that it accepts a tuple.

I'm happy to work on resolving that issue, but first wanted to clarify that it is necessary. Based on my perspective, there would be no reason to call this function to create a multi-dimensional array, but I surely don't understand the full interconnectivity of this package.

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@pep8speaks
Copy link

pep8speaks commented Jun 19, 2020

Hello @justinessert! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-07-10 22:53:42 UTC

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Can you add a release note to 1.1.0.rst under bug fixes?

@gfyoung gfyoung added Bug ExtensionArray Extending pandas with custom dtypes or arrays. Indexing Related to indexing on series/frames, not to indexes themselves labels Jun 20, 2020
@justinessert
Copy link
Contributor Author

Thanks @TomAugspurger for all your time in reviewing this PR and the original issue. Everything looks to be passing and I think I have addressed all your comments, please lmk if you would like to see any other changes!

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only a brief look
but this patch needs some work

too much special casing

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Is the change in the DataFrame constructor also tested?

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment on the test, but I think the change here look good.

@justinessert
Copy link
Contributor Author

Can someone help me understand what this recent test failure is? I could be wrong, but it doesn't seem to be a unit test or formatting error

@TomAugspurger
Copy link
Contributor

I restarted the job that crashed.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small comments, but generally looks good to me now!

@jorisvandenbossche
Copy link
Member

@jreback all good?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry thought i left these comments a while back

@justinessert justinessert requested a review from jreback July 10, 2020 17:03
),
],
)
def test_constructor_period_data(self, data, dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change the name -> test_constructor_scalar_data

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

value, len(self.index), infer_dtype
)
else:
value = cast_scalar_to_array(len(self.index), value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the code and this is easily possible. please try to do as I have described. I don't see any reason not to.

@jreback
Copy link
Contributor

jreback commented Jul 10, 2020

ping on green.

@jreback jreback added this to the 1.1 milestone Jul 10, 2020
@jreback jreback merged commit 331093e into pandas-dev:master Jul 11, 2020
@jreback
Copy link
Contributor

jreback commented Jul 11, 2020

thanks @justinessert

@jorisvandenbossche
Copy link
Member

Thanks a lot @justinessert ! (and sorry for the difficult back and forth discussion)

@justinessert
Copy link
Contributor Author

No problem!

@jorisvandenbossche @jreback @TomAugspurger Thanks for all your help in getting this PR through!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug ExtensionArray Extending pandas with custom dtypes or arrays. Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants