DF.__setitem__ creates extension column when given extension scalar #34875

justinessert · 2020-06-19T18:54:51Z

This PR is in response to issue #34832.
These changes follow the reviewers' suggestions on the PR and fix an issue where df['a'] = pd.Period('2020-01') would create an object column instead of a period[M] column.
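
For reference, the behaviour described above can be reproduced with a short script. This is a minimal sketch, assuming pandas 1.1.0 or later (the release this fix targets); the frame contents and the column name "x" are illustrative only.

```python
import pandas as pd

# Illustrative frame; any non-empty index works.
df = pd.DataFrame({"x": [1, 2, 3]})

# Assign an extension scalar to a new column via DataFrame.__setitem__.
df["a"] = pd.Period("2020-01")

# With the fix, the scalar is broadcast into a period column rather than
# an object column holding Period objects.
print(df["a"].dtype)
```

Before the fix, the same assignment produced an object-dtype column.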

One potential issue with these changes is that the code requires the shape parameter passed to cast_scalar_to_array to be an int, even though the function's docstring says it accepts a tuple.

I'm happy to work on resolving that, but first wanted to check whether it is necessary. From my perspective, there would be no reason to call this function to create a multi-dimensional array, but I certainly don't understand the full interconnectivity of this package.
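
To illustrate why an int length is enough for the extension case: extension arrays are one-dimensional, so broadcasting an extension scalar never needs a tuple shape. The sketch below is not the code in this PR; broadcast_extension_scalar is a hypothetical helper built on the public pd.array, used only to show the idea.

```python
import pandas as pd


def broadcast_extension_scalar(value, length):
    """Hypothetical helper (not part of pandas): repeat `value` `length` times.

    pd.array infers the extension dtype from the scalars it is given,
    so a Period yields a period array and an Interval an interval array.
    """
    if not isinstance(length, int):
        # Extension arrays are 1-D, so a tuple shape has no meaning here.
        raise TypeError("length must be an int for extension scalars")
    return pd.array([value] * length)


periods = broadcast_extension_scalar(pd.Period("2020-01"), 3)
intervals = broadcast_extension_scalar(pd.Interval(0, 5), 2)
print(periods.dtype, intervals.dtype)
```

Rejecting a tuple outright is one possible design; the alternative raised above is to keep tuple support for NumPy dtypes only.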

pep8speaks · 2020-06-19T18:54:55Z

Hello @justinessert! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-07-10 22:53:42 UTC

TomAugspurger

Thanks. Can you add a release note to 1.1.0.rst under bug fixes?

pandas/core/dtypes/cast.py

pandas/tests/dtypes/cast/test_infer_dtype.py

Checking if extension dtype via built in function instead of manually Co-authored-by: Tom Augspurger <[email protected]>

justinessert · 2020-06-23T20:48:14Z

Thanks @TomAugspurger for all your time in reviewing this PR and the original issue. Everything looks to be passing and I think I have addressed all your comments, please lmk if you would like to see any other changes!

jreback

only a brief look
but this patch needs some work

too much special casing

pandas/core/dtypes/cast.py

pandas/core/frame.py

pandas/core/dtypes/cast.py

jorisvandenbossche

Thanks for working on this!

Is the change in the DataFrame constructor also tested?

pandas/core/dtypes/cast.py

pandas/core/frame.py

TomAugspurger

Just one comment on the test, but I think the changes here look good.

pandas/tests/frame/test_constructors.py

justinessert · 2020-06-30T19:01:05Z

Can someone help me understand what this recent test failure is? I could be wrong, but it doesn't seem to be a unit test or formatting error

TomAugspurger · 2020-06-30T19:19:23Z

I restarted the job that crashed.

jorisvandenbossche

A few small comments, but generally looks good to me now!

pandas/tests/frame/methods/test_combine_first.py

pandas/tests/frame/test_constructors.py

doc/source/whatsnew/v1.1.0.rst

Co-authored-by: Joris Van den Bossche <[email protected]>

jorisvandenbossche · 2020-07-08T13:14:57Z

@jreback all good?

jreback

sorry thought i left these comments a while back

pandas/tests/frame/indexing/test_setitem.py

pandas/tests/frame/test_constructors.py

jreback · 2020-07-10T20:47:05Z

pandas/tests/frame/test_constructors.py

+            ),
+        ],
+    )
+    def test_constructor_period_data(self, data, dtype):


can you change the name -> test_constructor_scalar_data

@justinessert here.
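
A sketch of what a constructor test covering several extension scalars under the suggested name might look like. It is written as a plain loop rather than with pytest.mark.parametrize so that it runs standalone, and the specific cases are assumptions rather than the PR's actual test parameters.

```python
import pandas as pd


def test_constructor_scalar_data():
    # Each case pairs an extension scalar with the dtype class expected
    # for the resulting columns (illustrative cases, not the PR's list).
    cases = [
        (pd.Period("2020-01"), pd.PeriodDtype),
        (pd.Interval(left=0, right=5), pd.IntervalDtype),
    ]
    for data, dtype_cls in cases:
        df = pd.DataFrame(index=[0, 1], columns=["a", "b"], data=data)
        assert isinstance(df["a"].dtype, dtype_cls)
        assert isinstance(df["b"].dtype, dtype_cls)

        # Build the expected frame from explicit extension arrays.
        arr = pd.array([data] * 2)
        expected = pd.DataFrame({"a": arr, "b": arr})
        pd.testing.assert_frame_equal(df, expected)


test_constructor_scalar_data()
```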

jreback · 2020-07-10T20:52:06Z

pandas/core/frame.py

+                    value, len(self.index), infer_dtype
+                )
+            else:
+                value = cast_scalar_to_array(len(self.index), value)


I see the code and this is easily possible. please try to do as I have described. I don't see any reason not to.

jreback · 2020-07-10T22:45:06Z

ping on green.

jreback · 2020-07-11T02:03:05Z

thanks @justinessert

jorisvandenbossche · 2020-07-11T12:19:26Z

Thanks a lot @justinessert ! (and sorry for the difficult back and forth discussion)

justinessert · 2020-07-13T13:01:57Z

No problem!

@jorisvandenbossche @jreback @TomAugspurger Thanks for all your help in getting this PR through!

Bugfix to make DF.__setitem__ create extension column instead of object column when given an extension scalar

0ec5911

removed bad whitespace

9336955

TomAugspurger reviewed Jun 19, 2020

gfyoung added the Bug, ExtensionArray, and Indexing labels Jun 20, 2020

justinessert and others added 13 commits June 22, 2020 10:17

Apply suggestions from code review

01fb076

Checking if extension dtype via built in function instead of manually Co-authored-by: Tom Augspurger <[email protected]>

added missing :

5c8b356

modified cast_extension_scalar_to_array test to include an Interval type

2c1f640

added user-facing test for extension type bug

d509bf4

fixed pep8 issues

e231bb1

added note about bug in setting series to scalar extension type

18ed043

corrected order of imports

a6b18f4

corrected order of imports

cbc29be

fixed black formatting errors

2f79822

removed extra comma

0f9178e

updated cast_scalar_to_arr to support tuple shape for extension dtype

bfa18fb

removed unneeded code

e7e9a48

added coverage for datetime with timezone in extension_array test

291eb2d

justinessert mentioned this pull request Jun 23, 2020

BUG: DataFrame construction from EA scalar gives object dtype #34959

Open

3 tasks

justinessert added 3 commits June 23, 2020 14:31

added TODO

3a788ed

correct line that was too long

38d7ce5

fixed dtype issue with tz test

a5e8df5

jreback requested changes Jun 23, 2020

pandas/core/dtypes/cast.py

pandas/core/frame.py

TomAugspurger reviewed Jun 24, 2020

pandas/core/frame.py

pandas/core/dtypes/cast.py

pandas/core/dtypes/cast.py

justinessert added 2 commits June 24, 2020 11:21

creating distinct arrays for each column

5e439bd

resolving mypy error

6cc7959

jorisvandenbossche reviewed Jun 24, 2020

pandas/core/dtypes/cast.py

pandas/core/frame.py

created new test for period df construction

6495a36

TomAugspurger approved these changes Jun 30, 2020

pandas/tests/frame/test_constructors.py

added assert_frame_equal to period_data test

42e7afa

jorisvandenbossche approved these changes Jul 6, 2020

pandas/tests/frame/methods/test_combine_first.py

pandas/tests/frame/test_constructors.py

doc/source/whatsnew/v1.1.0.rst

justinessert and others added 4 commits July 7, 2020 14:42

Using pandas array instead of df constructor for better test

8343df3

Co-authored-by: Joris Van den Bossche <[email protected]>

changed wording

a50a42c

Merge branch 'master' of https://github.com/justinessert/pandas

3452c20

pylint fixes

6f3fb51

jorisvandenbossche approved these changes Jul 8, 2020

jreback requested changes Jul 8, 2020

pandas/tests/frame/indexing/test_setitem.py

justinessert added 2 commits July 8, 2020 09:56

parameterized test and added comment

b95cdfc

removed extra comma

6830fde

jreback requested changes Jul 9, 2020

pandas/tests/frame/test_constructors.py

justinessert added 2 commits July 10, 2020 10:13

Merge branch 'master' into master

6653ef8

parameterized test

c73a2de

justinessert requested a review from jreback July 10, 2020 17:03

jreback requested changes Jul 10, 2020

renamed test

100f334

jreback approved these changes Jul 10, 2020

jreback added this to the 1.1 milestone Jul 10, 2020

jreback merged commit 331093e into pandas-dev:master Jul 11, 2020

jreback mentioned this pull request Jul 11, 2020

REF: Dataframe.__init__ #35226

Closed

simonjayhawkins mentioned this pull request Sep 10, 2020

BUG: DataFrame.__setitem__ creates object-dtype array for extension type scalars #34832

Closed

3 tasks
