[#16737] Index type for Series with empty data #32053

SaturnFromTitan · 2020-02-17T12:08:01Z

closes Unexpected results creating an empty Series #16737
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

I picked up all the notes from #16737 where it was suggested to use Index over RangeIndex for empty data.

SaturnFromTitan · 2020-02-20T20:40:17Z

Can you please take another look @WillAyd. I think I addressed all your comments

WillAyd · 2020-02-20T22:18:59Z

@SaturnFromTitan maybe missed the comment but did we actually decide that this is the right approach? I think it's actually confusing for the index types to differ:

Here is the current behaviour on master:

# series
>>> pd.Series().index
Index([], dtype='object')
>>> pd.Series([]).index
RangeIndex(start=0, stop=0, step=1)
>>> pd.Series([1]).index
RangeIndex(start=0, stop=1, step=1)

# frame
>>> pd.DataFrame().index
Index([], dtype='object')
>>> pd.DataFrame([[]]).index
RangeIndex(start=0, stop=1, step=1)
>>> pd.DataFrame([[1]]).index
RangeIndex(start=0, stop=1, step=1)

Why would we only want to change the second Series case shown above?
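The comparison above can be reproduced with a small helper. This is only a sketch: `empty_index_kinds` is a hypothetical name rather than part of pandas, and the index class each call yields depends on the installed pandas version, so no output is asserted here.

```python
import warnings


def empty_index_kinds():
    """Map each constructor call from the comparison to its index class name."""
    import pandas as pd  # imported lazily so the module loads without pandas

    with warnings.catch_warnings():
        # Some pandas versions warn about the default dtype of an empty Series.
        warnings.simplefilter("ignore")
        cases = {
            "pd.Series()": pd.Series(),
            "pd.Series([])": pd.Series([]),
            "pd.Series([1])": pd.Series([1]),
            "pd.DataFrame()": pd.DataFrame(),
            "pd.DataFrame([[]])": pd.DataFrame([[]]),
            "pd.DataFrame([[1]])": pd.DataFrame([[1]]),
        }
    # Report only the class name (e.g. "Index" or "RangeIndex") of each index.
    return {expr: type(obj.index).__name__ for expr, obj in cases.items()}
```

Printing the returned dictionary under different pandas releases makes it easy to see which constructors disagree about the index type for empty data.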

SaturnFromTitan · 2020-02-21T17:24:04Z

@WillAyd I went that direction as it was suggested in the comments of #16961:

SaturnFromTitan · 2020-02-23T13:11:05Z

I see your point though @WillAyd, and I agree that merging the current changes could lead to more confusion. Do you think we should:

1. keep the current master behaviour, or
2. adjust the default index types for pd.Series() and pd.DataFrame() to use a RangeIndex as well?

Not sure about all the details, but option 2 seems to be the most consistent.

jreback

this is actually a non-trivial change, but I am not sure we can easily deprecate this.

@jorisvandenbossche @TomAugspurger

jreback · 2020-02-23T16:17:50Z

@SaturnFromTitan also you should add the expansion test as indicated in the OP

TomAugspurger

I don't think several of your changes to the tests are correct. If we're still getting RangeIndexes there, then we'll need to update the code rather than the tests.

SaturnFromTitan · 2020-02-28T15:00:14Z

Before I continue to work on this, can we please clarify if this should be changed in the first place or not?

And if it should change, is my current implementation the way to go, or should I rather adjust the index of pd.Series()? See WillAyd's comment and my reply.

I'd also be fine with closing this issue+PR as a "Won't change". It seems like any solution could yield something that's surprising to some users and expected to others.

SaturnFromTitan · 2020-02-28T15:00:52Z

forgot the pings: @jreback @TomAugspurger @WillAyd

TomAugspurger · 2020-02-28T16:57:39Z

IMO, the default index for pd.Series([]) should match pd.Index([]), an empty object-dtype index.

jorisvandenbossche · 2020-04-01T07:36:52Z

I agree with @TomAugspurger that the default index for pd.Series([]) should match pd.Index([]), an empty object-dtype index.

We already do a deprecation warning for pd.Series([]) becoming object dtype in the future, I suppose we can add a second warning for the Index also becoming an object index in the future? Or would that be too annoying?

(@SaturnFromTitan I reopened the PR not to suggest that you should continue working on this (although that is certainly welcome, once there is agreement on the behaviour), but so that we have something open with this discussion. If not, we should move the discussion to an issue.)

TomAugspurger · 2020-04-01T11:34:16Z

I'd be OK with two warnings from `pd.Series([])` if it meant getting things consistent.

SaturnFromTitan · 2020-04-20T08:36:10Z

Getting back to this, thanks for keeping it alive @jorisvandenbossche. I'll change the code to add the warning.

I'll take the same approach as in the dtype change of Series([]): First DeprecationWarning, later FutureWarning, then code change.
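The staged plan described above (DeprecationWarning, then FutureWarning, then the behaviour change) could be sketched as follows. Everything here is hypothetical and is not the PR's code: `default_index`, its `stage` parameter, and the warning text are invented for illustration, with `range` standing in for a RangeIndex and an empty list standing in for an empty object-dtype Index.

```python
import warnings

_MESSAGE = (
    "The default index for empty data will be an object-dtype Index "
    "in a future version. Pass an index explicitly to silence this warning."
)


def default_index(data, stage="deprecation"):
    """Pick a default index for `data` under one of three rollout stages.

    stage="deprecation": keep the old result, emit DeprecationWarning.
    stage="future":      keep the old result, emit FutureWarning.
    stage="changed":     return the new result, no warning.
    """
    if stage not in ("deprecation", "future", "changed"):
        raise ValueError(f"unknown stage: {stage!r}")
    if len(data) > 0:
        return range(len(data))  # non-empty data is unaffected
    if stage == "changed":
        return []  # stand-in for an empty object-dtype Index
    category = DeprecationWarning if stage == "deprecation" else FutureWarning
    # stacklevel=2 points the warning at the caller rather than this helper.
    warnings.warn(_MESSAGE, category, stacklevel=2)
    return range(0)  # stand-in for RangeIndex(start=0, stop=0, step=1)
```

DeprecationWarning is hidden from end users by Python's default filters in most contexts, so starting there mainly alerts library developers; the later FutureWarning is shown to everyone before the behaviour actually changes.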

SaturnFromTitan · 2020-04-20T09:24:12Z

I still have to fix a ton of DeprecationWarnings in the test suite. I'll take care of that in the coming days.
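A new warning usually touches tests in two ways: one test asserts that the warning fires, and unrelated tests silence it. A minimal standard-library sketch of both patterns follows; `build_empty` and the message text are hypothetical stand-ins, not pandas code.

```python
import warnings

MESSAGE = "The default index for empty data"  # hypothetical warning text


def build_empty():
    """Hypothetical stand-in for a constructor call that emits the deprecation."""
    warnings.warn(
        MESSAGE + " will change in a future version.",
        DeprecationWarning,
        stacklevel=2,
    )
    return []


def test_warning_is_raised():
    # Record every warning so the deprecation itself can be asserted.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        build_empty()
    assert any(issubclass(w.category, DeprecationWarning) for w in caught)


def test_unrelated_behaviour():
    # Silence only the targeted warning; any other warning still surfaces.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=MESSAGE, category=DeprecationWarning
        )
        assert build_empty() == []
```

Filtering by message prefix and category keeps the suppression narrow. pandas' own test suite provides `assert_produces_warning` in `pandas._testing` for the first pattern; the standard-library form is used here only to keep the sketch self-contained.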

pep8speaks · 2020-04-20T10:09:58Z

Hello @SaturnFromTitan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-04-22 08:02:50 UTC

SaturnFromTitan · 2020-04-22T21:33:46Z

Please have another look @jreback @TomAugspurger @jorisvandenbossche

WillAyd · 2020-07-29T20:48:27Z

@SaturnFromTitan looks like a lot of merge conflicts have piled up - can you resolve?

WillAyd · 2020-09-10T18:57:11Z

Closing as I think stale but ping @SaturnFromTitan if you'd like to pick back up and can address merge conflicts

SaturnFromTitan force-pushed the 16737-unexpected-results-creating-an-empty-series branch from 738da11 to 3704069 Compare February 17, 2020 18:56

SaturnFromTitan changed the title ~~16737 Consistent Index type on empty data~~ 16737 Consistent Index type for Series with empty data Feb 17, 2020

SaturnFromTitan changed the title ~~16737 Consistent Index type for Series with empty data~~ [#16737] Index type for Series with empty data Feb 17, 2020

SaturnFromTitan force-pushed the 16737-unexpected-results-creating-an-empty-series branch 3 times, most recently from 7535447 to 20b5ebb Compare February 17, 2020 21:07

found a first working solution

4c8d1ea

SaturnFromTitan force-pushed the 16737-unexpected-results-creating-an-empty-series branch from 20b5ebb to 4c8d1ea Compare February 18, 2020 21:59

SaturnFromTitan requested a review from jreback February 18, 2020 22:32

WillAyd requested changes Feb 19, 2020

pandas/tests/extension/base/missing.py

pandas/tests/series/test_api.py

pandas/core/series.py

SaturnFromTitan added 2 commits February 20, 2020 12:54

review comments

7a63ac9

fixing tests

799143c

refactorings

e4b880c

Merge branch 'master' into 16737-unexpected-results-creating-an-empty-series

3a84574

jreback requested changes Feb 23, 2020

pandas/tests/extension/decimal/test_decimal.py

pandas/tests/series/test_constructors.py

TomAugspurger reviewed Feb 25, 2020

pandas/tests/extension/base/missing.py

SaturnFromTitan closed this Mar 31, 2020

jorisvandenbossche reopened this Apr 1, 2020

jreback requested changes Apr 10, 2020

doc/source/whatsnew/v1.1.0.rst

pandas/core/indexes/base.py

jreback added the Deprecate (Functionality to remove in pandas), Dtype Conversions (Unexpected or buggy dtype conversions), and Index (Related to the Index class or subclasses) labels Apr 10, 2020

SaturnFromTitan added 4 commits April 20, 2020 10:43

Merge branch 'master' into 16737-unexpected-results-creating-an-empty-series

0528123

updated whatsnew

863916f

added DeprecationWarning instead of code change

6c6a7c8

undid the changed tests

b924865

SaturnFromTitan force-pushed the 16737-unexpected-results-creating-an-empty-series branch from d6167ff to b924865 Compare April 20, 2020 09:13

silenced warnings in pandas/tests/series

70708b3

SaturnFromTitan force-pushed the 16737-unexpected-results-creating-an-empty-series branch from b85bb42 to 70708b3 Compare April 20, 2020 10:15

SaturnFromTitan added 7 commits April 20, 2020 12:54

fixed linting

a1f9068

resolving broken tests

eaeb29d

fixed data check in create_series_with_explicit_index + renmae of helper

2a20a2f

fixed remaining broken test

c9f2bd1

silenced two warnings in the docs

2109012

Merge branch 'master' into 16737-unexpected-results-creating-an-empty-series

f4acbf9

silenced many warnings in test suite

96556cd

SaturnFromTitan force-pushed the 16737-unexpected-results-creating-an-empty-series branch from 92c15ae to 96556cd Compare April 22, 2020 10:13

SaturnFromTitan added 2 commits April 22, 2020 16:22

fixed tests

b1c08a6

silenced some more warnings

6d30bce

SaturnFromTitan requested review from TomAugspurger, jreback and WillAyd April 22, 2020 21:32

WillAyd closed this Sep 10, 2020
