-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
[#16737] Index type for Series with empty data #32053
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#16737] Index type for Series with empty data #32053
Conversation
738da11
to
3704069
Compare
7535447
to
20b5ebb
Compare
20b5ebb
to
4c8d1ea
Compare
Can you please take another look @WillAyd. I think I addressed all your comments |
@SaturnFromTitan maybe missed the comment but did we actually decide that this is the right approach? I think it's actually confusing for the index types to differ: Here is current behaviour on master # series
>>> pd.Series().index
Index([], dtype='object')
>>> pd.Series([]).index
RangeIndex(start=0, stop=0, step=1)
>>> pd.Series([1]).index
RangeIndex(start=0, stop=1, step=1)
# frame
>>> pd.DataFrame().index
Index([], dtype='object')
>>> pd.DataFrame([[]]).index
RangeIndex(start=0, stop=1, step=1)
>>> pd.DataFrame([[1]]).index
RangeIndex(start=0, stop=1, step=1) Why would we only want to change the second Series case shown above? |
I see your point though @WillAyd and agree it could lead to more confusion merging the current changes. Do you think we should...
Not sure about all the details, but option 2 seems to be the most consistent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is actually a non-trivial change, but I am not sure we can easily deprecate this.
@SaturnFromTitan also you should add the expansion test as indicated in the OP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think several of your changes to the tests are correct. If we're still getting RangeIndexes there, then we'll need to update the code rather than the tests.
Before I continue to work on this, can we please clarify if this should be changed in the first place or not? And if it should change, then is my current implementation is the way to go or should I rather adjust the index of I'd also be fine with closing this issue+PR as a "Won't change". It seems like any solution could yield something that's surprising to some users and expected to others. |
forgot the pings: @jreback @TomAugspurger @WillAyd |
IMO, the default index for |
I agree with @TomAugspurger that the default index for We already do a deprecation warning for (@SaturnFromTitan I reopened the PR not to mean that you should continue working on this (although it is certainly welcome of course, once there is agreement on the behaviour), but that we should have something open with this discussion. But so if not, we should move the discussion to an issue) |
I'd be OK with two warnings from `pd.Series([])` if it meant getting things
consistent.
…On Wed, Apr 1, 2020 at 2:37 AM Joris Van den Bossche < ***@***.***> wrote:
I agree with @TomAugspurger <https://github.com/TomAugspurger> that the
default index for pd.Series([]) should match pd.Index([]), an empty
object-dtype index.
We already do a deprecation warning for pd.Series([]) becoming object
dtype in the future, I suppose we can add a second warning for the Index
also becoming an object index in the future? Or would that be too annoying?
***@***.*** <https://github.com/SaturnFromTitan> I reopened the PR
not to mean that you should continue working on this (although it is
certainly welcome of course, once there is agreement on the behaviour), but
that we should have something open with this discussion. But so if not, we
should move the discussion to an issue)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#32053 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAKAOIRTEHNF6IVLY25LYLTRKLVKHANCNFSM4KWQLKGA>
.
|
Getting back at this, thanks for keeping it alive @jorisvandenbossche. I'll change the code to add the warning. I'll take the same approach as in the dtype change of |
d6167ff
to
b924865
Compare
I still have to fix a ton of DeprecationWarnings in the test suite. I'll take care of that in the coming days. |
Hello @SaturnFromTitan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-04-22 08:02:50 UTC |
b85bb42
to
70708b3
Compare
92c15ae
to
96556cd
Compare
Please have another look @jreback @TomAugspurger @jorisvandenbossche |
@SaturnFromTitan looks like a lot of merge conflicts have piled up - can you resolve? |
Closing as I think stale but ping @SaturnFromTitan if you'd like to pick back up and can address merge conflicts |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
I picked up all the notes from #16737 where it was suggested to use
Index
overRangeIndex
for empty data.