Skip to content

Fix for #18178 and #18187 by changing the concat of empty RangeIndex #18191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 10, 2017

Conversation

SmokinCaterpillar
Copy link
Contributor

@SmokinCaterpillar SmokinCaterpillar commented Nov 9, 2017

The _concat_rangeindex_same_dtype function now keeps track of the last non-empty RangeIndex to extract the new stop value.

This fixes two issues with concatenating non-empty and empty DataFrames and Series.

Two regression tests were added as well.

Robert Meyer added 2 commits November 9, 2017 15:46
…of empty RangeIndex

The `_concat_rangeindex_same_dtype` now keeps track of the last non-empty RangeIndex to extract the new stop value.

This fixes two issues with concatenating non-empty and empty DataFrames and Series.

Two regression tests were added as well.
@SmokinCaterpillar
Copy link
Contributor Author

As an alternative solution, instead of using

if not len(obj):
    continue

and this new last_non_empty variable,
one could simply replace continue as follows:

if not len(obj):
    from pandas import Int64Index
    return _concat_index_same_dtype(indexes, klass=Int64Index)

However, in this case, the returned new index would always be changed to an integer one in case of a single empty RangeIndex in indexes. Accordingly, some already existing tests would need to be changed as well.

@jreback
Copy link
Contributor

jreback commented Nov 9, 2017

i am ok with the soln


for obj in indexes:
if not len(obj):
continue
# Remember the last non-empty index for the stop value
last_non_empty = obj
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe filter out he non empties first? then i think this loop should be ok?

@jreback jreback added Regression Functionality that used to work in a prior pandas version Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 9, 2017
@SmokinCaterpillar
Copy link
Contributor Author

I made the code a bit nicer by first filtering the empty indexes with a list comprehension

@codecov
Copy link

codecov bot commented Nov 10, 2017

Codecov Report

Merging #18191 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18191      +/-   ##
==========================================
+ Coverage    91.4%    91.4%   +<.01%     
==========================================
  Files         163      163              
  Lines       50064    50065       +1     
==========================================
+ Hits        45763    45764       +1     
  Misses       4301     4301
Flag Coverage Δ
#multiple 89.21% <100%> (ø) ⬆️
#single 40.35% <33.33%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/concat.py 99.14% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 17e0b13...3082d0e. Read the comment docs.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Side note: the behaviour in case of all empties to take the start/stop/step feels a bit strange to me (why not the first? or convert to empty Int64Index?), but given that this is how it already was before, we can leave that for another PR (if we would want to change that)

@jorisvandenbossche jorisvandenbossche added this to the 0.21.1 milestone Nov 10, 2017
@SmokinCaterpillar
Copy link
Contributor Author

@jorisvandenbossche yes, another pull request for these changes is probably a good idea, especially because IMO this involves changing existing RangeIndex tests.

@jreback jreback merged commit 6b3641b into pandas-dev:master Nov 10, 2017
@jreback
Copy link
Contributor

jreback commented Nov 10, 2017

thanks @SmokinCaterpillar

yeah I find this logic slightly strange. If you'd like to follow up with a PR to clarify would be great!

@SmokinCaterpillar
Copy link
Contributor Author

Opened a follow up PR #18214 :-)

No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Dec 8, 2017
TomAugspurger pushed a commit that referenced this pull request Dec 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Regression Functionality that used to work in a prior pandas version Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
4 participants