BUG: GH14233 resample().median() failed if duplicate column names wer… #15202

Closed
wants to merge 2 commits into from

Conversation

Dr-Irv
Contributor

@Dr-Irv Dr-Irv commented Jan 23, 2017

Simple fix for the median issue: `resample().median()` should use the cython implementation.

@@ -2203,7 +2203,6 @@ def agg_series(self, obj, func):
# cython aggregation

_cython_functions = copy.deepcopy(BaseGrouper._cython_functions)
_cython_functions['aggregate'].pop('median')
Contributor

Hmm, I remember something was failing because of this...

Contributor Author

I ran nosetests locally, and they pass. I looked back at old versions, and I think this line is a legacy of how the cython code was implemented over time.
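
For readers unfamiliar with the pattern in the line under review, here is a minimal sketch of what a class-level `copy.deepcopy` followed by `.pop('median')` does. The class bodies and table contents below are hypothetical stand-ins, not the real pandas definitions; only the `BaseGrouper` name, the `_cython_functions` attribute, and the two statements come from the diff above.

```python
import copy


class BaseGrouper:
    # Hypothetical dispatch table: aggregation name -> cython routine name.
    _cython_functions = {
        "aggregate": {"mean": "group_mean", "median": "group_median"},
    }


class BinLikeGrouper(BaseGrouper):  # hypothetical subclass name
    # Deep copy so that edits here do not leak into the base class table.
    _cython_functions = copy.deepcopy(BaseGrouper._cython_functions)
    # The line this PR removes: with it, the subclass has no cython median
    # entry and has to use some other code path for median.
    _cython_functions["aggregate"].pop("median")


base_has_median = "median" in BaseGrouper._cython_functions["aggregate"]
sub_has_median = "median" in BinLikeGrouper._cython_functions["aggregate"]
print(base_has_median, sub_has_median)
```

Under these assumptions, dropping the `.pop('median')` line leaves the subclass with the same median entry as the base class, which matches the stated intent of using the cython implementation.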

def test_median_duplicate_columns(self):
# GH 14233

df = pd.DataFrame(np.array([[i + j for i in range(20)]
Contributor

Can you use the repro from the original issue?

Contributor Author

The only difference is that the original repro uses np.random.randn(), whereas I used a fixed sequence, which makes it easier to debug. I can switch to randn() if you want.
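
A check along the lines discussed in this thread might look like the sketch below. It is not the test from the PR (that one is truncated above); the index, frequency, column labels, and values are assumptions chosen so the result is easy to verify by hand. The idea is to compare the median on a frame with duplicate column names against the same data with unique names.

```python
import numpy as np
import pandas as pd

# 20 daily rows, 3 columns, with a deliberately duplicated column name.
index = pd.date_range("2000-01-01", periods=20, freq="D")
values = np.arange(60, dtype="float64").reshape(20, 3)
df = pd.DataFrame(values, index=index, columns=["a", "b", "a"])

# Before the fix this call is reported to raise KeyError (GH 14233).
result = df.resample("5D").median()

# Reference: same data with unique labels, then restore the duplicate labels.
expected = df.copy()
expected.columns = ["x", "y", "z"]
expected = expected.resample("5D").median()
expected.columns = df.columns

pd.testing.assert_frame_equal(result, expected)
```

Using a deterministic sequence keeps the expected medians obvious (the first bin of the first column holds 0, 3, 6, 9, 12, so its median is 6), while random input, as in the original issue, exercises the same code path.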

@jreback jreback added Bug Resample resample method labels Jan 23, 2017
@jreback jreback added this to the 0.20.0 milestone Jan 24, 2017
@jreback
Contributor

jreback commented Jan 24, 2017

lgtm. ping on green.

@jreback jreback closed this in 84bc3b2 Jan 24, 2017
@jreback
Contributor

jreback commented Jan 24, 2017

thanks!

@Dr-Irv Dr-Irv deleted the Issue14233 branch February 28, 2017 15:43
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
Simple fix for median issue. Should use cython implementation.

closes pandas-dev#14233

Author: Dr-Irv <[email protected]>

Closes pandas-dev#15202 from Dr-Irv/Issue14233 and squashes the following commits:

6e0d900 [Dr-Irv] Use randn in test
1a3b4aa [Dr-Irv] BUG: GH14233 resample().median() failed if duplicate column names were present
Development

Successfully merging this pull request may close these issues.

BUG: KeyError from resample().median() with duplicate column names
2 participants