bug in replace? #7376

Closed
jseabold opened this issue Jun 6, 2014 · 8 comments · Fixed by #7379

@jseabold
Contributor

jseabold commented Jun 6, 2014

I think this is distinct from #5541, though I agree that an API change here might be helpful. AFAICT, I am using replace as 'expected', but not all of the keys are being matched. Any idea what's going on?

import pandas as pd

d = {u'fname' :
    {'out_augmented_AUG_2011.json' : pd.Period(year=2011, month=8, freq='M'), 
    'out_augmented_JAN_2011.json' : pd.Period(year=2011, month=1, freq='M'),
    'out_augmented_MAY_2012.json' : pd.Period(year=2012, month=5, freq='M'),
    'out_augmented_SUBSIDY_WEEK.json' : pd.Period(year=2011, month=4, freq='M'),
    'out_augmented_AUG_2012.json' : pd.Period(year=2012, month=8, freq='M'),
    'out_augmented_MAY_2011.json' : pd.Period(year=2011, month=5, freq='M'),
    'out_augmented_SEP_2013.json' : pd.Period(year=2013, month=9, freq='M'),
    }}

df = pd.DataFrame(['out_augmented_AUG_2012.json',
                'out_augmented_SEP_2013.json',
                'out_augmented_SUBSIDY_WEEK.json',
                'out_augmented_MAY_2012.json',
                'out_augmented_MAY_2011.json',
                'out_augmented_AUG_2011.json',
                'out_augmented_JAN_2011.json'], columns=['fname'])

df.replace(d)

On

In [31]: pd.version.version
Out[31]: '0.13.1-753-g4614ac8'
@cpcloud cpcloud self-assigned this Jun 6, 2014
@cpcloud cpcloud added this to the 0.14.1 milestone Jun 6, 2014
@jreback jreback added the Bug label Jun 6, 2014
@cpcloud
Member

cpcloud commented Jun 6, 2014

this is a dtype issue ... working on it

@cpcloud
Member

cpcloud commented Jun 6, 2014

somewhat subtle issue here:

this is happening because Period raises on period == not_a_period rather than returning NotImplemented (as it should). When both operands return NotImplemented, Python itself falls back to essentially id(x) == id(y), which returns False here. NumPy returns False if any error is thrown during the comparison instead of re-raising it
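
A minimal pure-Python sketch of the difference described above. The classes `StrictPeriod` and `PolitePeriod` are hypothetical stand-ins, not pandas' `Period`; they only illustrate what happens when `__eq__` raises versus when it returns `NotImplemented`.

```python
class StrictPeriod:
    """Hypothetical stand-in: raises when compared to a foreign type."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __eq__(self, other):
        if not isinstance(other, StrictPeriod):
            raise TypeError("cannot compare non-period object")
        return self.ordinal == other.ordinal


class PolitePeriod:
    """Hypothetical stand-in: defers to Python by returning NotImplemented."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __eq__(self, other):
        if not isinstance(other, PolitePeriod):
            # Let Python try the reflected comparison, then fall back
            # to an identity check.
            return NotImplemented
        return self.ordinal == other.ordinal


def describe_eq(a, b):
    """Return the result of a == b, or a marker string if it raises."""
    try:
        return a == b
    except TypeError:
        return "TypeError"


strict_result = describe_eq(StrictPeriod(1), "not_a_period")
polite_result = describe_eq(PolitePeriod(1), "not_a_period")
# The reflected case also raises: str.__eq__ returns NotImplemented,
# so Python calls StrictPeriod.__eq__, which raises.
reflected_result = describe_eq("not_a_period", StrictPeriod(1))
```

With the deferring version, the comparison against a string quietly evaluates to False, which is what container and array code generally expects.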

@cpcloud
Member

cpcloud commented Jun 6, 2014

so what happens then is that in com.mask_missing we get a scalar bool False instead of the vectorized __eq__ result when we do np.array(['the_string']) == np.array([Period(...), 'a', 'b', 'c']). So we keep getting False for every mask comparison, and thus the only replaced value is the first
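
The failure mode can be simulated without NumPy or pandas. This is only a sketch: `StrictPeriod`, `collapsing_eq`, and `replace_all` are hypothetical names, and `collapsing_eq` merely imitates the behavior described in this thread (any error during the comparison collapses the whole mask to a scalar False). It is not the actual `com.mask_missing` code.

```python
class StrictPeriod:
    """Hypothetical stand-in that raises on comparison to a foreign type."""

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        if not isinstance(other, StrictPeriod):
            raise TypeError("cannot compare non-period object")
        return self.label == other.label

    def __repr__(self):
        return f"StrictPeriod({self.label!r})"


def collapsing_eq(values, target):
    """Elementwise comparison that degrades to scalar False on any error."""
    try:
        return [v == target for v in values]
    except TypeError:
        return False


def replace_all(values, mapping):
    """Apply each old -> new replacement in turn, using a boolean mask."""
    out = list(values)
    for old, new in mapping.items():
        mask = collapsing_eq(out, old)
        if mask is False:
            # Scalar False: nothing is selected, so nothing is replaced.
            continue
        out = [new if hit else v for v, hit in zip(out, mask)]
    return out


mapping = {
    "aug.json": StrictPeriod("2012-08"),
    "sep.json": StrictPeriod("2013-09"),
    "may.json": StrictPeriod("2012-05"),
}
result = replace_all(["aug.json", "sep.json", "may.json"], mapping)
```

After the first key is replaced, the values contain a `StrictPeriod`, so every later comparison against a string key raises and the mask collapses. Only the first replacement takes effect.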

@cpcloud
Member

cpcloud commented Jun 6, 2014

the problem is that making it conform to proper Python semantics is a breaking API change, which I know @jseabold is very fond of. E.g., if you were catching a TypeError before, now you'll get a boolean every time no matter what (unless the other object's methods are overridden to do something nefarious like raise)
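
For callers worried about the change, one possible approach is a small wrapper that behaves the same under both semantics. This is a sketch under assumptions: `RaisingPeriod` and `DeferringPeriod` are hypothetical stand-ins for the old and new behavior, and `equal_or_false` is an illustrative helper, not a pandas function.

```python
class RaisingPeriod:
    """Hypothetical old behavior: TypeError on foreign comparisons."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __eq__(self, other):
        if not isinstance(other, RaisingPeriod):
            raise TypeError("cannot compare non-period object")
        return self.ordinal == other.ordinal


class DeferringPeriod:
    """Hypothetical new behavior: NotImplemented on foreign comparisons."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __eq__(self, other):
        if not isinstance(other, DeferringPeriod):
            return NotImplemented
        return self.ordinal == other.ordinal


def equal_or_false(a, b):
    """Compare a and b, treating an incomparable pair as not equal."""
    try:
        return bool(a == b)
    except TypeError:
        return False
```

Code that previously relied on catching the TypeError to detect a non-period operand would need an explicit `isinstance` check instead, since the boolean result alone no longer distinguishes the two cases.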

@cpcloud
Member

cpcloud commented Jun 6, 2014

there was only a single test for this behavior, and it checked that a TypeError was raised on comparison to non-periods

@cpcloud
Member

cpcloud commented Jun 6, 2014

@jseabold can you give this a spin when you get a chance? thx

@jseabold
Contributor Author

jseabold commented Jun 6, 2014

Link?

@cpcloud
Member

cpcloud commented Jun 6, 2014

oh duh i haven't put up the PR yet 😑
