BUG: series replace should allow a single list #4748

Merged
merged 1 commit into from
Sep 6, 2013

cpcloud
Member

@cpcloud cpcloud commented Sep 4, 2013

Also reverts the method parameter of NDFrame.replace() deprecation

closes #4743

@ghost ghost assigned cpcloud Sep 4, 2013
_fill_methods = {'pad': com.pad_1d, 'backfill': com.backfill_1d}


def _get_fill_func(method):
Contributor

I feel this exists somewhere already in common?

Member Author

hmm. i can't find it after grepping for method and backfill, i don't see anything similar...this was in the 0.12 release but not sure when it was removed...should i move to common.py?

Contributor

i would put it in common ; i will look for this too
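
For readers following this thread, here is a rough sketch of the kind of lookup a helper like `_get_fill_func` could perform. It is not the code from this PR: `pad_1d` and `backfill_1d` below are hypothetical pure-Python stand-ins for the `com.pad_1d` and `com.backfill_1d` routines named in the diff, and the `is_missing` callback is an assumption made so the example runs on plain lists.

```python
def pad_1d(values, is_missing):
    """Forward-fill: replace each missing entry with the last valid one.

    Hypothetical stand-in for com.pad_1d, written for plain Python lists.
    Missing entries before the first valid value are left unchanged.
    """
    out = list(values)
    have_last = False
    last = None
    for i, value in enumerate(out):
        if is_missing(value):
            if have_last:
                out[i] = last
        else:
            last = value
            have_last = True
    return out


def backfill_1d(values, is_missing):
    """Backward-fill, implemented as a forward-fill over the reversed list."""
    return pad_1d(list(values)[::-1], is_missing)[::-1]


# Same two keys as the mapping shown in the diff above.
_fill_methods = {'pad': pad_1d, 'backfill': backfill_1d}


def get_fill_func(method):
    """Look up the fill routine for a method name, rejecting unknown names."""
    try:
        return _fill_methods[method]
    except KeyError:
        raise ValueError("unknown fill method: %r" % (method,))


# Treat 1, 2 and 3 as "missing" and pad them with the previous valid value.
padded = get_fill_func('pad')([0, 1, 2, 3, 4], lambda v: v in (1, 2, 3))
backfilled = get_fill_func('backfill')([0, 1, 2, 3, 4], lambda v: v in (1, 2, 3))
```

Keeping the mapping in one place means an unsupported `method` value fails early with a clear message rather than deep inside the fill routine.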

@cpcloud
Member Author

cpcloud commented Sep 5, 2013

@jreback this ok 2 merge? or did u find a _get_fill_func-type function elsewhere?

@jreback
Contributor

jreback commented Sep 5, 2013

I don't think this will work if you pass odd dtypes, e.g. 'a' to an int series or something; maybe add some tests that change dtype on purpose

@jreback
Contributor

jreback commented Sep 6, 2013

fyi...may want to move some of your tests to test_generic.py

@cpcloud
Member Author

cpcloud commented Sep 6, 2013

is it ok if i move in a separate pr?

@cpcloud
Member Author

cpcloud commented Sep 6, 2013

re strange dtypes: what behavior do u want? if you pass a replacement for a value that doesn't exist, then the original series is returned. e.g.,

In [16]: s = Series(arange(5.))

In [17]: s.replace({'a': 1, 'b': 1})
Out[17]:
0    0
1    1
2    2
3    3
4    4
dtype: float64

In [18]: s.replace(['a','b'])
Out[18]:
0    0
1    1
2    2
3    3
4    4
dtype: float64

but mixing things works

In [19]: s.replace(['a',1])
Out[19]:
0    0
1    0
2    2
3    3
4    4
dtype: float64

@jreback
Contributor

jreback commented Sep 6, 2013

sep PR for tests ok....

what I mean is, just some confirming tests....

In [1]: s = Series(np.arange(5))

In [2]: s.replace?

In [3]: s.replace([3],[3.])
Out[3]: 
0    0
1    1
2    2
3    3
4    4
dtype: int64

In [4]: s.replace([3],[3.5])
Out[4]: 
0    0.0
1    1.0
2    2.0
3    3.5
4    4.0
dtype: float64

In [5]: s.replace([3,4],[3.5,'a'])
Out[5]: 
0      0
1      1
2      2
3    3.5
4      a
dtype: object

In [6]: s.replace([3,4],[3.5,Timestamp('20130101')])
Out[6]: 
0                      0
1                      1
2                      2
3                    3.5
4    2013-01-01 00:00:00
dtype: object

In [7]: s.replace([3,4],[3.5,True])
Out[7]: 
0    0.0
1    1.0
2    2.0
3    3.5
4    1.0
dtype: float64

note the last one; I am not sure we can do much about that (well we can but only if you use Block.setitem, where I attempt to see if you are using a bool and do the right thing)....but too much work really
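
One way to see why the bool ends up as `1.0` is to ask NumPy which single dtype can hold every kind of value involved. This is only an illustration using NumPy's promotion rules; it is an assumption that these rules mirror what happens inside `replace`, and the variable names are made up for the example.

```python
import numpy as np

# Hypothetical illustration, not pandas internals: find the common dtype
# for an int64 series that receives a float replacement value.
int_then_float = np.result_type(np.int64, np.float64)

# Adding a bool to the mix does not change the answer; the common
# dtype is still float64, so True has to be stored as a number.
with_bool = np.result_type(np.int64, np.float64, np.bool_)

# Building the result array directly shows the same coercion.
values = np.array([0, 1, 2, 3.5, True])
```

Here `values` has dtype `float64` and its last element is `1.0`, which matches `Out[7]` above. Keeping `True` as a bool would need an object array instead.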

@cpcloud
Member Author

cpcloud commented Sep 6, 2013

ah ok

@cpcloud
Member Author

cpcloud commented Sep 6, 2013

@jreback this ok?

@jreback
Contributor

jreback commented Sep 6, 2013

I guess I don't understand how this works

ser = Series([0, 1, 2, 3, 4])
result = ser.replace([1,2,3])
assert_series_equal(result, Series([0,0,0,0,4]))

so the 1,2,3 are all replaced by 0? (shouldn't it be nan)?

@cpcloud
Member Author

cpcloud commented Sep 6, 2013

It's treating 1, 2, and 3 as "missing" and replacing them by the nearest valid value (this is what the pad_1d function does)

@jreback
Contributor

jreback commented Sep 6, 2013

ic....so tantamount to s[[1,2,3]] = np.nan; s.ffill()

@jreback
Contributor

jreback commented Sep 6, 2013

ok then

@cpcloud
Member Author

cpcloud commented Sep 6, 2013

Yes. In fact, that's what I was doing, but then dtype is not preserved (in cases where it can be)
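
The "set to NaN, then ffill" equivalence and the dtype concern can be sketched with public pandas calls only. This is not the implementation merged here; `replace_by_padding` is a hypothetical helper name, and casting back to the original dtype whenever no NaN remains is an assumed policy for the sake of the example.

```python
import pandas as pd


def replace_by_padding(ser, to_replace):
    """Treat the listed values as missing and forward-fill over them.

    Hypothetical helper: a sketch of the semantics discussed above,
    not the code path pandas uses.
    """
    # Masking introduces NaN, which upcasts an integer series to float64.
    masked = ser.mask(ser.isin(to_replace))
    filled = masked.ffill()
    # If a leading value was masked there is nothing to pad from, so NaN
    # remains and the upcast result has to be kept.
    if filled.isna().any():
        return filled
    # Otherwise restore the original dtype (the assumed policy here).
    return filled.astype(ser.dtype)


ser = pd.Series([0, 1, 2, 3, 4])
result = replace_by_padding(ser, [1, 2, 3])
```

Without the final cast, `result` would come back as float64 even though every value fits in the original integer dtype.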

@jreback
Contributor

jreback commented Sep 6, 2013

cool...go ahead then

This requires adding the ``method`` keyword parameter back in.
cpcloud added a commit that referenced this pull request Sep 6, 2013
BUG: series replace should allow a single list
@cpcloud cpcloud merged commit 84ca068 into pandas-dev:master Sep 6, 2013
@cpcloud cpcloud deleted the series-replace-with-list branch September 6, 2013 23:40
@jreback
Contributor

jreback commented Sep 8, 2013

in test_series.py

       def test_replace_mixed_types(self):
(Pdb) l
4697            s = Series(np.arange(5))
4698 

This fails on 32-bit because s is int32 (and expected is int64)
use s = Series(np.arange(5),dtype='int64') here.....

as the numpy arrays pass thru the dtype :)

I am just going to fix it in a PR I am doing. FYI
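
A minimal sketch of the suggested fix, using `pd.testing.assert_series_equal` as the comparison; the variable names are illustrative and not taken from test_series.py.

```python
import numpy as np
import pandas as pd

# np.arange(5) uses the platform's default integer, so the resulting
# Series dtype can differ between platforms.
platform_default = pd.Series(np.arange(5))

# Pinning the dtype makes the test input identical everywhere.
pinned = pd.Series(np.arange(5), dtype='int64')
expected = pd.Series([0, 1, 2, 3, 4], dtype='int64')

# Raises AssertionError if values or dtype differ.
pd.testing.assert_series_equal(pinned, expected)
```

Comparing `platform_default` against `expected` instead would only pass where the default integer happens to be 64-bit.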

@cpcloud
Member Author

cpcloud commented Sep 8, 2013

ah ok ... cool thanks

Development

Successfully merging this pull request may close these issues.

Bug in Series.replace
2 participants