Surprising behavior of DataFrame.replace #3582

cpcloud · 2013-05-11T22:07:43Z

This seems a bit surprising:

>>> from string import ascii_lowercase as letters
>>> from pandas import DataFrame
>>> df = DataFrame(list(letters[:4]), columns=['a'])
>>> df
   a
0  a
1  b
2  c
3  d
>>> df.replace({'a': 'b'})
   a
0  a
1  a
2  c
3  d
>>> df.replace({'a': 'c'})
   a
0  a
1  b
2  b
3  d

Does this have to do with padding?

jreback · 2013-05-11T22:09:45Z

could be a bug

cpcloud · 2013-05-11T22:10:28Z

also no doc for the first parameter of this method, will submit a pr for that...

jreback · 2013-05-11T22:17:57Z

also if u have time:

I think we need clarification in the main docs on when to use filter, replace, select, update, and lookup

I also don't think there is an example of filter anywhere

cpcloud · 2013-05-11T22:19:41Z

np, not sure if it will be done today. I'm trying to work out the regex replace on frames and just wanted to note this strangeness, which I hit while looking at one of the simplest possible cases of replacement in an object block

cpcloud · 2013-05-11T23:42:19Z

@jreback A quick glance at indexing.rst shows docs for select and lookup (they could be fleshed out a bit), but nothing for the other 3. Will add the 3 and flesh the other 2 out.

jreback · 2013-05-12T00:25:34Z

@cpcloud I think replace is already covered somewhere in the docs, with a complete example

cpcloud · 2013-05-12T00:40:27Z

ah yes, grin '\.replace' doc shows it

jankatins · 2013-05-12T11:52:28Z

Wow, this was worrying, but fortunately df.a.replace({'a': 'b'}) and df.replace({"a":{"a":"b"}}) works as intended...
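
For anyone landing here later, the two unambiguous spellings mentioned above can be checked with a short script. This is only a sketch and assumes a recent pandas release; the variable names `by_series` and `by_nested` are illustrative and not from the thread.

```python
import pandas as pd

# Same frame as in the original report: one column named 'a'
# holding the values 'a', 'b', 'c', 'd'.
df = pd.DataFrame(list("abcd"), columns=["a"])

# Series-level call: the dict maps old values to new values.
by_series = df["a"].replace({"a": "b"})

# Frame-level nested dict: {column name: {old value: new value}}.
by_nested = df.replace({"a": {"a": "b"}})

print(by_series.tolist())
print(by_nested["a"].tolist())
```

Both calls name the value to replace explicitly, so neither depends on how a flat dict is interpreted at the frame level.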

cpcloud · 2013-05-12T12:29:22Z

@jreback seems the magic is already in _interpolate, I will leave it out in replace though.

cpcloud · 2013-05-12T19:40:40Z

@jreback The "bug" here is that when you call

df.replace({'a': 'b'})

in column 'a', the value 'b' is replaced with the previous non-null value, because the dict is read as {column: value to replace} and the fill method default is pad. This seems a bit weird. Can fix if u want. I would change it to "search the whole frame for occurrences of 'a' and replace them with 'b'".
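
To make the pad explanation concrete, here is a pure-Python sketch of what "replace with the previous value" does to a single column. It is not the pandas implementation; `pad_replace` and `_NOTHING_ABOVE` are hypothetical names used only for illustration.

```python
# Sentinel meaning "no earlier value to pad from yet" (hypothetical helper).
_NOTHING_ABOVE = object()


def pad_replace(column, to_replace):
    """Replace every occurrence of to_replace with the last kept value above it."""
    result = []
    previous = _NOTHING_ABOVE
    for value in column:
        if value == to_replace:
            # Pad from the value above; if there is none, this sketch
            # leaves the entry unchanged.
            result.append(value if previous is _NOTHING_ABOVE else previous)
        else:
            result.append(value)
            previous = value
    return result


column_a = ["a", "b", "c", "d"]
print(pad_replace(column_a, "b"))  # ['a', 'a', 'c', 'd']
print(pad_replace(column_a, "c"))  # ['a', 'b', 'b', 'd']
```

The two printed lists match the frames shown in the original report for df.replace({'a': 'b'}) and df.replace({'a': 'c'}).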

jreback · 2013-05-12T19:43:44Z

you want this equivalent to

df.replace('a','b')

?

cpcloud · 2013-05-12T19:44:04Z

yes, is that ok?

jreback · 2013-05-12T19:46:05Z

let me take a look a little later

was always fuzzy on this anyhow

are there tests for this usage anyhow ?
do they break?

cpcloud · 2013-05-12T19:56:46Z

There are tests. However, the replace method doesn't actually do anything so they don't break. Here's a pared down example from the test_replace_interpolate method of TestDataFrame from pandas/tests/test_frame.py:

# get the test frame
from pandas.tests.test_frame import _tsframe as df
from numpy import nan
from numpy.testing import assert_array_equal
res = df.replace({'A': nan}, method='pad', axis=1)
assert_array_equal(df, res)  # replace == identity function here

jreback · 2013-05-13T03:15:40Z

I think it makes sense to not have automatic interpolation by default

so, say, df.replace({'a': 'b'}) would be translated to df.replace('a', 'b'). IOW, only allow None for the value if to_replace is a dict/Series (right now this will interpolate)

then make _interpolate an explicit method interpolate. Even though this is an API change, I doubt this is expected behavior

@wesm, @y-p any thoughts?

@cpcloud why don't you put up the PR when you are ready (as this would be an easy change anyhow)

cpcloud · 2013-05-13T03:19:02Z

okay. kicking tires a bit right now (added 12 new tests so far); adding the regex functionality is proving to be quite a bit more involved than i thought, but i have a much better understanding (although not complete) of DataFrame internals.

cpcloud · 2013-05-14T00:45:33Z

@jreback @y-p Can I address this issue in the regex pr? I think it will create fewer merge conflicts if I just do this all at once.

jreback · 2013-05-14T00:52:59Z

yes, I would

cpcloud · 2013-05-14T01:49:37Z

just realized the axis parameter in replace never gets used if the call to _interpolate is nixed...remove the parameter?

jreback · 2013-05-17T19:26:26Z

@cpcloud close this too?

cpcloud · 2013-05-17T19:59:46Z

Not just yet. Need to submit 0.12 pr.

cpcloud mentioned this issue May 13, 2013

ENH: add regex functionality to DataFrame.replace #3584

Merged

This was referenced May 21, 2013

API: deprecate DataFrame.interpolate #3675

Merged

API: DataFrame.interpolate will be deprecated in v0.13 #3676

Closed

jreback closed this as completed in #3675 May 22, 2013
