pandas.DataFrame.replace appears to replace multiple times #22834

Closed
user3483203 opened this issue Sep 25, 2018 · 3 comments
Labels
Duplicate Report Duplicate issue or pull request

user3483203 commented Sep 25, 2018

Here's an easy-to-reproduce example:

import pandas as pd

df = pd.DataFrame({'A': ['a', 'a', 'b']})
dct = {'a': 'b', 'b': 'hello'}

Problem description

When I use df.A.replace(dct), the replacement appears to be applied recursively, repeating until no more matches are found. This is the output I currently get:

>>> df.A.replace(dct)
0    hello
1    hello
2    hello
Name: A, dtype: object

As you can see, a appears to be replaced by b, which is then replaced by hello, instead of the single replacement I would expect. I looked through the documentation for replace and found nothing saying this is intended.
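To make the distinction concrete, here is a plain-Python sketch of the two behaviours. The helper names replace_sequentially and replace_single_pass are made up for illustration, and this is only an assumption about how output like the above could arise, not a description of how pandas implements replace.

```python
def replace_sequentially(values, mapping):
    """Apply each mapping entry to the whole list, one entry after another.

    Later entries can therefore match values produced by earlier entries.
    """
    result = list(values)
    for old, new in mapping.items():
        result = [new if v == old else v for v in result]
    return result


def replace_single_pass(values, mapping):
    """Look up each original value exactly once; unmatched values are kept."""
    return [mapping.get(v, v) for v in values]


values = ['a', 'a', 'b']
dct = {'a': 'b', 'b': 'hello'}

chained = replace_sequentially(values, dct)
single = replace_single_pass(values, dct)

print(chained)  # ['hello', 'hello', 'hello']
print(single)   # ['b', 'b', 'hello']
```

The first result matches what I am seeing; the second is what I expected.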

Expected Output

I would expect the output to be the same as df.A.map(dct):

0        b
1        b
2    hello
Name: A, dtype: object
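As a possible workaround, map can stand in for replace when a single pass is wanted. One difference to keep in mind is that map turns values missing from the dict into NaN, so the sketch below fills those back in from the original column. The extra 'c' row is hypothetical, added only to show that fallback.

```python
import pandas as pd

# 'c' is a hypothetical extra value with no entry in dct
df = pd.DataFrame({'A': ['a', 'a', 'b', 'c']})
dct = {'a': 'b', 'b': 'hello'}

# map looks up each original value once; values missing from dct become NaN
mapped = df.A.map(dct)

# restore the original value wherever there was no match
single_pass = mapped.fillna(df.A)

print(single_pass.tolist())  # ['b', 'b', 'hello', 'c']
```

Note that fillna would also overwrite any NaN that the dict maps to on purpose, so this only suits mappings without NaN targets.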

Output of pd.show_versions()


INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.4
pytest: 2.9.2
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.15.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 0.9.3
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.8
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


WillAyd (Member) commented Sep 26, 2018

Hmm, can you try this on master? I am not able to reproduce it there and got your expected output.

@WillAyd WillAyd added the Needs Info Clarification about behavior needed to assess issue label Sep 26, 2018
user3483203 (Author) commented

Can confirm it works as expected on master. I believe it may be the same issue as #20656, which I didn't notice before. If so, this issue should be closed.
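For anyone who wants to check their own installation, a small sketch along these lines compares replace against a single-pass lookup. The function name replace_matches_single_pass is hypothetical, and the result depends on the pandas version installed.

```python
import pandas as pd


def replace_matches_single_pass(series, mapping):
    """Return True if replace() yields the same values as one lookup per element."""
    expected = [mapping.get(v, v) for v in series]
    actual = series.replace(mapping).tolist()
    return actual == expected


s = pd.Series(['a', 'a', 'b'], name='A')
dct = {'a': 'b', 'b': 'hello'}

ok = replace_matches_single_pass(s, dct)
print(pd.__version__, 'single-pass replace:', ok)
```

A False result here would indicate the chained behaviour described in this issue.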

@WillAyd WillAyd added Duplicate Report Duplicate issue or pull request and removed Needs Info Clarification about behavior needed to assess issue labels Sep 26, 2018
WillAyd (Member) commented Sep 26, 2018

Thanks for cross-referencing. That does look to be the case, so closing as a duplicate.

@WillAyd WillAyd closed this as completed Sep 26, 2018