Docs: Does pandas .apply() twice on first row, or not? #28827

Closed
eric-downes opened this issue Oct 7, 2019 · 4 comments · Fixed by #28854
Labels
Apply (Apply, Aggregate, Transform, Map), Docs, good first issue

Comments

@eric-downes

eric-downes commented Oct 7, 2019

Code Sample

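import pandas as pd

df = pd.DataFrame({'a': list('abcdef')})
L = []  # records every value fcn is called with

def fcn(x):
    L.append(x)

# Element-wise apply on a single column (Series.apply)
df.a.apply(fcn)
print(L)
# ['a', 'b', 'c', 'd', 'e', 'f']

# Row-wise apply on the whole frame (DataFrame.apply with axis=1)
L = []
df.apply(lambda x: fcn(x.a), axis=1)
print(L)
# ['a', 'b', 'c', 'd', 'e', 'f']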

Problem description

According to the current pd.DataFrame.apply() docs: "In the current implementation apply calls func twice on the first column/row to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if func has side-effects, as they will take effect twice for the first column/row."

On pandas 0.24.2, however, the code sample above records each value only once, so the documented double call on the first row does not show up here.

Expected Output

['a', 'a', 'b', 'c', 'd', 'e', 'f']
['a', 'a', 'b', 'c', 'd', 'e', 'f']
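
Which of the two outputs appears may depend on the installed pandas version, so a version-independent check can be useful. The following is only a minimal sketch: `record_calls` and `first_row_repeated` are hypothetical names introduced for illustration, and the check assumes that a double call, if it happens, adds exactly one extra entry for the first row.

```python
import pandas as pd


def record_calls(df, column):
    """Apply a side-effecting function row-wise and return the values it saw, in call order."""
    seen = []

    def fcn(row):
        # Side effect: remember which row value the function was called with.
        seen.append(row[column])

    df.apply(fcn, axis=1)
    return seen


df = pd.DataFrame({"a": list("abcdef")})
seen = record_calls(df, "a")

# More recorded calls than rows means some row was evaluated more than once.
first_row_repeated = len(seen) > len(df)
print(pd.__version__, seen, "first row evaluated twice:", first_row_repeated)
```

Using a local list inside a closure avoids having to reset a module-level list between runs, which makes the check easier to repeat across pandas versions.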

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: None
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.4
scipy: 1.3.0
pyarrow: 0.11.1
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.0
bs4: None
html5lib: None
sqlalchemy: 1.2.5
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@mroeschke
Member

The implementation was changed recently, but it looks like the docs were not updated. Interested in submitting a PR?

@mroeschke mroeschke added the Apply (Apply, Aggregate, Transform, Map), Docs, and good first issue labels Oct 7, 2019
@Aakash1822

Aakash1822 commented Oct 7, 2019

Yes, I am interested in updating the docs for pd.DataFrame.apply() @mroeschke

@Aakash1822

Please tell me which folder contains the file I need to change for this issue.

@mroeschke
Member

3 participants