QST: is the new behavior of df.apply(my_func, axis=1) in v1.1.0 intended? #35483

manihamidi · 2020-07-30T22:10:51Z

I have searched the [pandas] tag on StackOverflow for similar questions.
I have asked my usage related question on StackOverflow.

Question about pandas

import pandas as pd
def test_func(row):
    row['c'] = str(row['a']) + str(row['b'])
    row['d'] = row['a'] + 1
    return row

df = pd.DataFrame({'a': [1,2,3], 'b': ['i','j', 'k']})
df.apply(test_func, axis=1)

The above code, run on pandas 1.1.0, returns:

   a  b   c  d
0  1  i  1i  2
1  1  i  1i  2
2  1  i  1i  2

While in pandas 1.0.5 it returns:

   a   b    c  d
0  1   i   1i  2
1  2   j   2j  3
2  3   k   3k  4

Using python 3.8.3 and IPython 7.16.1.

The Question:

❓ What is the right way of getting the v1.0.5 behavior in v1.1.0?

I did see this release note but honestly can't figure out if this is an intended/unintended side effect of it: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.1.0.html#apply-and-applymap-on-dataframe-evaluates-first-row-column-only-once

thanks


rhshadrach · 2020-07-31T01:51:38Z

In general, one should not mutate containers while iterating over them.

def test_func(row):
    row = row.copy()
    row['c'] = str(row['a']) + str(row['b'])
    row['d'] = row['a'] + 1
    return row

gives

   a  b   c  d
0  1  i  1i  2
1  2  j  2j  3
2  3  k  3k  4
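As an alternative sketch (not something proposed in this thread), the function can avoid touching `row` at all by returning only the new values as a fresh Series, which are then joined back onto the original frame. The helper name `new_cols` is hypothetical.

```python
import pandas as pd

def new_cols(row):
    # Build a brand-new Series; the row passed in by apply is only read.
    return pd.Series({
        'c': str(row['a']) + str(row['b']),
        'd': row['a'] + 1,
    })

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['i', 'j', 'k']})

# apply returns a frame with columns c and d; join aligns it on the index.
out = df.join(df.apply(new_cols, axis=1))
```

Because nothing is written into `row`, the result does not depend on whether pandas reuses the same object between rows.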

Of course, the vectorized version of this will be much faster:

%%timeit

df['c'] = df['a'].astype(str) + df['b']
df['d'] = df['a'] + 1

This takes 564 µs ± 5.97 µs per loop, whereas your version takes 5.34 ms ± 16.9 µs per loop.
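For reference, a self-contained sketch of the vectorized approach that can be run outside IPython. It uses `DataFrame.assign` so the original frame is left unchanged; that choice is an assumption for illustration, since the snippet above assigns the columns in place.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['i', 'j', 'k']})

# Column-wise operations: no per-row Python function is called.
out = df.assign(
    c=df['a'].astype(str) + df['b'],  # string concatenation per element
    d=df['a'] + 1,                    # integer arithmetic per element
)
```

`assign` returns a new DataFrame, so `df` still has only the columns `a` and `b` afterwards.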

simonjayhawkins · 2020-07-31T13:12:09Z

Thanks @manihamidi for the report. Same issue as #35462 so closing as duplicate.

manihamidi added the Needs Triage and Usage Question labels Jul 30, 2020

simonjayhawkins closed this as completed Jul 31, 2020

simonjayhawkins added the Duplicate Report label and removed the Needs Triage and Usage Question labels Jul 31, 2020

simonjayhawkins added this to the No action milestone Jul 31, 2020
