Groupby/apply behaves differently when grouping column contains tuples #19588

Closed
jhrmnn opened this issue Feb 8, 2018 · 1 comment

jhrmnn commented Feb 8, 2018

Code Sample, a copy-pastable example if possible

This is adapted from the docs, just replacing column 'a' with a list of tuples:

import pandas as pd

df = pd.DataFrame({
    'a': [(0,), (0,), (0,), (0,), (1,), (1,), (1,), (1,), (2,), (2,), (2,), (2,)],
    'b': [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    'c': [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    'd': [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1],
})

def compute_metrics(x):
    result = {'b_sum': x['b'].sum(), 'c_mean': x['c'].mean()}
    return pd.Series(result, name='metrics')

df.groupby('a').apply(compute_metrics)

Problem description

With the original integer column 'a' from the docs, the return value is a DataFrame in which the Series objects returned by the applied function are concatenated, one row per group. With the tuple column, the return value is a Series whose elements are the individual Series objects.
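To compare the two cases side by side, the sketch below runs the same function over an integer key column and over a tuple key column. It is only an illustration: the names `int_keys`, `tuple_keys`, `values`, `res_int` and `res_tup` are made up here, column 'd' is left out because the function never reads it, and the value columns are selected explicitly so the example does not depend on how a given pandas version treats the grouping column inside `apply`. The container type returned for the tuple case may differ between pandas versions, so it is printed rather than assumed.

```python
import pandas as pd

def compute_metrics(x):
    # Same aggregation as in the report: one Series per group.
    result = {'b_sum': x['b'].sum(), 'c_mean': x['c'].mean()}
    return pd.Series(result, name='metrics')

int_keys = [0] * 4 + [1] * 4 + [2] * 4
tuple_keys = [(k,) for k in int_keys]  # 1-tuples, as in the report
values = {
    'b': [0, 0, 1, 1] * 3,
    'c': [1, 0] * 6,
}

df_int = pd.DataFrame({'a': int_keys, **values})
df_tup = pd.DataFrame({'a': tuple_keys, **values})

# Identical call on both frames; only the type of the grouping keys differs.
res_int = df_int.groupby('a')[['b', 'c']].apply(compute_metrics)
res_tup = df_tup.groupby('a')[['b', 'c']].apply(compute_metrics)

print(type(res_int).__name__)
print(type(res_tup).__name__)
```

If the two printed type names differ, the installed pandas version shows the divergence described in this report.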

Expected Output

The same behavior with and without modification.

Background

The divergence in behavior is caused by code in pandas/core/index.py that was introduced in #10703 in response to #10697. Simply commenting out the block guarded by `if all(isinstance(e, tuple) for e in data):` solves the issue.
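The tuple check mentioned above can be observed directly on the `Index` constructor. The following is a minimal sketch, assuming a pandas version in which `pd.Index` accepts the `tupleize_cols` keyword; the variable names `keys`, `tupleized` and `flat` are illustrative. It shows only the constructor-level behavior and does not claim to reproduce the exact code path that groupby/apply takes in any particular release.

```python
import pandas as pd

keys = [(0,), (1,), (2,)]

# Every element is a tuple, so the constructor builds a MultiIndex.
tupleized = pd.Index(keys)

# Opting out keeps the tuples as plain objects in a flat Index.
flat = pd.Index(keys, tupleize_cols=False)

print(type(tupleized).__name__, tupleized.nlevels)
print(type(flat).__name__, flat.dtype)
```

The same list of keys therefore yields two different index types depending on whether the tuple check is applied, which is consistent with the shape of the result changing once the grouping keys are tuples.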

jhrmnn commented Feb 8, 2018

OK, after digging deeper into the code, I realize that using tuples inside indexes probably won't work well with pandas, because there is too much "tuple detection" all over the place.

jhrmnn closed this as completed Feb 8, 2018