GroupBy transform() throws unexpected exception when sorting each DataFrame #2171

Closed
bluefir opened this issue Nov 4, 2012 · 6 comments

bluefir commented Nov 4, 2012

I have the following DataFrame:

data

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 410322 entries, (20111230, '00036110') to (20121019, 'Y8564W10')
Data columns:
market_cap 410117 non-null values
average_volume 410322 non-null values
return_std_daily 410322 non-null values
return_std_monthly 410322 non-null values
dtypes: float64(4)

data.index.names

['date', 'security_id']

I try to do this:

data.groupby(level='date').transform(lambda x: x.sort_index(by='average_volume'))

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\IPython\core\interactiveshell.py", line 2721, in run_code
    exec code_obj in self.user_global_ns, self.user_ns
  File "", line 1, in
    data.groupby(level='date').transform(lambda x: x.sort_index(by='average_volume'))
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 1745, in transform
    return self._transform_item_by_item(obj, wrapper)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 1777, in _transform_item_by_item
    raise TypeError('Transform function invalid for data types')
TypeError: Transform function invalid for data types
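
For context, transform() is generally meant for functions whose result lines up with the input group, row for row. The sketch below shows that kind of use (group-wise demeaning) on a small made-up frame. The toy values and the name `demeaned` are assumptions for illustration and are not taken from the data above.

```python
import pandas as pd

# Hypothetical toy data with the same index layout as in the report.
index = pd.MultiIndex.from_tuples(
    [(20111230, "A"), (20111230, "B"), (20111230, "C"),
     (20120103, "A"), (20120103, "B")],
    names=["date", "security_id"],
)
data = pd.DataFrame(
    {"market_cap": [10.0, 20.0, 30.0, 40.0, 50.0],
     "average_volume": [3.0, 1.0, 2.0, 9.0, 5.0]},
    index=index,
)

# Subtract each date's column mean; the result keeps the original index
# and row order, which is what transform() is designed to return.
demeaned = data.groupby(level="date").transform(lambda x: x - x.mean())
print(demeaned)
```

A function that reorders the rows of each group, such as a sort, does not fit this pattern as naturally, which may be why it hits a different code path.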

At the same time, the following works:

import pandas as pd
from pandas.core.groupby import SeriesGroupBy, DataFrameGroupBy

def apply_by_group(grouped, f):
    """
    Applies a function to each Series or DataFrame in a GroupBy object, concatenates the results
    and returns the resulting Series or DataFrame.

    Parameters
    ----------
    grouped: SeriesGroupBy or DataFrameGroupBy
    f: callable
        Function to apply to each Series or DataFrame in the grouped object.

    Returns
    -------
    Series or DataFrame that results from applying the function to each Series or DataFrame in the
    GroupBy object and concatenating the results.

    """
    assert isinstance(grouped, (SeriesGroupBy, DataFrameGroupBy))
    assert callable(f)

    # Apply f to each group in turn, then stitch the pieces back together.
    groups = []
    for key, group in grouped:
        groups.append(f(group))
    return pd.concat(groups)

apply_by_group(data.groupby(level='date'), lambda x: x.sort_index(by='average_volume'))

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 410322 entries, (20111230, '31340030') to (20121019, '03783310')
Data columns:
market_cap 410117 non-null values
average_volume 410322 non-null values
return_std_daily 410322 non-null values
return_std_monthly 410322 non-null values
dtypes: float64(4)

Is this strange or am I doing something wrong with transform()?
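
For readers on recent pandas versions, here is a minimal sketch of the same per-date sort. It assumes `sort_values`, which replaced `sort_index(by=...)` in later releases, and uses a made-up toy frame. The helper name `sort_within_groups` is hypothetical and not part of pandas.

```python
import pandas as pd

# Hypothetical toy data with a ('date', 'security_id') MultiIndex.
index = pd.MultiIndex.from_tuples(
    [(20111230, "A"), (20111230, "B"), (20111230, "C"),
     (20120103, "A"), (20120103, "B")],
    names=["date", "security_id"],
)
data = pd.DataFrame(
    {"market_cap": [10.0, 20.0, 30.0, 40.0, 50.0],
     "average_volume": [3.0, 1.0, 2.0, 9.0, 5.0]},
    index=index,
)


def sort_within_groups(frame, level, column):
    """Sort the rows of each index-level group by `column`, then concatenate."""
    pieces = [group.sort_values(column)
              for _, group in frame.groupby(level=level)]
    return pd.concat(pieces)


result = sort_within_groups(data, "date", "average_volume")
print(result)
```

This follows the same loop-and-concat idea as apply_by_group above. It sidesteps transform() entirely, so it does not depend on how transform() treats reordered output.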

Contributor

jreback commented Mar 28, 2013

Can you try with master? This should work now.

Contributor

jreback commented Apr 2, 2013

@bluefir #3145 should have fixed this. Please report whether that is accurate. Thanks.

Author

bluefir commented Apr 2, 2013

dev-b41dc91 was made on March 24. Does it contain the fix? I assumed it didn't.

Contributor

jreback commented Apr 2, 2013

Ahh, you need Windows. OK, I will leave this open for a bit then; please check back periodically for the updated builds.

Member

wesm commented Apr 8, 2013

I need to get the build box back up

Member

wesm commented Apr 10, 2013

Build box is back up; dev binaries should flow through to the pandas website over the next few hours. Please reopen the issue if the problem is not resolved.

wesm closed this as completed Apr 10, 2013