select() within a function closure not working as agg function #1423

Closed
dalejung opened this issue Jun 7, 2012 · 2 comments

dalejung commented Jun 7, 2012

I'm running into a weird issue with groupby and a function closure. When the agg function is a closure, it returns NaN for every group after the first unless I access the grouped series inside it. In agg_before, the fix flag does nothing except index into the data var, and that alone makes the results come out right.

from pandas import *                                                                                  
import numpy as np                                                                                    

periods = 1000                                                                                        
ind = DatetimeIndex(start='2012/1/1', freq='5min', periods=periods)                                   
df = DataFrame({'high': np.arange(periods), 'low': np.arange(periods)}, index=ind)                    

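def agg_before(hour, func, fix=False):
    """
        Run an aggregate func on the subset of data before the given hour.
    """
    def _func(data):
        # Keep only rows whose timestamp falls before `hour`.
        d = data.select(lambda x: x.hour < hour).dropna()
        if fix:
            # Workaround: merely indexing into the group makes the closure work.
            data[data.index[0]]
        if len(d) == 0:
            return None
        return func(d)
    return _func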

def afunc(data):                                                                                      
    d = data.select(lambda x: x.hour < 11).dropna()                                                   
    return np.max(d)                                                                                  

grouped = df.groupby(lambda x: datetime(x.year, x.month, x.day))                                      

closure_bad = grouped.agg({'high': agg_before(11, np.max)})                                           
closure_good = grouped.agg({'high': agg_before(11, np.max, True)})                                    
lambda_good = grouped.agg({'high': afunc})

In [33]: np.__version__
Out[39]: '1.6.2'

In [34]: pandas.__version__
Out[34]: '0.8.0.dev-dc6ce90'

In [35]: closure_bad
Out[35]: 
            high
2012-01-01   131
2012-01-02   NaN
2012-01-03   NaN
2012-01-04   NaN

In [36]: closure_good
Out[36]: 
            high
2012-01-01   131
2012-01-02   419
2012-01-03   707
2012-01-04   995

In [37]: lambda_good
Out[37]: 
            high
2012-01-01   131
2012-01-02   419
2012-01-03   707
2012-01-04   995

Running an agg function that isn't a closure works fine. Any ideas on this?
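
For anyone reproducing this on a current pandas, here is a minimal sketch of the same closure-based aggregation. It rests on a few assumptions that are not in the original report: `Series.select` is no longer available in recent releases, so a boolean mask on the index hour stands in for it; `date_range` replaces the old `DatetimeIndex(start=...)` constructor; and grouping uses `index.normalize()` instead of a `datetime` lambda. It is not a description of the fix that went into pandas, only a way to check the expected per-day maxima.

```python
import numpy as np
import pandas as pd

periods = 1000
ind = pd.date_range(start="2012-01-01", freq="5min", periods=periods)
df = pd.DataFrame({"high": np.arange(periods), "low": np.arange(periods)}, index=ind)


def agg_before(hour, func):
    """Build an aggregator that applies func to the rows before the given hour."""
    def _func(data):
        # Assumption: a boolean mask on the index hour stands in for select().
        d = data[data.index.hour < hour].dropna()
        if len(d) == 0:
            return np.nan
        return func(d)
    return _func


# Group by calendar day; normalize() floors each timestamp to midnight.
grouped = df.groupby(df.index.normalize())
result = grouped.agg({"high": agg_before(11, np.max)})
print(result)
```

With the closure behaving correctly, every day should show the same values as closure_good and lambda_good: 131, 419, 707 and 995.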

wesm commented Jun 11, 2012

Hey @dalejung, thanks for tracking this down and for the test case. I found the issue and it's been fixed; the fix will be in 0.8.0.

dalejung commented

@wesm np. Was definitely a fun one to stumble across.
