GroupBy using TimeGrouper does not work #3791
Comments
You need to set_index as TimeGrouper operates on the index
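A minimal sketch of that point, assuming a pandas version where `pd.Grouper(freq=...)` is the available spelling (`pd.TimeGrouper` was deprecated in later releases). The trimmed sample frame and the month-start alias `'6MS'` are illustrative assumptions; the thread itself uses `'6M'`, whose spelling differs between pandas versions.

```python
import datetime as dt

import pandas as pd

df = pd.DataFrame({
    "Branch": "A A A A A B".split(),
    "Quantity": [1, 3, 5, 8, 9, 3],
    "Date": [
        dt.datetime(2013, 1, 1, 13, 0),
        dt.datetime(2013, 1, 1, 13, 5),
        dt.datetime(2013, 10, 1, 20, 0),
        dt.datetime(2013, 10, 3, 10, 0),
        dt.datetime(2013, 12, 2, 12, 0),
        dt.datetime(2013, 12, 2, 14, 0),
    ],
})

# A frequency-based grouper bins the index, so move 'Date' into the index first.
indexed = df.set_index("Date")
totals = indexed.groupby(pd.Grouper(freq="6MS"))["Quantity"].sum()
print(totals)

# Alternative: keep the default index and point the grouper at the column via key=.
totals_by_key = df.groupby(pd.Grouper(key="Date", freq="6MS"))["Quantity"].sum()
```

Both variants produce the same half-year bins; `key=` simply avoids changing the index.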
Thanks. Is it possible to combine TimeGrouper with another criterion?
Hi jreback, thanks for your reply. Sorry, but I do not understand your solution. How can I use it to group by the TimeGrouper criterion and, for example, by 'Branch'?
This is actually a bit tricky
@hayd do you think there is a better way to do this?
I'm afraid I have no idea!
I would have thought:
should work....
I tried that already, but it raises the exception: "TypeError: 'TimeGrouper' object is not callable"
did you see my prior comment?
I would have thought that would work too.... I had tried a few variations of your solution... None of which I could get working (hence the other issue I posted) :)
@jreback Yes, but that does not work for me either, because I need to apply a self-defined function to the resulting GroupBy object. If I use the simple function above with your solution: df.groupby(pd.TimeGrouper('6M')).apply(lambda x: x.groupby('Branch').apply(testgr)) it raises: "AttributeError: 'DataFrame' object has no attribute 'name'"
This is basically a composition operation: you group by time, then apply a function which happens to group by branch and then operates, so you need to operate in the inner function. This is quite tricky in that your function should return a scalar value on the single passed series.
@jreback Thanks! Unfortunately, it does not solve my problem, as I need to operate on various columns in my function. Is there any possibility to pass the Buyer column to the function? df.set_index('Date').groupby(pd.TimeGrouper('6M')).apply(lambda x: x.groupby('Branch')[['Buyer', 'Quantity']].apply(testgr)) # doubles the buyer names
If you use a custom function then you need to handle the string cases, but you can return pretty much anything you want (make it a Series) to get this kind of functionality; your function is passed a slice of the original frame.
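The composition described here can be sketched as follows. This is an illustration, not the code originally posted in this comment: `summarize` is a hypothetical helper, and `pd.Grouper` with the month-start alias `'6MS'` stands in for `pd.TimeGrouper('6M')` on the assumption of a recent pandas.

```python
import datetime as dt

import pandas as pd

df = pd.DataFrame({
    "Branch": "A A A A A B".split(),
    "Buyer": "Carl Mark Carl Joe Joe Carl".split(),
    "Quantity": [1, 3, 5, 8, 9, 3],
    "Date": [
        dt.datetime(2013, 1, 1, 13, 0),
        dt.datetime(2013, 1, 1, 13, 5),
        dt.datetime(2013, 10, 1, 20, 0),
        dt.datetime(2013, 10, 3, 10, 0),
        dt.datetime(2013, 12, 2, 12, 0),
        dt.datetime(2013, 12, 2, 14, 0),
    ],
})


def summarize(group):
    """Hypothetical helper: receives a slice of the frame, returns a Series."""
    return pd.Series({
        # String column: handled explicitly instead of being summed.
        "buyers": ", ".join(sorted(group["Buyer"].unique())),
        "total_quantity": group["Quantity"].sum(),
    })


# Outer step: bin rows by time. Inner step: group each bin by Branch and
# pass only the columns the helper needs.
result = (
    df.set_index("Date")
    .groupby(pd.Grouper(freq="6MS"))
    .apply(lambda chunk: chunk.groupby("Branch")[["Buyer", "Quantity"]].apply(summarize))
)
print(result)
```

Returning a Series from the helper gives one row per (time bin, branch) pair, with one column per entry in the Series.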
Great solution! Thanks a lot
great...glad it worked out (and I am going to open an issue about a defect in that):
should work... thanks for bringing it up!
It appears other methods that work on normal groups fail when using the TimeGrouper. For instance
works fine, but:
gives error:
Am I misunderstanding something? Edit: Looks like this has become an open issue: #3881
see #3881, already noted, thank you
Maybe you should try passing the parameter explicitly, e.g. freq='5min'; this could be efficient.
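For illustration, a small sketch of passing the frequency by keyword. The series below is made-up sample data, and `pd.Grouper` is used on the assumption of a pandas version that provides it in place of `pd.TimeGrouper`.

```python
import pandas as pd

# Made-up readings a few minutes apart.
ts = pd.Series(
    [1, 3, 5],
    index=pd.to_datetime([
        "2013-01-01 13:00",
        "2013-01-01 13:05",
        "2013-01-01 13:07",
    ]),
)

# The frequency is given explicitly by keyword rather than positionally.
per_5min = ts.groupby(pd.Grouper(freq="5min")).sum()
print(per_5min)
```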
BUG: TimeGrouper not too friendly with other groups, e.g.
df.set_index('Date').groupby([pd.TimeGrouper('6M'),'Branch']).sum()
should work
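For comparison, a sketch of the same combined grouping written with `pd.Grouper`, assuming a pandas version in which mixing a time grouper with a column key is supported. The trimmed sample frame and the month-start alias `'6MS'` (in place of `'6M'`) are assumptions made for illustration.

```python
import datetime as dt

import pandas as pd

df = pd.DataFrame({
    "Branch": "A A A A A B".split(),
    "Quantity": [1, 3, 5, 8, 9, 3],
    "Date": [
        dt.datetime(2013, 1, 1, 13, 0),
        dt.datetime(2013, 1, 1, 13, 5),
        dt.datetime(2013, 10, 1, 20, 0),
        dt.datetime(2013, 10, 3, 10, 0),
        dt.datetime(2013, 12, 2, 12, 0),
        dt.datetime(2013, 12, 2, 14, 0),
    ],
})

# Group by half-year bins on the index and by the 'Branch' column together.
combined = (
    df.set_index("Date")
    .groupby([pd.Grouper(freq="6MS"), "Branch"])["Quantity"]
    .sum()
)
print(combined)
```

The result is indexed by (time bin, branch), so a single branch can be pulled out with `combined.xs("A", level="Branch")`.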
Hi everybody,
I found two issues with TimeGrouper:
Let's take the following example:
import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Branch' : 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1,3,5,8,9,3],
    'Date' : [
        DT.datetime(2013,1,1,13,0),
        DT.datetime(2013,1,1,13,5),
        DT.datetime(2013,10,1,20,0),
        DT.datetime(2013,10,3,10,0),
        DT.datetime(2013,12,2,12,0),
        DT.datetime(2013,12,2,14,0),
    ]})

# Group on a 6-month frequency; 'Date' is still a regular column here.
gr = df.groupby(pd.TimeGrouper(freq='6M'))

def testgr(df):
    # Only prints the group it receives and returns None.
    print(df)

gr.apply(testgr)
This will raise the Exception: "Exception: All objects passed were None"
Thank you very much
Andy