BUG/API: inconsistent results in a groupby-apply when mix of scalar/Series are returned #5592
I stumbled and fell harshly and scrambled for several days to recover from this one. :(

    def process_calblock(df):
        cb = CalBlock(df, self.SV_NUM_SKIP_SAMPLE)
        if cb.kind == "ST":
            return
        return cb.bb_time

and using

calib_block_labels
1 2009-09-28 09:07:58.558000
2 2009-09-28 09:18:15.019000
3 2009-09-28 09:27:30.039000
4 None
5 2009-09-28 09:38:49.989000
6 2009-09-28 09:49:06.450000
7 2009-09-28 09:59:24.959000
dtype: object

Since this commit I received the most mysterious error message:

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-2-cfb5bb4d8eec> in <module>()
10 # cgood = calib.Calibrator(dfgood)
11 cbad = calib.Calibrator(dfbad)
---> 12 cbad.calgrouped.apply(process_calblock)
/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas-0.12.0_1169_g9aae1a8-py2.7-macosx-10.6-x86_64.egg/pandas/core/groupby.pyc in apply(self, func, *args, **kwargs)
371 return func(g, *args, **kwargs)
372
--> 373 return self._python_apply_general(f)
374
375 def _python_apply_general(self, f):
/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas-0.12.0_1169_g9aae1a8-py2.7-macosx-10.6-x86_64.egg/pandas/core/groupby.pyc in _python_apply_general(self, f)
377
378 return self._wrap_applied_output(keys, values,
--> 379 not_indexed_same=mutated)
380
381 def aggregate(self, func, *args, **kwargs):
/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas-0.12.0_1169_g9aae1a8-py2.7-macosx-10.6-x86_64.egg/pandas/core/groupby.pyc in _wrap_applied_output(self, keys, values, not_indexed_same)
2132 if v is None:
2133 return DataFrame()
-> 2134 values = [ x if x is not None else v._constructor(**v._construct_axes_dict()) for x in values ]
2135
2136 v = values[0]
AttributeError: 'Timestamp' object has no attribute '_constructor'

which, as I understand from the Stack Overflow answer, has something to do with the fact that I return None? So, do I have to change my code because this is a change in the API, or is this a regression?
Can you post something small and reproducible that generates this error?
@michaelaye are you using current master? this should work
oh...sorry...you are returning None or a Timestamp or a Series? (or just None or a Timestamp)...hmmm
Yes, exactly. I usually return a timestamp which is the middle time point of the group, but in certain cases I should not use this time, and then I return None. This does not work in master. I guess I can work around it by filtering the DataFrame before grouping.
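The workaround mentioned above (filtering before grouping) could look roughly like the sketch below. The data, the `kind` and `time` column names, and the `mid_time` helper are all made up for illustration; the real `CalBlock` logic is not shown in this thread.

```python
import pandas as pd

# Hypothetical stand-in data: "kind" marks blocks whose time should not be used.
df = pd.DataFrame({
    "calib_block_labels": [1, 1, 2, 2, 3, 3],
    "kind": ["BB", "BB", "ST", "ST", "BB", "BB"],
    "time": pd.to_datetime([
        "2009-09-28 09:00:00", "2009-09-28 09:10:00",
        "2009-09-28 09:20:00", "2009-09-28 09:30:00",
        "2009-09-28 09:40:00", "2009-09-28 09:50:00",
    ]),
})

def mid_time(times):
    # Middle time point of the group (illustrative, not the original cb.bb_time).
    return times.min() + (times.max() - times.min()) / 2

# Drop the unwanted rows first, so the applied function never has to return None.
usable = df[df["kind"] != "ST"]
filtered_result = usable.groupby("calib_block_labels")["time"].apply(mid_time)
print(filtered_result)
```

Note that the filtered-out block (label 2 here) is simply absent from the result, instead of showing up as a missing value.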
ok...was a minor fix.....PR #5675; also the dtypes of the columns should be correct as well if the original is a datetimelike and you return a None (it will be datetime64[ns] with the Nones as NaTs). In this case you will get a Series back that is correctly dtyped.
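A small self-contained sketch of the behaviour described above, assuming a reasonably recent pandas: the applied function returns a Timestamp for most groups and None for one. The data, column names and `block_time` function are invented for illustration and are not the original `process_calblock`.

```python
import pandas as pd

# Hypothetical stand-in data; block 2 is the one whose time should be skipped.
df = pd.DataFrame({
    "calib_block_labels": [1, 1, 2, 2, 3, 3],
    "kind": ["BB", "BB", "ST", "ST", "BB", "BB"],
    "time": pd.to_datetime([
        "2009-09-28 09:00:00", "2009-09-28 09:10:00",
        "2009-09-28 09:20:00", "2009-09-28 09:30:00",
        "2009-09-28 09:40:00", "2009-09-28 09:50:00",
    ]),
})

def block_time(group):
    # Return None for "ST" blocks, otherwise a scalar Timestamp.
    if (group["kind"] == "ST").any():
        return None
    return group["time"].iloc[0]

# Select the non-key columns explicitly so the function only sees those.
mixed_result = df.groupby("calib_block_labels")[["kind", "time"]].apply(block_time)
print(mixed_result)
print(mixed_result.dtype)
```

The None should come back as NaT in a datetime-typed Series; the exact datetime unit shown in the dtype may differ between pandas versions.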
I confirm that PR #5675 works here, thanks for the quick fix!

calib_block_labels
1 2009-09-28 09:07:58.558000
2 2009-09-28 09:18:15.019000
3 2009-09-28 09:27:30.039000
4 NaT
5 2009-09-28 09:38:49.989000
6 2009-09-28 09:49:06.450000
7 2009-09-28 09:59:24.959000
dtype: datetime64[ns]
gr8! keep em coming!
http://stackoverflow.com/questions/20224564/how-does-pandas-grouped-apply-decide-on-output-and-why-does-this-depend-on-w/20225276#20225276