-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Allow aggregate numeric operations on timedelta64. #6884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
FYI their is an issue for adding timedelta support to groupby - see if u can find it also this needs a fair amount of tests as trit are many paths thru groupby basically need to mimic most of the datetime64 type tests |
Have we established some pattern for skipping timedelta tests for numpy < 1.7? Also, while I'm here, I'll fix the English in this error.... ValueError: to_timedelta is not support for numpy < 1.7 |
numpy timedelta is thoroughly broken < 1.7 |
@danielballan hows' this coming? |
A quick, perhaps obvious, thought: this is not exactly analogous to what was done for datetime. For example, summing datetimes is (correctly) not defined but summing timedeltas should be. But many of the tests will be analogous. I have to put this aside until next week, but I can pick it up by the beginning of May. If someone wants to tap in before then, feel free to speak up. |
gr8 thanks would like like to get some minimal set at least in for 0.14 as I agree it's a bit nontrivial |
@danielballan can you rebase, add a couple more tests, and release note / v0.14.0.txt ? |
I am working on finishing this patch, and I'm not convinced this is the right approach. What about making timedeltas considered a numeric type? Otherwise we're going to have to update all locations where there are is_numeric checks. I suppose this opens the question about what "is_numeric" means. In particular, the current Pandas master does
I just started using Pandas today, so it's possible the original patch is the right approach. |
I considered lumping time deltas in with is_numeric, but I decided it was On Tuesday, April 29, 2014, Keir Mierle [email protected] wrote:
|
@danielballan, I am operating off trunk, which seems to have changed one of the functions (_cython_agg_blocks). I ported the code tried my approach to make timedelta is_numeric, but I am now not convinced of the approach. Do you want to update to master? |
OK. I updated to master this morning. See the latest version of the commit. (N.B. The tests are broken; this is a work-in-progress commit.) |
@danielballan how's this coming? |
well I want to get a RC1 out, but if you are ready after, then ok |
FYI, I think #5986 for quantile has been fixed by @TomAugspurger very recently |
I don't know if I tested timedeltas at all in |
I don't suggest merging this patch. I have a new, smaller, different patch that changes timedelta64 to be a numeric type. I will try to upload it later today but currently I have some production issues. |
@keir Does that approach create problems when |
@danielballan 0.14.0 released (ann coming later today), but this could be on master/0.14.1 in any event |
@danielballan can you rebase this? |
@danielballan revisit? i'd like to get into 0.15 |
@jreback I won't have time to touch this again until January, but at that point I'll be joining mpl's @tacaswell at Brookhaven, and I will have some solid time to contribute again. |
@danielballan np, should be straight forward with the new have fun, according to cc @ericdill you have LOTS of data! |
I was glad to see |
@danielballan can you revisit / update |
OK, I'm back on the job. :- D |
My sanity thanks you.
|
648c4dd
to
b47d3c8
Compare
closing pls reopen if/when updated |
Trying to work out why groupby was dropping my datetime columns and came across this. Did it never get fixed? |
Looks like this PR went stale. Let us know if you want to pick it up @rjp23.
…On Sun, Jun 9, 2019 at 10:58 PM rjp23 ***@***.***> wrote:
Trying to work out why groupby was dropping my datetime columns and came
across this. Did it never get fixed?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#6884?email_source=notifications&email_token=AAKAOITYC6WMYKN2WA3RL7DPZXGPXA5CNFSM4AOJZLUKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXI3VQQ#issuecomment-500284098>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIWBUEKLA4TI2K5CIS3PZXGPXANCNFSM4AOJZLUA>
.
|
Well it should be fixed but I don't think I'm knowledgeable enough to do it. |
This seems to have been fixed somewhere between version 1.3.4 and version 2.2.0 (two versions I had convenient to test) but I'm not sure when or how. If someone knows, can they put it here for people who end up here confused in the future, as I did? |
closes #5724
Currently,
timedelta64
columns can't be used in aggregate group operations. They are quietly dropped or, if no other columns are present, an exception is raised:But any operation that works on numerical data is also well defined on timedeltas. This PR relaxes the restriction on those methods designated "numeric only" to accept numeric dtypes and
timedelta64
. I include a unit test with one simple case each forDataFrameGroupBy
andSeriesGroupBy
.