ENH: Richer options for interpolate and resample #4434
Comments
@jreback thought we deprecated |
Yes, this is a basic task that really should [edit:] not call for statsmodels, in my opinion. |
Ugly workaround I offered a few days ago: http://stackoverflow.com/a/18276030/1221924 |
since we do use statsmodels/scipy in other parts of the code why don't u peruse sm 5.0 for some available functions here? |
@jseabold do u have direct support in sm 5.0 for interpolation? or do u defer to scipy? |
http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html should be straightforward to directly call these |
via a kind argument (to pandas interpolate) with some kinds passing to scipy functions which are then wrapped on the return |
@jreback agreed about the ease of wrapping interpolate.spline(df.index, df['A'], xnew) to get the interpolated values and then wrapping them up in a Series. I've assumed that the DataFrame's index is the original x-values, which is probably fine for a default, but we'd want an argument to say "use this column". I could probably start on this in a few weeks (I have to finish a paper, then I promised the statsmodels guys that I'd set up a vbench for them). |
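A minimal sketch of the wrapping described above, for illustration only. `interpolate_column` and its `x_col` argument are hypothetical names, not pandas API, and `np.interp` (linear) stands in for the scipy spline call so the example needs only NumPy and pandas:

```python
import numpy as np
import pandas as pd


def interpolate_column(df, y_col, x_new, x_col=None):
    """Hypothetical helper: interpolate df[y_col] at the points in x_new.

    The DataFrame's index supplies the x-values by default; passing x_col
    says "use this column" instead. x-values must be increasing.
    """
    x_source = df.index if x_col is None else df[x_col]
    x = np.asarray(x_source, dtype=float)
    y = np.asarray(df[y_col], dtype=float)
    # Linear interpolation as a stand-in for a scipy routine.
    y_new = np.interp(x_new, x, y)
    # Wrap the raw values back up in a Series indexed by the new x-values.
    return pd.Series(y_new, index=pd.Index(x_new), name=y_col)


df = pd.DataFrame({"A": [0.0, 10.0, 20.0], "t": [0.0, 2.0, 4.0]}, index=[0, 1, 2])
by_index = interpolate_column(df, "A", [0.5, 1.5])
by_column = interpolate_column(df, "A", [1.0, 3.0], x_col="t")
```

Both calls return a Series named "A"; the first uses the index as x, the second uses column "t".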
Can we incorporate this into resample and reindex? Anywhere that And if we do that, can we give the same options that Series.interpolate provides? |
@danielballan +1 on reusing parts or all of this for resample and reindex (and possibly fillna?). I think that it would be relatively easy to handle. Not sure how this would fit in with Jeff's refactor of Series. |
so there exists right now a
it's a bit to wrap your head around, but is pretty straightforward |
the key is that both |
lmk if you want to take a stab (or I could set it up for you with the structure and you can add in the other methods) |
If there's no urgency, I'm fine with going through the code to refactor If |
@TomAugspurger that sounds fine; there should be no back compat issue (well...have to make sure, but in theory we have tests for that.....) if the probably need some more tests to validate this |
@TomAugspurger see #1892 as well; this is not conceptually much harder as the |
Statsmodels uses scipy and will likely continue to do so. There is support for "benchmarking" in statsmodels, but this is such a specialized case, I don't think it's worth supporting on your end. |
@jseabold good to know; I think it makes sense for pandas to have some built in methods, and a dispatch to scipy and/or sm to use other methods... |
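One way the "built in methods plus dispatch" idea could look, as a rough sketch rather than what pandas actually does. `interpolate_values`, `BUILTIN_METHODS` and `SCIPY_KINDS` are made-up names for this example; `scipy.interpolate.interp1d` is a real scipy function, imported lazily so scipy stays an optional dependency:

```python
import numpy as np

# Hypothetical registries, for illustration only.
BUILTIN_METHODS = {"linear"}
SCIPY_KINDS = {"nearest", "zero", "slinear", "quadratic", "cubic"}


def interpolate_values(x, y, x_new, method="linear"):
    """Interpolate y(x) at x_new, dispatching on the method name."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    if method in BUILTIN_METHODS:
        # Built in: needs nothing beyond NumPy.
        return np.interp(x_new, x, y)
    if method in SCIPY_KINDS:
        # Dispatched: import scipy only when one of its methods is requested.
        from scipy.interpolate import interp1d
        return interp1d(x, y, kind=method)(x_new)
    raise ValueError(f"unknown interpolation method: {method!r}")


linear_result = interpolate_values([0, 1, 2], [0.0, 10.0, 20.0], [0.5, 1.5])
```

Users without scipy installed can still call the built-in path; only the dispatched methods would raise an ImportError.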
Starting to take a look at this. Just to get some of the scaffolding straight in my head:
So I'll be adding bits along the way to point things down to A couple questions:
|
If they are calling generic.interpolate, why not just define it once in core/generic and use the axes abstractions there? If you want to opt-out panel, you could just have Panel raise an error... |
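A toy sketch of that suggestion: define the method once on a shared base class and let one subclass opt out by raising. `GenericContainer`, `SeriesLike` and `PanelLike` are hypothetical stand-ins, not the real classes in pandas' core/generic:

```python
import numpy as np


class GenericContainer:
    """Hypothetical shared base class holding 1-D float data."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def interpolate(self):
        # Defined once here; subclasses inherit it unchanged.
        # Assumes at least one non-NaN value is present.
        y = self.values.copy()
        mask = np.isnan(y)
        x = np.arange(len(y))
        # Fill NaN positions linearly from the surrounding known values.
        y[mask] = np.interp(x[mask], x[~mask], y[~mask])
        return type(self)(y)


class SeriesLike(GenericContainer):
    pass


class PanelLike(GenericContainer):
    def interpolate(self):
        # Opt out explicitly instead of inheriting the generic behaviour.
        raise NotImplementedError("interpolate is not supported for PanelLike")


filled_values = SeriesLike([0.0, np.nan, 2.0]).interpolate().values
```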
I could be wrong but I think |
right now However, there may exist some behavior in the See You can do a new generic tester in You can easily not support As far as actually making the useful change (the point of this PR!). I would simply allow So most of the 'real' changes will occur in This You should be getting your hands dirty here. Shout out if you need help. |
@TomAugspurger how's this coming? |
I've been a bit intimidated about the internals. I'll give it some time this weekend and maybe wave the white flag if I fail. Is it on the schedule for the next release? |
I think it should be.....lmk if you need help internals are all about breaking stuff!!! lol |
the test suite's pretty good, so that's a helpful guide. |
closed via #4915 |
related #1892, #1479
Is there any interest in giving interpolate and resample (to higher frequency) some additional methods?
For example:
Could return something like
I have never used the DataFrame's interpolate, but a quick glance says that something like the above wouldn't be backwards compatible with the current calling convention. Maybe a different name? This may be confusing two issues: interpolating over missing values and interpolating / predicting non-existent values. Or are they similar enough that they can be treated the same? I would think so.
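To make the two cases concrete, here is a small sketch using the API as it exists in pandas releases made after this discussion (`Series.interpolate(method=...)` and `resample(...).interpolate(...)`). The data is invented for illustration and is not the example from this issue:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2013-01-01", periods=3, freq="60min")

# Case 1: interpolate over a missing value at an existing index label.
with_gap = pd.Series([0.0, np.nan, 2.0], index=idx)
filled = with_gap.interpolate(method="linear")

# Case 2: upsample to a higher frequency, creating points that did not
# exist before, then interpolate values for them.
hourly = pd.Series([0.0, 1.0, 2.0], index=idx)
upsampled = hourly.resample("30min").interpolate(method="linear")
```

In both cases the new values come from the same linear rule; the only difference is whether the target index labels already existed.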
These are just some quick thoughts before I forget. I haven't spent much time thinking a design through yet. I'd be happy to work on this in a month or so.
Also does this fall in the realm of statsmodels?