-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC/API: limit argument to interpolate is somewhat non-intuitive #9218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@shoyer this is very well defined when reindexing (e.g. for your first point, you simply don't fill forward/backward anymore). So should be the same for interpolation (and btw, you can require that a method be provided, e.g. |
@jreback I agree this is well defined for |
ahh that's right but I think the direction of interpolation is by definition forward always right? (as these are mostly scipy routines) ? |
Take a look at the scipy routines -- they really make no reference to forwards or backwards. They just interpolate between points. As you can see here, we handle the |
no what I mean is interpolation is by definition forward ATM can think about a direction kw |
@shoyer correct, we do all the limit handling.
I think I remember a SO post asking how to do this actually...
I guess it depends on the context. You might not want to fill backward for some time series. But that's up to the user to decide, so having the option would be nice. As far as implementation goes, it shouldn't be too difficult, just rework the limit logic a bit. I wrote that originally, but I'm not sure I'll have time to get to this soon 😞 |
I just tried adding a little bit of code to address this issue! I also updated the test code that seemed to be aimed at interpolation limits. I've never contributed to Pandas before so I'm not sure what the procedure usually is. I'd be happy if there was some discussion to make sure this change makes sense. I'm also unsure where to update documentation for the change. What should I do next? |
Ok, thanks! I suppose the discussion about expected behavior, etc. will take place on the PR. |
For posterity, here is the SO post that discusses this: |
The current documentation for the limit parameter looks like this:
Several features were not immediately obvious to me and could probably be resolved by slightly better docstring or API design:
What happens where the max number of consecutive NaNs is reached? pandas simply stops replacing values, e.g., if there is a gap of 3 values and limit=2, pandas replaces the first 2 values. The alternative would be to entirely ignore gaps of more than 2 values. I'm not saying this would be a better alternative, but we should clarify this.
The limit refers to forward filling only, even though interpolation is not inherently directional. It would be nice if there was some way to trigger a limit for back filling at the same time, e.g., with
forward_limit
andbackward_limit
arguments:Thoughts?
The text was updated successfully, but these errors were encountered: