Series.interpolate with index of type float gives wrong result #1206
Comments
This is implemented in git master now and will be part of the 0.8.0 release.
Version 0.8.0 beta 1 * tag 'v0.8.0b1': (703 commits) RLS: 0.8.0 beta 1 RLS: 0.8.0beta RLS: release notes ENH: add option to use Series.values to interpolate, close pandas-dev#1206 TST: testing to close pandas-dev#1331 DOC: groupby drop duplicate index pandas-dev#1312 ENH: tz_convert for DataFrame pandas-dev#1330 ENH: add NA handling to scatter_matrix, close pandas-dev#1297 BUG: display localtime in DatetimeIndex.__repr__, close pandas-dev#1336 DOC: draft of timeseries section of docs. Added Period related documentation and examples DOC: timezone handling and started on Period DOC: rough draft of DatetimeIndex, date_range, shifting/resampling etc DOC: more ts docs. Need to do resampling then PeriodIndex DOC: starting deeper revamp of ts docs for 0.8 BUG: raise exception for unintellible frequency strings, close pandas-dev#1328 ENH: construct PeriodIndex from arrays of fields, allow negative ordinals. close pandas-dev#1333 and pandas-dev#1264 BUG: tsplot fix with business freq pandas-dev#1332 BUG: DatetimeIndex partial slicing bug, tsplot kludge around pandas-dev#1332 BUG: alias W to W-SUN, add test for buglet close pandas-dev#1327 ENH: mix arrays and scalars in DataFrame constructor, close pandas-dev#1329 ...
This seems to still be a problem as of 0.8.1.dev-e2633d4.
import pandas
import numpy as np
import pylab as pl
from scipy.interpolate import interp1d
time_fast = np.arange(50000.,50010.,.4) +.1
time_slow = np.arange(50000.,50010.,1.)
x_fast = np.sin(time_fast)
x_slow = np.sin(time_slow)
df_fast = pandas.DataFrame(x_fast, index=time_fast, columns=['fast'])
df_slow = pandas.DataFrame(x_slow, index=time_slow, columns=['slow'])
df_joined = df_fast.join(df_slow, how='outer')
df_joined['pandas interpolate'] = df_joined['slow'].interpolate()
f = interp1d(df_slow.index, df_slow['slow'], bounds_error=False)
df_joined['scipy interp1d'] = f(df_joined.index)
df_joined['pandas interpolate'].plot(style='o')
df_joined['scipy interp1d'].plot(style='o')
df_slow['slow'].plot(style='r.:')
pl.title('Linearly interpolated points are expected to lie on the dotted red lines.')
pl.legend()
pl.show()
I also observed this with indices that are Datetime objects. The title of this issue may be too narrow.
@nlsn you have to do:
The default of interpolate assumes that each value is evenly spaced, while
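As a minimal sketch of the difference being described (the series below is made up for illustration and is not the data from this thread), the default `method='linear'` ignores the index, whereas `method='values'` uses the numeric index values as the x-coordinates:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not from the thread):
# an unevenly spaced float index where the true relationship is value == 10 * index.
s = pd.Series([0.0, np.nan, 100.0], index=[0.0, 1.0, 10.0])

# Default: treats the three rows as equally spaced, so the gap gets the midpoint.
by_position = s.interpolate()

# method='values': weights by distance along the index instead.
by_index = s.interpolate(method='values')

print(by_position.loc[1.0])  # 50.0
print(by_index.loc[1.0])     # 10.0
```

With an evenly spaced index the two methods agree, which is why the discrepancy only shows up after an outer join of series sampled at different rates.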
Hi,
The logic for Series.interpolate assumes the index values are equally spaced. With a floating-point index this is not the desired interpolation. For example:
I expect sig1 and sig2 to have more points than df1 and df2, but with the values interpolated. A few points do not overlap because the index is assumed to be equally spaced. In my opinion, if the index is floating point, the user wants to interpolate by the index's values rather than assume they are equally spaced. It should do something like this:
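One way to sketch interpolation by index value, assuming a monotonically increasing numeric index, is to hand the index to `np.interp` as the x-coordinates. The helper name `interpolate_by_index` and the sample data are hypothetical and are not part of pandas or of the original report:

```python
import numpy as np
import pandas as pd

def interpolate_by_index(s):
    """Fill NaNs linearly, using the index values as x-coordinates.

    Hypothetical helper for illustration. Assumes a numeric, increasing index.
    Note: np.interp holds the edge values constant outside the known range,
    so leading or trailing NaNs are filled with the nearest known value.
    """
    x = s.index.to_numpy(dtype=float)
    y = s.to_numpy(dtype=float)
    known = ~np.isnan(y)
    filled = np.interp(x, x[known], y[known])
    return pd.Series(filled, index=s.index, name=s.name)

# Illustrative data: known points lie on the line value == 10 * index.
s = pd.Series([0.0, np.nan, np.nan, 30.0], index=[0.0, 0.5, 2.0, 3.0])
result = interpolate_by_index(s)
print(result)
```

Here the gaps at 0.5 and 2.0 are filled according to their position along the index, not their row position.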
Thanks