BUG: throwing error in interpolate depending on dtype of column names #33956

CloseChoice · 2020-05-03T18:47:40Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas. (on 862db64, last commit where build works as of 20:42 2020-05-03).

Code Sample, a copy-pastable example

df
Out[6]: 
     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   NaN   NaN
4  NaN   8.0   NaN
5  5.0  10.0  30.0
df.interpolate(method='ffill', axis=1)

This raises `ValueError: Index column must be numeric or datetime type when using ffill method other than linear. Try setting a numeric or datetime index column before interpolating.` However, the same call works once the column names are integers:

df.columns = [1, 2, 3]
df
Out[9]: 
     1     2     3
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   NaN   NaN
4  NaN   8.0   NaN
5  5.0  10.0  30.0
df.interpolate(method='ffill', axis=1)
Out[10]: 
     1     2     3
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   6.0   9.0
4  4.0   8.0   9.0
5  5.0  10.0  30.0
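
The session above never shows how `df` is built, so here is a self-contained sketch for re-running the two calls. The frame construction and the helper names `build_frame` and `try_interpolate` are assumptions added for illustration (the values are copied from the printed output). What each call returns or raises depends on the installed pandas version, so the script only reports the outcome type instead of asserting one.

```python
import warnings

import numpy as np
import pandas as pd


def build_frame(columns):
    """Rebuild the frame printed in the report with the given column labels."""
    data = [
        [1.0, 2.0, 3.0],
        [2.0, 4.0, 6.0],
        [3.0, 6.0, 9.0],
        [4.0, np.nan, np.nan],
        [np.nan, 8.0, np.nan],
        [5.0, 10.0, 30.0],
    ]
    return pd.DataFrame(data, columns=columns)


def try_interpolate(frame):
    """Run the call from the report; return its result or the exception raised."""
    with warnings.catch_warnings():
        # Newer pandas versions may warn about fill methods in interpolate.
        warnings.simplefilter("ignore")
        try:
            return frame.interpolate(method="ffill", axis=1)
        except Exception as exc:  # version-dependent, so catch broadly
            return exc


outcomes = {
    "string labels": try_interpolate(build_frame(["A", "B", "C"])),
    "integer labels": try_interpolate(build_frame([1, 2, 3])),
}
for name, outcome in outcomes.items():
    print(name, "->", type(outcome).__name__)
```

On the commit named in the report, the first entry would be expected to be a `ValueError` and the second a `DataFrame`; other versions may differ.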

Problem description

Whether `interpolate` raises an error should not depend on the dtype of the column names.

Expected Output

     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   6.0   9.0
4  4.0   8.0   9.0
5  5.0  10.0  30.0

Output of `pd.show_versions()`

pd.show_versions()
INSTALLED VERSIONS
------------------
commit           : 862db6421256cb7a00ae3e88a4a6999347b76271
python           : 3.8.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.3.0-51-generic
Version          : #44~18.04.2-Ubuntu SMP Thu Apr 23 14:27:18 UTC 2020
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8
pandas           : 1.1.0.dev0+1463.g862db6421
numpy            : 1.18.1
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.1
setuptools       : 46.1.3.post20200325
Cython           : 0.29.17
pytest           : 5.4.1
hypothesis       : 5.10.4
sphinx           : 3.0.3
blosc            : None
feather          : None
xlsxwriter       : 1.2.8
lxml.etree       : 4.5.0
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.13.0
pandas_datareader: None
bs4              : 4.9.0
bottleneck       : 1.3.2
fastparquet      : 0.3.3
gcsfs            : None
matplotlib       : 3.2.1
numexpr          : 2.7.1
odfpy            : None
openpyxl         : 3.0.3
pandas_gbq       : None
pyarrow          : 0.17.0
pytables         : None
pyxlsb           : None
s3fs             : 0.4.2
scipy            : 1.4.1
sqlalchemy       : 1.3.16
tables           : 3.6.1
tabulate         : 0.8.7
xarray           : 0.15.1
xlrd             : 1.2.0
xlwt             : 1.3.0
numba            : 0.48.0

CloseChoice · 2020-05-03T18:59:21Z

The problem is that the axis is used instead of indices here. We should interpolate on the given axis but perform the check only on the indices.

Edit: This is a regression.

simonjayhawkins · 2020-05-12T12:31:23Z

Hmm, not sure about this. The axis seems reversed here, see #29146. Also, ffill (and bfill) are not documented values for method; see https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.interpolate.html

Maybe those issues should be resolved before fixing this.
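
A small check of what "reversed" could mean for the output shown in the report. This is a sketch under assumptions, not taken from #29146: the frame is rebuilt from the printed values, and the documented `DataFrame.ffill` is used as the reference for each fill direction. It compares the table the report shows for `interpolate(method='ffill', axis=1)` against a forward fill along each axis.

```python
import numpy as np
import pandas as pd

# Assumed construction of the frame printed in the report.
df = pd.DataFrame(
    {
        "A": [1.0, 2.0, 3.0, 4.0, np.nan, 5.0],
        "B": [2.0, 4.0, 6.0, np.nan, 8.0, 10.0],
        "C": [3.0, 6.0, 9.0, np.nan, np.nan, 30.0],
    }
)

# Table the report shows for interpolate(method='ffill', axis=1).
reported = pd.DataFrame(
    {
        "A": [1.0, 2.0, 3.0, 4.0, 4.0, 5.0],
        "B": [2.0, 4.0, 6.0, 6.0, 8.0, 10.0],
        "C": [3.0, 6.0, 9.0, 9.0, 9.0, 30.0],
    }
)

down_rows = df.ffill(axis=0)    # each NaN takes the value from the row above
across_cols = df.ffill(axis=1)  # each NaN takes the value from the column to its left

print(down_rows.equals(reported))    # True
print(across_cols.equals(reported))  # False
```

The reported table matches the `axis=0` fill, which is consistent with the axis being swapped for the fill methods.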

CloseChoice · 2020-05-13T22:17:39Z

take

CloseChoice added the Bug and Needs Triage labels May 3, 2020

CloseChoice added a commit to CloseChoice/pandas that referenced this issue May 3, 2020

fix bfill, ffill and pad when calling with df.interpolate with column…

012806e

… dtype string (pandas-dev#33956)

CloseChoice mentioned this issue May 3, 2020

fix bfill, ffill and pad when calling with df.interpolate with column… #33959

Merged

7 tasks

jreback added this to the 1.1 milestone May 12, 2020

jreback added the Missing-data label and removed the Needs Triage label May 12, 2020

github-actions bot assigned CloseChoice May 13, 2020

jreback closed this as completed in #33959 Jun 2, 2020
