Confusing exception message when re-sampling index containing NaT #16356
Labels: Datetime (Datetime data dtype), Error Reporting (Incorrect or improved errors from pandas), Resample (resample method)
Example
Problem description
If a DataFrame's index contains `NaT`, resampling it raises `ValueError: Passed item and index have different timezone`. This is confusing: while technically correct, it leads the user to investigate timezones rather than the presence of missing times.

The error message led me down a rabbit hole of trying to understand how a wrong timezone had crept in, but the timezones were consistent; the `NaT` was the problem. I was not expecting my (large, >500k) index to contain `NaT`. There was a single `NaT` entry buried right in the middle of it, and I had to bisect the dataset to figure out why resampling wasn't working.

Searching for the error message turns up very few hits (currently 3, mostly source code).
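
For anyone hitting the same error, here is a minimal sketch of how the `NaT` could be located directly rather than by bisecting, and then dropped before resampling. The data, the column name `value`, and the 60-minute frequency are made up for illustration; this is a workaround under those assumptions, not the code from the original report.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a tz-aware index with one NaT buried in the middle.
idx = pd.DatetimeIndex(
    ["2017-05-01 00:00", "2017-05-01 00:20", None, "2017-05-01 01:10"],
    tz="UTC",
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=idx)

# Find missing timestamps directly instead of bisecting the dataset.
nat_mask = df.index.isna()
nat_positions = np.flatnonzero(nat_mask)  # integer positions of NaT entries

# Drop the rows with a missing timestamp, then resample the rest.
clean = df[~nat_mask]
hourly = clean.resample("60min").sum()
```

Checking `df.index.isna().any()` before calling `resample` is a cheap guard on a large index, and `nat_positions` shows exactly which rows to inspect.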
Expected Output
Either resampling should work and omit the value, or the error message should make it clear that the problem is a `NaT` value, not a timezone issue.
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-1013-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.2
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: 0.0.9
pandas_gbq: None
pandas_datareader: None