Code Sample, a copy-pastable example if possible
On master:
In [1]: import numpy as np; import pandas as pd; pd.__version__
Out[1]: '0.25.0.dev0+783.g2b9b58dad'

In [2]: ts_list = [pd.Timestamp('2018-01-01'), np.nan]
   ...: tstz_list = [pd.Timestamp('2018-01-01', tz='UTC'), np.nan]

In [3]: pd.core.indexes.base.ensure_index(ts_list)
Out[3]: DatetimeIndex(['2018-01-01', 'NaT'], dtype='datetime64[ns]', freq=None)

In [4]: pd.core.indexes.base.ensure_index(tstz_list)
Out[4]: Index([2018-01-01 00:00:00+00:00, nan], dtype='object')
Problem description
Out[4] does not coerce np.nan to pd.NaT and results in an Index with object dtype instead of a DatetimeIndex.
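For comparison, the coercion that Out[4] lacks can be seen by building the index explicitly. This is a minimal sketch that assumes the public pd.DatetimeIndex constructor is given the same list as in the session above; it only illustrates the kind of result being asked for and is not a description of what ensure_index does internally.

```python
import numpy as np
import pandas as pd

# Same tz-aware input as tstz_list in the session above.
tstz_list = [pd.Timestamp('2018-01-01', tz='UTC'), np.nan]

# Explicit construction coerces np.nan to pd.NaT and keeps the timezone.
explicit = pd.DatetimeIndex(tstz_list)

print(type(explicit).__name__)   # class of the resulting index
print(explicit.tz)               # timezone carried by the index
print(explicit.isna().tolist())  # which positions are missing
```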
This causes downstream issues with IntervalIndex/IntervalArray, as a valid IntervalIndex/IntervalArray may fail to round-trip from its equivalent list/np.array representation:
Under the hood the list/np.array is being converted to left/right components, which are then passed to ensure_index, resulting in an Index with object dtype, hence the error message.
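The decomposition described above can be pictured roughly as follows. This is a simplified sketch, not the pandas internals: split_bounds is a hypothetical helper written for illustration, and pd.Index stands in for ensure_index on the assumption that both infer the dtype of a plain list in the same way.

```python
import numpy as np
import pandas as pd

start = pd.Timestamp('2018-01-01', tz='UTC')
end = pd.Timestamp('2018-01-02', tz='UTC')

# List representation of an interval sequence with one missing entry.
intervals = [pd.Interval(start, end), np.nan]


def split_bounds(items):
    """Hypothetical helper: split intervals into left/right endpoint lists.

    Missing entries become np.nan in both lists.
    """
    left = [iv.left if isinstance(iv, pd.Interval) else np.nan for iv in items]
    right = [iv.right if isinstance(iv, pd.Interval) else np.nan for iv in items]
    return left, right


left, right = split_bounds(intervals)

# Each endpoint list now has the same shape as tstz_list: a tz-aware
# Timestamp followed by np.nan, so index inference sees the same input.
left_index = pd.Index(left)
right_index = pd.Index(right)
print(left_index.dtype, right_index.dtype)
```

The dtype printed depends on the pandas version; on the build reported here, the object dtype from Out[4] is what would be expected.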
Note that the equivalent roundtrip without a tz works fine, as expected based on the inconsistency noted in the ensure_index example:
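The two round trips can be compared with a small check like the one below. It is a sketch under assumptions: roundtrips and with_missing are hypothetical helpers, and the indexes are built with pd.IntervalIndex.from_arrays purely for illustration. On the build reported here the tz-aware case is the one expected to fail; other pandas versions may behave differently.

```python
import pandas as pd


def roundtrips(index):
    """Return True if rebuilding `index` from its list form gives an equal index."""
    try:
        rebuilt = pd.IntervalIndex(list(index))
    except (TypeError, ValueError) as err:
        # Report the failure instead of propagating it.
        print(f"round trip failed: {err}")
        return False
    return bool(rebuilt.equals(index))


def with_missing(tz=None):
    """Build a two-element IntervalIndex whose second entry is missing."""
    left = pd.DatetimeIndex([pd.Timestamp('2018-01-01', tz=tz), pd.NaT])
    right = pd.DatetimeIndex([pd.Timestamp('2018-01-02', tz=tz), pd.NaT])
    return pd.IntervalIndex.from_arrays(left, right)


naive_ok = roundtrips(with_missing())          # tz-naive endpoints
aware_ok = roundtrips(with_missing(tz='UTC'))  # tz-aware endpoints
print(naive_ok, aware_ok)
```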
Expected Output
I'd expect Out[4] to be coerced to a DatetimeIndex with pd.NaT and the appropriate tz:
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2b9b58d
python : 3.7.3.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.14-041914-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.0.dev0+783.g2b9b58dad
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 40.8.0
Cython : 0.29.10
pytest : 4.6.2
hypothesis : 4.23.6
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.3
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.5.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : 0.3.0
gcsfs : None
matplotlib : 3.1.0
numexpr : 2.6.9
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.11.1
pytables : None
s3fs : 0.2.1
scipy : 1.2.1
sqlalchemy : 1.3.4
tables : 3.5.2
xarray : 0.12.1
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8