BUG: Assorted DatetimeIndex bugfixes #24157
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -380,6 +380,7 @@ Backwards incompatible API changes | |
- ``max_rows`` and ``max_cols`` parameters removed from :class:`HTMLFormatter` since truncation is handled by :class:`DataFrameFormatter` (:issue:`23818`) | ||
- :meth:`read_csv` will now raise a ``ValueError`` if a column with missing values is declared as having dtype ``bool`` (:issue:`20591`) | ||
- The column order of the resultant :class:`DataFrame` from :meth:`MultiIndex.to_frame` is now guaranteed to match the :attr:`MultiIndex.names` order. (:issue:`22420`) | ||
- :func:`pd.offsets.generate_range` argument ``time_rule`` has been removed; use ``offset`` instead (:issue:`24157`) | ||
|
||
Percentage change on groupby changes | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
@@ -1133,7 +1134,6 @@ Deprecations | |
- In :meth:`Series.where` with Categorical data, providing an ``other`` that is not present in the categories is deprecated. Convert the categorical to a different dtype or add the ``other`` to the categories first (:issue:`24077`). | ||
- :meth:`Series.clip_lower`, :meth:`Series.clip_upper`, :meth:`DataFrame.clip_lower` and :meth:`DataFrame.clip_upper` are deprecated and will be removed in a future version. Use ``Series.clip(lower=threshold)``, ``Series.clip(upper=threshold)`` and the equivalent ``DataFrame`` methods (:issue:`24203`) | ||
|
||
|
||
.. _whatsnew_0240.deprecations.datetimelike_int_ops: | ||
|
||
Integer Addition/Subtraction with Datetime-like Classes Is Deprecated | ||
|
@@ -1310,6 +1310,9 @@ Datetimelike | |
- Bug in :class:`Index` where calling ``np.array(dtindex, dtype=object)`` on a timezone-naive :class:`DatetimeIndex` would return an array of ``datetime`` objects instead of :class:`Timestamp` objects, potentially losing nanosecond portions of the timestamps (:issue:`23524`) | ||
- Bug in :class:`Categorical.__setitem__` not allowing setting with another ``Categorical`` when both are unordered and have the same categories, but in a different order (:issue:`24142`) | ||
- Bug in :func:`date_range` where using dates with millisecond resolution or higher could return incorrect values or the wrong number of values in the index (:issue:`24110`) | ||
- Bug in :class:`DatetimeIndex` where constructing a :class:`DatetimeIndex` from a :class:`Categorical` or :class:`CategoricalIndex` would incorrectly drop timezone information (:issue:`18664`) | ||
Review comment: Where is the note for #11587?
Reply: Just pushed; it should be just below this line.
||
- Bug in :class:`DatetimeIndex` and :class:`TimedeltaIndex` where indexing with ``Ellipsis`` would incorrectly lose the index's ``freq`` attribute (:issue:`21282`) | ||
- Clarified error message produced when passing an incorrect ``freq`` argument to :class:`DatetimeIndex` with ``NaT`` as the first entry in the passed data (:issue:`11587`) | ||
|
||
Timedelta | ||
^^^^^^^^^ | ||
|
@@ -1422,6 +1425,7 @@ Indexing | |
- Bug in :func:`Index.union` and :func:`Index.intersection` where name of the ``Index`` of the result was not computed correctly for certain cases (:issue:`9943`, :issue:`9862`) | ||
- Bug in :class:`Index` slicing with boolean :class:`Index` may raise ``TypeError`` (:issue:`22533`) | ||
- Bug in ``PeriodArray.__setitem__`` when accepting slice and list-like value (:issue:`23978`) | ||
- Bug in :class:`DatetimeIndex`, :class:`TimedeltaIndex` where indexing with ``Ellipsis`` would lose their ``freq`` attribute (:issue:`21282`) | ||
|
||
Missing | ||
^^^^^^^ | ||
|
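Two of the release-note entries above can be checked by hand. The following is a minimal sketch, assuming a pandas release that already includes these fixes (0.24 or later); the variable names are illustrative and not taken from the PR.

```python
import pandas as pd

# GH#21282: indexing with Ellipsis should keep the index's freq.
dti = pd.date_range("2016-01-01", periods=4, freq="D")
tdi = pd.timedelta_range("1 day", periods=4, freq="D")
assert dti[...].freq == dti.freq
assert tdi[...].freq == tdi.freq

# GH#18664: DatetimeIndex -> Categorical -> DatetimeIndex should keep the tz.
aware = pd.DatetimeIndex(
    [pd.NaT, "2015-01-01 00:00:00", "1999-04-06 15:14:13", "2015-01-01 00:00:00"],
    tz="US/Eastern",
)
roundtrip = pd.DatetimeIndex(pd.Categorical(aware))
assert str(roundtrip.tz) == "US/Eastern"
assert roundtrip.equals(aware)
```

Before the fix, the second round trip is described as dropping the timezone, so the `tz` assertion is the one that would have failed.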
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,9 +14,9 @@ | |
from pandas.util._decorators import Appender | ||
|
||
from pandas.core.dtypes.common import ( | ||
_INT64_DTYPE, _NS_DTYPE, is_datetime64_dtype, is_datetime64tz_dtype, | ||
is_extension_type, is_float_dtype, is_int64_dtype, is_object_dtype, | ||
is_period_dtype, is_string_dtype, is_timedelta64_dtype) | ||
_INT64_DTYPE, _NS_DTYPE, is_categorical_dtype, is_datetime64_dtype, | ||
is_datetime64tz_dtype, is_extension_type, is_float_dtype, is_int64_dtype, | ||
is_object_dtype, is_period_dtype, is_string_dtype, is_timedelta64_dtype) | ||
from pandas.core.dtypes.dtypes import DatetimeTZDtype | ||
from pandas.core.dtypes.generic import ABCIndexClass, ABCSeries | ||
from pandas.core.dtypes.missing import isna | ||
|
@@ -264,6 +264,8 @@ def _generate_range(cls, start, end, periods, freq, tz=None, | |
if closed is not None: | ||
raise ValueError("Closed has to be None if not both of start" | ||
"and end are defined") | ||
if start is NaT or end is NaT: | ||
raise ValueError("Neither `start` nor `end` can be NaT") | ||
|
||
left_closed, right_closed = dtl.validate_endpoints(closed) | ||
|
||
|
@@ -1652,6 +1654,13 @@ def maybe_convert_dtype(data, copy): | |
raise TypeError("Passing PeriodDtype data is invalid. " | ||
"Use `data.to_timestamp()` instead") | ||
|
||
elif is_categorical_dtype(data): | ||
# GH#18664 preserve tz in going DTI->Categorical->DTI | ||
# TODO: cases where we need to do another pass through this func, | ||
Review comment: Why is the TODO case here? What is an example? Do you have an xfailed test?
Reply: An example is Categorical(TimedeltaIndex), where we should do another pass through maybe_convert_dtype in order to issue the FutureWarning. I'm assuming there are other corner cases here that will need to be caught/tested.
||
# e.g. the categories are timedelta64s | ||
data = data.categories.take(data.codes, fill_value=NaT) | ||
copy = False | ||
|
||
elif is_extension_type(data) and not is_datetime64tz_dtype(data): | ||
# Includes categorical | ||
# TODO: We have no tests for these | ||
|
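The two hunks above can be illustrated outside the library. This is a rough sketch, assuming a pandas release that includes this change. It rebuilds the datetime values from a ``Categorical`` the same way the new branch does (``categories.take(codes, fill_value=NaT)``) and then triggers the new ``NaT`` endpoint check through ``date_range``. The names ``cat``, ``rebuilt`` and ``nat_error`` are illustrative only.

```python
import pandas as pd

dti = pd.DatetimeIndex(
    [pd.NaT, "2015-01-01 00:00:00", "1999-04-06 15:14:13", "2015-01-01 00:00:00"],
    tz="US/Eastern",
)
cat = pd.Categorical(dti)

# The categories keep the tz-aware dtype; a code of -1 marks a missing
# entry, and take() fills those positions with NaT.
rebuilt = cat.categories.take(cat.codes, fill_value=pd.NaT)
assert rebuilt.equals(dti)

# The range generator now rejects NaT endpoints with a ValueError.
try:
    pd.date_range(start=pd.NaT, periods=3, freq="D")
except ValueError as err:
    nat_error = str(err)
else:
    nat_error = None
```

Because the categories are themselves a tz-aware index, taking from them preserves the timezone without any extra handling.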
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,12 +14,42 @@ | |
from pandas import ( | ||
DatetimeIndex, Index, Timestamp, date_range, datetime, offsets, | ||
to_datetime) | ||
from pandas.core.arrays import period_array | ||
from pandas.core.arrays import ( | ||
DatetimeArrayMixin as DatetimeArray, period_array) | ||
import pandas.util.testing as tm | ||
|
||
|
||
class TestDatetimeIndex(object): | ||
|
||
@pytest.mark.parametrize('dt_cls', [DatetimeIndex, DatetimeArray]) | ||
def test_freq_validation_with_nat(self, dt_cls): | ||
# GH#11587 make sure we get a useful error message when generate_range | ||
# raises | ||
msg = ("Inferred frequency None from passed values does not conform " | ||
"to passed frequency D") | ||
with pytest.raises(ValueError, match=msg): | ||
dt_cls([pd.NaT, pd.Timestamp('2011-01-01')], freq='D') | ||
with pytest.raises(ValueError, match=msg): | ||
dt_cls([pd.NaT, pd.Timestamp('2011-01-01').value], | ||
freq='D') | ||
|
||
def test_categorical_preserves_tz(self): | ||
# GH#18664 retain tz when going DTI-->Categorical-->DTI | ||
# TODO: parametrize over DatetimeIndex/DatetimeArray | ||
# once CategoricalIndex(DTA) works | ||
|
||
dti = pd.DatetimeIndex( | ||
[pd.NaT, '2015-01-01', '1999-04-06 15:14:13', '2015-01-01'], | ||
tz='US/Eastern') | ||
|
||
ci = pd.CategoricalIndex(dti) | ||
carr = pd.Categorical(dti) | ||
cser = pd.Series(ci) | ||
|
||
Review comment: Can you parametrize?
Reply: Will take a look.
Reply: Not without adding casting code that makes the verbosity a wash. Once Categorical(DatetimeArray) works (it currently raises because DatetimeArray doesn't have _constructor), this test can be parametrized over DatetimeIndex/DatetimeArray.
||
for obj in [ci, carr, cser]: | ||
result = pd.DatetimeIndex(obj) | ||
tm.assert_index_equal(result, dti) | ||
|
||
def test_dti_with_period_data_raises(self): | ||
# GH#23675 | ||
data = pd.PeriodIndex(['2016Q1', '2016Q2'], freq='Q') | ||
|
Review comment: This probably won't render, as it's not in api.rst, but OK.