Skip to content

REF: make DatetimeIndex._simple_new actually simple #32282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Mar 14, 2020
Merged
Show file tree
Hide file tree
Changes from 34 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
6389872
REF: require DTA in DTI._simple_new
jbrockmendel Jan 17, 2020
2329008
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 17, 2020
91a0937
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 18, 2020
8924c55
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 18, 2020
b55e8b1
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 20, 2020
d04f3b5
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 21, 2020
3a2c228
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 21, 2020
b6ffc2a
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 23, 2020
5b0bac8
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 25, 2020
7405488
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 26, 2020
13eb9a7
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 26, 2020
aadedfd
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 29, 2020
9f220f5
stricter
jbrockmendel Jan 29, 2020
1451163
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Jan 29, 2020
e8ad19a
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 1, 2020
e8483f8
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 4, 2020
f82f389
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 7, 2020
5f22827
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 9, 2020
9a5360a
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 9, 2020
1d9cea3
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 12, 2020
1acb5a3
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 22, 2020
d92ff5f
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 25, 2020
34b4dec
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 25, 2020
242695c
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
50d1c6c
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
8a90f45
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
0e7ffe1
blackify
jbrockmendel Feb 26, 2020
5cba722
remove attrs
jbrockmendel Feb 26, 2020
6efd065
CLN: DTI/TDI _simple_new
jbrockmendel Feb 26, 2020
893d0aa
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 26, 2020
d659a23
simplify apply_index treatment
jbrockmendel Feb 27, 2020
864b604
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 27, 2020
743a003
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Feb 27, 2020
82c5645
remove outdated docstring
jbrockmendel Feb 27, 2020
e91814d
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 11, 2020
10a2ff2
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 12, 2020
c8c3ebb
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 12, 2020
711987e
Merge branch 'master' of https://github.com/pandas-dev/pandas into si…
jbrockmendel Mar 13, 2020
e74ebc0
Comments, collect offsets wrapping in apply_index_wrap
jbrockmendel Mar 13, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 1 addition & 3 deletions pandas/core/indexes/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -3308,13 +3308,11 @@ def reindex(self, target, method=None, level=None, limit=None, tolerance=None):
target = _ensure_has_len(target) # target may be an iterator

if not isinstance(target, Index) and len(target) == 0:
attrs = self._get_attributes_dict()
attrs.pop("freq", None) # don't preserve freq
if isinstance(self, ABCRangeIndex):
values = range(0)
else:
values = self._data[:0] # appropriately-dtyped empty array
target = self._simple_new(values, **attrs)
target = self._simple_new(values, name=self.name)
else:
target = ensure_index(target)

Expand Down
22 changes: 9 additions & 13 deletions pandas/core/indexes/datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -648,21 +648,11 @@ def _shallow_copy(self, values=None, name: Label = lib.no_default):
if values is None:
values = self._data

if isinstance(values, type(self)):
values = values._data
if isinstance(values, np.ndarray):
# TODO: We would rather not get here
values = type(self._data)(values, dtype=self.dtype)

attributes = self._get_attributes_dict()

if self.freq is not None:
if isinstance(values, (DatetimeArray, TimedeltaArray)):
if values.freq is None:
del attributes["freq"]

attributes["name"] = name
return type(self)._simple_new(values, **attributes)
return type(self)._simple_new(values, name=name)

# --------------------------------------------------------------------
# Set Operation Methods
Expand Down Expand Up @@ -804,7 +794,10 @@ def _fast_union(self, other, sort=None):
loc = right.searchsorted(left_start, side="left")
right_chunk = right.values[:loc]
dates = concat_compat((left.values, right_chunk))
return self._shallow_copy(dates)
result = self._shallow_copy(dates)
result._set_freq("infer")
# TODO: can we infer that it has self.freq?
return result
else:
left, right = other, self

Expand All @@ -816,7 +809,10 @@ def _fast_union(self, other, sort=None):
loc = right.searchsorted(left_end, side="right")
right_chunk = right.values[loc:]
dates = concat_compat((left.values, right_chunk))
return self._shallow_copy(dates)
result = self._shallow_copy(dates)
result._set_freq("infer")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could change _set_freq to return self to make this idiom more clear (also _set_freq should be marked in the doc-string as a mutating function)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding off on this since _set_freq needs a bigger rethink, xref #31218.

All other comments addressed, i think

# TODO: can we infer that it has self.freq?
return result
else:
return left

Expand Down
45 changes: 16 additions & 29 deletions pandas/core/indexes/datetimes.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,17 +7,13 @@

from pandas._libs import NaT, Period, Timestamp, index as libindex, lib, tslib as libts
from pandas._libs.tslibs import fields, parsing, timezones
from pandas._typing import Label
from pandas.util._decorators import cache_readonly

from pandas.core.dtypes.common import _NS_DTYPE, is_float, is_integer, is_scalar
from pandas.core.dtypes.dtypes import DatetimeTZDtype
from pandas.core.dtypes.missing import is_valid_nat_for_dtype

from pandas.core.arrays.datetimes import (
DatetimeArray,
tz_to_dtype,
validate_tz_from_dtype,
)
from pandas.core.arrays.datetimes import DatetimeArray, tz_to_dtype
import pandas.core.common as com
from pandas.core.indexes.base import Index, InvalidIndexError, maybe_extract_name
from pandas.core.indexes.datetimelike import DatetimeTimedeltaMixin
Expand All @@ -36,7 +32,16 @@ def _new_DatetimeIndex(cls, d):
if "data" in d and not isinstance(d["data"], DatetimeIndex):
# Avoid need to verify integrity by calling simple_new directly
data = d.pop("data")
result = cls._simple_new(data, **d)
if not isinstance(data, DatetimeArray):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is all this about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backward compat for old pickles

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this nicer. it is very hard to grok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls comment the cases

tz = d.pop("tz")
freq = d.pop("freq")
dta = DatetimeArray._simple_new(data, dtype=tz_to_dtype(tz), freq=freq)
else:
dta = data
for key in ["tz", "freq"]:
if key in d:
assert d.pop(key) == getattr(dta, key)
result = cls._simple_new(dta, **d)
else:
with warnings.catch_warnings():
# TODO: If we knew what was going in to **d, we might be able to
Expand Down Expand Up @@ -235,33 +240,15 @@ def __new__(
return subarr

@classmethod
def _simple_new(cls, values, name=None, freq=None, tz=None, dtype=None):
"""
We require the we have a dtype compat for the values
if we are passed a non-dtype compat, then coerce using the constructor
"""
if isinstance(values, DatetimeArray):
if tz:
tz = validate_tz_from_dtype(dtype, tz)
dtype = DatetimeTZDtype(tz=tz)
elif dtype is None:
dtype = _NS_DTYPE

values = DatetimeArray(values, freq=freq, dtype=dtype)
tz = values.tz
freq = values.freq
values = values._data

dtype = tz_to_dtype(tz)
dtarr = DatetimeArray._simple_new(values, freq=freq, dtype=dtype)
assert isinstance(dtarr, DatetimeArray)
def _simple_new(cls, values: DatetimeArray, name: Label = None):
assert isinstance(values, DatetimeArray), type(values)

result = object.__new__(cls)
result._data = dtarr
result._data = values
result.name = name
result._no_setting_name = False
# For groupby perf. See note in indexes/base about _index_data
result._index_data = dtarr._data
result._index_data = values._data
result._reset_identity()
return result

Expand Down
9 changes: 3 additions & 6 deletions pandas/core/indexes/timedeltas.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
""" implement the TimedeltaIndex """

from pandas._libs import NaT, Timedelta, index as libindex
from pandas._typing import Label
from pandas.util._decorators import Appender

from pandas.core.dtypes.common import (
Expand Down Expand Up @@ -154,7 +155,7 @@ def __new__(
if isinstance(data, TimedeltaArray) and freq is None:
if copy:
data = data.copy()
return cls._simple_new(data, name=name, freq=freq)
return cls._simple_new(data, name=name)

if isinstance(data, TimedeltaIndex) and freq is None and name is None:
if copy:
Expand All @@ -170,12 +171,8 @@ def __new__(
return cls._simple_new(tdarr, name=name)

@classmethod
def _simple_new(cls, values, name=None, freq=None, dtype=_TD_DTYPE):
# `dtype` is passed by _shallow_copy in corner cases, should always
# be timedelta64[ns] if present
assert dtype == _TD_DTYPE, dtype
def _simple_new(cls, values: TimedeltaArray, name: Label = None):
assert isinstance(values, TimedeltaArray)
assert freq is None or values.freq == freq

result = object.__new__(cls)
result._data = values
Expand Down
4 changes: 2 additions & 2 deletions pandas/tests/arrays/test_datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -65,8 +65,8 @@ def test_compare_len1_raises(self):
# to the case where one has length-1, which numpy would broadcast
data = np.arange(10, dtype="i8") * 24 * 3600 * 10 ** 9

idx = self.array_cls._simple_new(data, freq="D")
arr = self.index_cls(idx)
arr = self.array_cls._simple_new(data, freq="D")
idx = self.index_cls(arr)

with pytest.raises(ValueError, match="Lengths must match"):
arr == arr[:1]
Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/indexes/datetimes/test_ops.py
Original file line number Diff line number Diff line change
Expand Up @@ -363,7 +363,7 @@ def test_equals(self):
assert not idx.equals(pd.Series(idx2))

# same internal, different tz
idx3 = pd.DatetimeIndex._simple_new(idx.asi8, tz="US/Pacific")
idx3 = pd.DatetimeIndex(idx.asi8, tz="US/Pacific")
tm.assert_numpy_array_equal(idx.asi8, idx3.asi8)
assert not idx.equals(idx3)
assert not idx.equals(idx3.copy())
Expand Down
24 changes: 17 additions & 7 deletions pandas/tseries/offsets.py
Original file line number Diff line number Diff line change
Expand Up @@ -1155,7 +1155,12 @@ def apply_index(self, i):
shifted = liboffsets.shift_months(i.asi8, self.n, self._day_opt)
# TODO: going through __new__ raises on call to _validate_frequency;
# are we passing incorrect freq?
return type(i)._simple_new(shifted, freq=i.freq, dtype=i.dtype)
dta = i
if not isinstance(i._data, np.ndarray):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WAT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compat for getting either a DTA or a DTI, we already do this in a couple places. I think we can do it just once by putting it in the apply_index_wraps decorator

# DTA
dta = i._data

return type(dta)._simple_new(shifted, freq=dta.freq, dtype=dta.dtype)


class MonthEnd(MonthOffset):
Expand Down Expand Up @@ -1885,9 +1890,12 @@ def apply_index(self, dtindex):
)
# TODO: going through __new__ raises on call to _validate_frequency;
# are we passing incorrect freq?
return type(dtindex)._simple_new(
shifted, freq=dtindex.freq, dtype=dtindex.dtype
)
dta = dtindex
if not isinstance(dtindex._data, np.ndarray):
# DTA
dta = dtindex._data

return type(dta)._simple_new(shifted, freq=dta.freq, dtype=dta.dtype)


class BQuarterEnd(QuarterOffset):
Expand Down Expand Up @@ -1971,9 +1979,11 @@ def apply_index(self, dtindex):
)
# TODO: going through __new__ raises on call to _validate_frequency;
# are we passing incorrect freq?
return type(dtindex)._simple_new(
shifted, freq=dtindex.freq, dtype=dtindex.dtype
)
dta = dtindex
if not isinstance(dtindex._data, np.ndarray):
# DTA
dta = dtindex._data
return type(dta)._simple_new(shifted, freq=dta.freq, dtype=dta.dtype)

def is_on_offset(self, dt: datetime) -> bool:
if self.normalize and not _is_normalized(dt):
Expand Down