Skip to content

BUG: args offset in Timestamp.__new__ causes bugs #52344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
9371f22
BUG: reformulate the predicate
AlexKirko Mar 26, 2023
b7b37ad
TST: add the test
AlexKirko Mar 26, 2023
e97541d
REFACT: mv from-positional inside the if
AlexKirko Mar 26, 2023
b3282a5
DOC: add whatsnew
AlexKirko Mar 26, 2023
6ee4770
attempt to fix the core issue instead
AlexKirko Mar 27, 2023
e0699ba
BUG: mv the bugfix to decorator to allow for mixed arguments
AlexKirko Mar 28, 2023
ebd419e
TST: update the tests now that behavior is consistent
AlexKirko Mar 28, 2023
870dc40
TST: add the argument mix test
AlexKirko Mar 28, 2023
635e0cc
CLN: clean up the function a bit
AlexKirko Mar 28, 2023
c942b70
Merge branch 'main' into gh-51117-positional-fold-bug
AlexKirko Mar 29, 2023
7e369a9
Merge branch 'main' into gh-52343-timestamp-from-positional
AlexKirko Apr 1, 2023
09fcdec
BUG: initial implementation
AlexKirko Apr 1, 2023
f3c7731
basic implementation that passes tests
AlexKirko Apr 1, 2023
831597b
PERF: reasonable performance improvements
AlexKirko Apr 1, 2023
77ae64a
PERF: more reasonable optimizations
AlexKirko Apr 1, 2023
1d77bb5
PERF: more performance optimizations
AlexKirko Apr 2, 2023
e292b47
TST: add kludge to pass the legacy read_pickle test
AlexKirko Apr 2, 2023
5da6b8d
CLN: return mistakenly deleted newlines
AlexKirko Apr 3, 2023
9aa0187
DOC: rm kwarg/arg mixing warning
AlexKirko Apr 4, 2023
6257147
PERF: shortcut the positional arguments check for a 10% performance gain
AlexKirko Apr 4, 2023
162bf90
DOC: cover the second bug in whatsnew
AlexKirko Apr 5, 2023
a4682bb
Merge branch 'main' into gh-52343-timestamp-from-positional
AlexKirko Apr 5, 2023
5ddc6fb
CLN: remove unnecessary comments
AlexKirko Apr 5, 2023
9b821e5
ENH: prohibit invalid kwargs being passed with a date string
AlexKirko Apr 8, 2023
0382fcd
ENH: add check for arg kwarg conflict for the positional case
AlexKirko Apr 8, 2023
f355aaf
Merge branch 'main' into gh-52343-timestamp-from-positional
AlexKirko Apr 8, 2023
2811a1c
accept both changes
AlexKirko May 11, 2023
6fd531c
accept both changes to whatsnew
AlexKirko Jun 25, 2023
837c625
accept both versions
AlexKirko Aug 11, 2023
b4c389e
Merge branch 'main' into gh-52343-timestamp-from-positional
AlexKirko Aug 30, 2023
f3b53ad
Merge branch 'main' into gh-52343-timestamp-from-positional
AlexKirko Sep 7, 2023
acefe47
CLN: black fixes
AlexKirko Sep 7, 2023
80b2b69
Merge branch 'main' into gh-52343-timestamp-from-positional
AlexKirko Sep 12, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions asv_bench/benchmarks/tslibs/timestamp.py
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,9 @@ def time_from_datetime_aware(self):
def time_from_pd_timestamp(self):
Timestamp(self.ts)

def time_from_positional(self):
Timestamp(2020, 1, 1, 0, 0, 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An additional benchmark for future regressions might be helpful.



class TimestampProperties:
params = [_tzs]
Expand Down
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -165,6 +165,7 @@ Datetimelike
- Bug in :meth:`Timestamp.round` with values close to the implementation bounds returning incorrect results instead of raising ``OutOfBoundsDatetime`` (:issue:`51494`)
- :meth:`arrays.DatetimeArray.map` can now take a ``na_action`` argument. :meth:`DatetimeIndex.map` with ``na_action="ignore"`` now works as expected. (:issue:`51644`)
- Bug in :meth:`arrays.DatetimeArray.map` and :meth:`DatetimeIndex.map`, where the supplied callable operated array-wise instead of element-wise (:issue:`51977`)
- Bug in :class:`Timestamp` raising an error when passing fold when constructing from positional arguments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put "fold" in quotes/ticks

-

Timedelta
Expand Down
133 changes: 75 additions & 58 deletions pandas/_libs/tslibs/timestamps.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -1520,31 +1520,14 @@ class Timestamp(_Timestamp):
"""
return cls(datetime.combine(date, time))

def __new__(
cls,
object ts_input=_no_input,
year=None,
month=None,
day=None,
hour=None,
minute=None,
second=None,
microsecond=None,
tzinfo_type tzinfo=None,
*,
nanosecond=None,
tz=None,
unit=None,
fold=None,
):
def __new__(cls, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this get rendered in the docs somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexKirko any idea about this?

# The parameter list folds together legacy parameter names (the first
# four) and positional and keyword parameter names from pydatetime.
#
# There are three calling forms:
#
# - In the legacy form, the first parameter, ts_input, is required
# and may be datetime-like, str, int, or float. The second
# parameter, offset, is optional and may be str or DateOffset.
# and may be datetime-like, str, int, or float.
#
# - ints in the first, second, and third arguments indicate
# pydatetime positional arguments. Only the first 8 arguments
Expand All @@ -1553,17 +1536,81 @@ class Timestamp(_Timestamp):
# check that the second argument is an int.
#
# - Nones for the first four (legacy) arguments indicate pydatetime
# keyword arguments. year, month, and day are required. As a
# shortcut, we just check that the first argument was not passed.
#
# Mixing pydatetime positional and keyword arguments is forbidden!
# keyword arguments. year, month, and day are required. We just
# check that no positional arguments were passed.

cdef:
object ts_input=_no_input
_TSObject ts
tzinfo_type tzobj
tzinfo_type tzinfo, tzobj

_date_attributes = [year, month, day, hour, minute, second,
microsecond, nanosecond]
args_len = len(args)

# GH 30543 if pd.Timestamp already passed, return it
# check that only ts_input is passed
# checking verbosely, because cython doesn't optimize
# list comprehensions (as of cython 0.29.x)
if args_len == 1 and len(kwargs) == 0 and isinstance(args[0], _Timestamp):
return args[0]

# Check that kwargs weren't passed as args
# if args_len > 9:
# raise ValueError(invalid_args_msg)

# Unpack the arguments
# Building from ts_input
if args_len == 1:
if kwargs:
if ("year" in kwargs or "month" in kwargs or "day" in kwargs or
"hour" in kwargs or "minute" in kwargs or "second" in kwargs or
"microsecond" in kwargs):
raise ValueError("Cannot pass a date attribute keyword argument")
if isinstance(args[0], str) and "nanosecond" in kwargs:
raise ValueError(
"Cannot pass a date attribute keyword "
"argument when passing a date string; 'tz' is keyword-only"
)

ts_input = args[0]
tzinfo = kwargs.get("tzinfo")
# Building from positional arguments
elif 9 > args_len > 2 and isinstance(args[1], int):
args = args + (None,) * (8 - args_len)
year, month, day, hour, minute, second, microsecond, tzinfo = args

if kwargs:
# Positional or keyword arguments
hour = kwargs.get("hour", hour)
minute = kwargs.get("minute", minute)
second = kwargs.get("second", second)
microsecond = kwargs.get("microsecond", microsecond)
tzinfo = kwargs.get("tzinfo", tzinfo)
# Keywords only
elif args_len == 0:
ts_input = kwargs.get("ts_input", _no_input)
year = kwargs.get("year")
month = kwargs.get("month")
day = kwargs.get("day")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there anyway to simply translate the args to kwargs near the top

this looks so fragile and error prone

hour = kwargs.get("hour")
minute = kwargs.get("minute")
second = kwargs.get("second")
microsecond = kwargs.get("microsecond")
tzinfo = kwargs.get("tzinfo")
# kludge for reading legacy pickle with read_pickle in test_pickle
elif (args_len == 3 and isinstance(args[0], int) and
(not isinstance(args[1], int) or not isinstance(args[2], int))):
ts_input = args[0]
tzinfo = args[-1]
else:
raise ValueError(
f"Invalid Timestamp arguments. args: {args}, kwargs: {kwargs}"
)

# Unpack keyword-only arguments
nanosecond = kwargs.get("nanosecond", 0)
tz = kwargs.get("tz")
unit = kwargs.get("unit")
fold = kwargs.get("fold")

if tzinfo is not None:
# GH#17690 tzinfo must be a datetime.tzinfo object, ensured
Expand Down Expand Up @@ -1600,27 +1647,7 @@ class Timestamp(_Timestamp):
if hasattr(ts_input, "fold"):
ts_input = ts_input.replace(fold=fold)

# GH 30543 if pd.Timestamp already passed, return it
# check that only ts_input is passed
# checking verbosely, because cython doesn't optimize
# list comprehensions (as of cython 0.29.x)
if (isinstance(ts_input, _Timestamp) and
tz is None and unit is None and year is None and
month is None and day is None and hour is None and
minute is None and second is None and
microsecond is None and nanosecond is None and
tzinfo is None):
return ts_input
elif isinstance(ts_input, str):
# User passed a date string to parse.
# Check that the user didn't also pass a date attribute kwarg.
if any(arg is not None for arg in _date_attributes):
raise ValueError(
"Cannot pass a date attribute keyword "
"argument when passing a date string; 'tz' is keyword-only"
)

elif ts_input is _no_input:
if ts_input is _no_input:
# GH 31200
# When year, month or day is not given, we call the datetime
# constructor to make sure we get the same error message
Expand All @@ -1641,14 +1668,6 @@ class Timestamp(_Timestamp):

ts_input = datetime(**datetime_kwargs)

elif is_integer_object(year):
# User passed positional arguments:
# Timestamp(year, month, day[, hour[, minute[, second[,
# microsecond[, tzinfo]]]]])
ts_input = datetime(ts_input, year, month, day or 0,
hour or 0, minute or 0, second or 0, fold=fold or 0)
unit = None

if getattr(ts_input, "tzinfo", None) is not None and tz is not None:
raise ValueError("Cannot pass a datetime or Timestamp with tzinfo with "
"the tz parameter. Use tz_convert instead.")
Expand All @@ -1659,9 +1678,7 @@ class Timestamp(_Timestamp):
# wall-time (consistent with DatetimeIndex)
return cls(ts_input).tz_localize(tzobj)

if nanosecond is None:
nanosecond = 0
elif not (999 >= nanosecond >= 0):
if not (999 >= nanosecond >= 0):
raise ValueError("nanosecond must be in 0..999")

ts = convert_to_tsobject(ts_input, tzobj, unit, 0, 0, nanosecond)
Expand Down
49 changes: 28 additions & 21 deletions pandas/tests/scalar/timestamp/test_constructors.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,7 @@
import pytz

from pandas._libs.tslibs.dtypes import NpyDatetimeUnit
from pandas.compat import (
PY39,
PY310,
)
from pandas.compat import PY39
from pandas.errors import OutOfBoundsDatetime

from pandas import (
Expand Down Expand Up @@ -293,18 +290,19 @@ def test_constructor_invalid(self):

def test_constructor_invalid_tz(self):
# GH#17690
msg = (
"Argument 'tzinfo' has incorrect type "
r"\(expected datetime.tzinfo, got str\)"
)
msg = "Cannot convert str to datetime.tzinfo"
with pytest.raises(TypeError, match=msg):
Timestamp("2017-10-22", tzinfo="US/Eastern")

msg = "at most one of"
with pytest.raises(ValueError, match=msg):
Timestamp("2017-10-22", tzinfo=pytz.utc, tz="UTC")

msg = "Cannot pass a date attribute keyword argument when passing a date string"
msg = (
"Invalid Timestamp arguments. "
"args: \\('2012-01-01', 'US/Pacific'\\), "
"kwargs: {}"
)
with pytest.raises(ValueError, match=msg):
# GH#5168
# case where user tries to pass tz as an arg, not kwarg, gets
Expand Down Expand Up @@ -340,12 +338,7 @@ def test_constructor_positional_with_tzinfo(self):

@pytest.mark.parametrize("kwd", ["nanosecond", "microsecond", "second", "minute"])
def test_constructor_positional_keyword_mixed_with_tzinfo(self, kwd, request):
# TODO: if we passed microsecond with a keyword we would mess up
# xref GH#45307
if kwd != "nanosecond":
# nanosecond is keyword-only as of 2.0, others are not
mark = pytest.mark.xfail(reason="GH#45307")
request.node.add_marker(mark)
# GH#52221 makes a mix of positional and keyword arguments behave consistently

kwargs = {kwd: 4}
ts = Timestamp(2020, 12, 31, tzinfo=timezone.utc, **kwargs)
Expand All @@ -357,12 +350,8 @@ def test_constructor_positional_keyword_mixed_with_tzinfo(self, kwd, request):

def test_constructor_positional(self):
# see gh-10758
msg = (
"'NoneType' object cannot be interpreted as an integer"
if PY310
else "an integer is required"
)
with pytest.raises(TypeError, match=msg):
msg = "Invalid Timestamp arguments. args: \\(2000, 1\\), kwargs: {}"
with pytest.raises(ValueError, match=msg):
Timestamp(2000, 1)

msg = "month must be in 1..12"
Expand Down Expand Up @@ -891,3 +880,21 @@ def test_timestamp_constructor_adjust_value_for_fold(tz, ts_input, fold, value_o
result = ts._value
expected = value_out
assert result == expected


@pytest.mark.parametrize("tz", ["dateutil/Europe/London"])
def test_timestamp_constructor_positional_with_fold(tz):
# Check that we build an object successfully
# if we pass positional arguments and fold
ts = Timestamp(2019, 10, 27, 1, 30, tz=tz, fold=0)
result = ts._value
expected = 1572136200000000
assert result == expected


def test_timestamp_constructor_arg_shift():
# Check that passing a positional argument as keyword
# does not change the value
result = Timestamp(2019, 10, 27, minute=30)
expected = Timestamp(2019, 10, 27, 0, 30)
assert result == expected