Skip to content

BUG/API: to_timedelta unit-argument ignored for string input #34379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 53 commits into from

Conversation

pdnm
Copy link
Contributor

@pdnm pdnm commented May 26, 2020

Supersedes #23025

Victor added 30 commits October 7, 2018 16:06
…edelta2

# Conflicts:
#	doc/source/whatsnew/v0.24.0.txt
#	pandas/_libs/tslibs/timedeltas.pxd
#	pandas/_libs/tslibs/timedeltas.pyx
#	pandas/tests/indexes/timedeltas/test_ops.py
…edelta2

# Conflicts:
#	pandas/_libs/tslibs/timedeltas.pyx
@pdnm pdnm force-pushed the bugfix/string-timedelta2 branch from 7041bce to 5a0812c Compare May 26, 2020 04:18
@@ -1,5 +1,7 @@
from numpy cimport int64_t

# Exposed for tslib, not intended for outside use.
cpdef parse_timedelta_string(object ts, object specified_unit=*)
cdef int64_t cast_from_unit(object ts, object unit) except? -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been moved to tslibs.conversion, so should be edited there rather than re-introduced here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel I fixed that and some other issues

@pdnm pdnm changed the title Bugfix/string timedelta2 [WIP] Bugfix/string timedelta2 May 27, 2020
@pdnm pdnm force-pushed the bugfix/string-timedelta2 branch from c4c4ec6 to 507ff0f Compare May 29, 2020 05:48
@pep8speaks
Copy link

pep8speaks commented May 29, 2020

Hello @pdnm! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-01 09:03:19 UTC

@pdnm pdnm force-pushed the bugfix/string-timedelta2 branch 7 times, most recently from 5cf2649 to bad005c Compare June 1, 2020 08:38
@pdnm pdnm changed the title [WIP] Bugfix/string timedelta2 Bugfix/string timedelta2 Jun 1, 2020
Fix linting issues, change single quotes to double quotes, add 'ns' to
timedelta string.
@pdnm pdnm force-pushed the bugfix/string-timedelta2 branch from bad005c to cd17104 Compare June 1, 2020 09:03
@pdnm pdnm requested a review from jbrockmendel June 1, 2020 10:12
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be handled at the entry points for users. we simply DO not allow a unit when the input is a string, full stop. We do this in to_datetime, please have a look; we would want to replicate this behavior.

@@ -701,6 +701,8 @@ Deprecations
instead (:issue:`34191`).
- The ``squeeze`` keyword in the ``groupby`` function is deprecated and will be removed in a future version (:issue:`32380`)

- A timedelta passed a number string without a defined unit is deprecated (:issue:`12136`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you reference :class:`Timedelta` and pandas.to_timedelta here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we deprecating this? i would simply remove it as its wrong

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback By "we simply DO not allow a unit when the input is a string"? Do you refer to the unit inside the string or the unit parameter? This change behaviors are like this:

I simply merged @Sup3rGeo's PR #23025 without changing any behaviors. Please let me know if you want something to be changed.

Copy link
Contributor

@jreback jreback Jun 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we spent a long time on the PR but was ultimately not merged.because of the way it was structured. Please re-read those comments. We want to handle this at a much higher level, if you pass a unit then your input must be numeric, there is no other option. I don't think PR needs to be very complicated at all.

@@ -371,6 +369,17 @@ cdef inline int64_t parse_timedelta_string(str ts) except? -1:
have_value = 1
have_dot = 0

# Consider units from outside
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we doing the checking at this low level? this is simply a check on the user interface, e.g. if its a string, then no unit can be specified.

see how we do this in to_datetime

else:
if have_value:
raise ValueError("have leftover units")
if len(number):
r = timedelta_from_spec(number, frac, 'ns')
result += timedelta_as_neg(r, neg)
warnings.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh?

@jreback jreback added Error Reporting Incorrect or improved errors from pandas Timedelta Timedelta data type labels Jun 1, 2020
@jreback jreback changed the title Bugfix/string timedelta2 BUG/API: to_timedelta unit-argument ignored for string input Jun 1, 2020
@@ -174,6 +175,8 @@ cdef convert_to_timedelta64(object ts, object unit):

Return an ns based int64
"""
if unit is None:
unit = 'ns'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this called from? could we type the argument as str and require the caller to pass it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. in array_to_timedelta64 below we should just do that check once at the top of the function, not in every step

@@ -260,7 +258,7 @@ def array_to_timedelta64(object[:] values, unit='ns', errors='raise'):
return iresult.base # .base to access underlying np.ndarray


cdef inline int64_t parse_timedelta_string(str ts) except? -1:
cpdef inline parse_timedelta_string(object ts, specified_unit=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we really want to avoid weakening the type declarations


# Preserve unit if None, will be cast to nanoseconds
# later on at the proper functions
if unit is None or unit == 'M':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double quotes

@@ -200,7 +200,7 @@ def stringify(value):
v = v.tz_convert("UTC")
return TermValue(v, v.value, kind)
elif kind == "timedelta64" or kind == "timedelta":
v = Timedelta(v, unit="s").value
v = Timedelta(v).value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what type is "v" here? if it is an integer, this is a behavior change

@@ -108,7 +108,7 @@ def to_timedelta(arg, unit="ns", errors="raise"):
return _coerce_scalar_to_timedelta_type(arg, unit=unit, errors=errors)


def _coerce_scalar_to_timedelta_type(r, unit="ns", errors="raise"):
def _coerce_scalar_to_timedelta_type(r, unit=None, box=True, errors="raise"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why adding an unused box kwarg?

@@ -535,7 +535,7 @@ def test_resolution_deprecated(self):
(Timedelta(0, unit="ns"), False),
(Timedelta(-10, unit="ns"), True),
(Timedelta(None), True),
(NaT, True),
(pd.NaT, True),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls revert this

@pdnm
Copy link
Contributor Author

pdnm commented Jun 8, 2020

Closing in favor of #34634

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG/API: to_timedelta unit-argument ignored for string input
4 participants