-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Accepts integer/float string with units and raises when unit is ambiguous #21384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @Sup3rGeo! Thanks for updating the PR.
Comment last updated on October 01, 2018 at 12:55 Hours UTC |
|
||
|
||
@pytest.mark.parametrize("redundant_unit, expectation", [ | ||
("", not_raises()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe pytest.raises(None)
should suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't - it is still an open issue in pytest: pytest-dev/pytest#1830
Do you have a better idea than to define this empty context manager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Odd that we can have an empty context manager with warnings assertions, but that's not implemented for exceptions. Implementing the empty context manager wouldn't super difficult on our end (we have pandas.util.testing.assert_raises_regex
).
Only question is how well-received such an idea would be.
@jreback @jorisvandenbossche : Thoughts?
Codecov Report
@@ Coverage Diff @@
## master #21384 +/- ##
=========================================
Coverage ? 91.89%
=========================================
Files ? 153
Lines ? 49596
Branches ? 0
=========================================
Hits ? 45576
Misses ? 4020
Partials ? 0
Continue to review full report at Codecov.
|
0.001, 1, 10]) | ||
def test_string_with_unit(num, sign, unit, redundant_unit, expectation): | ||
with expectation: | ||
assert Timedelta(str(sign * num) + redundant_unit, unit=unit)\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can't we somehow do multiline with an assert
statement without the slash (we generally don't use the slash when doing multiline in this repository)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well I could break the whole statement in two, moving the Timedelta first argument to a variable. This would reduce the line length. Should I?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would also work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The not_raises
question notwithstanding (it's not a blocker unless we decide we want to do this), this looks good to me!
cc @jreback
doc/source/whatsnew/v0.23.1.txt
Outdated
@@ -59,6 +59,7 @@ Data-type specific | |||
|
|||
- Bug in :meth:`Series.str.replace()` where the method throws `TypeError` on Python 3.5.2 (:issue: `21078`) | |||
- Bug in :class:`Timedelta`: where passing a float with a unit would prematurely round the float precision (:issue: `14156`) | |||
- Bug in :class:`Timedelta`: where passing a string of a pure number would not take unit into account. Also raises for ambiguous/duplicate unit specification (:issue: `12136`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this to 0.24.0.txt?
@@ -85,10 +85,6 @@ def test_construction(): | |||
with pytest.raises(ValueError): | |||
Timedelta('10 days -1 h 1.5m 1s 3us') | |||
|
|||
# no units specified | |||
with pytest.raises(ValueError): | |||
Timedelta('3.1415') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to allow this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't allow this, pls revert. a string w/o a unit must raise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, looking back at the issue #12136, there was discussion about this and said we would allow this. So that is a good reason this is PR is doing that! :-)
(but personally, I would simply not allow it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allowing a string with a unit is fine
this does not have a unit
@pytest.mark.parametrize("redundant_unit, expectation", [ | ||
("", not_raises()), | ||
("d", pytest.raises(ValueError)), | ||
("us", pytest.raises(ValueError))]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it needed to test that it is raising the value error with all the other combinations? (because this gives a combinatorial explosion of number of test .. ;-))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I just wanted to make sure kinda all cases were covered. I believe this does not give a lot of combinations, so I thought that for this case specifically it would be ok.
@@ -999,10 +1004,21 @@ class Timedelta(_Timedelta): | |||
if isinstance(value, Timedelta): | |||
value = value.value | |||
elif is_string_object(value): | |||
if len(value) > 0 and value[0] == 'P': | |||
# Check if it is just a number in a string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to support this? (do we now?)
@@ -245,7 +245,7 @@ cdef inline _decode_if_necessary(object ts): | |||
return ts | |||
|
|||
|
|||
cdef inline parse_timedelta_string(object ts): | |||
cdef inline parse_timedelta_string(object ts, specified_unit=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be handled here at all
@@ -999,10 +1004,21 @@ class Timedelta(_Timedelta): | |||
if isinstance(value, Timedelta): | |||
value = value.value | |||
elif is_string_object(value): | |||
if len(value) > 0 and value[0] == 'P': | |||
# Check if it is just a number in a string | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just try to float here. if it succeeds it must have a unit specified, if not you must raise, e.g. Timedelta('1000')
is not accepted
pandas/_libs/tslibs/timedeltas.pyx
Outdated
value = parse_iso_format_string(value) | ||
else: | ||
value = parse_timedelta_string(value) | ||
value = parse_timedelta_string(value, unit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no not pass here
@@ -85,10 +85,6 @@ def test_construction(): | |||
with pytest.raises(ValueError): | |||
Timedelta('10 days -1 h 1.5m 1s 3us') | |||
|
|||
# no units specified | |||
with pytest.raises(ValueError): | |||
Timedelta('3.1415') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't allow this, pls revert. a string w/o a unit must raise.
|
||
class not_raises(object): | ||
def __enter__(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this its not idiomatic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda needed it for parametrizing on the expectation. Should I then just add a if/else in the test to use or not pytest.raises?
("us", pytest.raises(ValueError))]) | ||
@pytest.mark.parametrize("unit", [ | ||
"d", "m", "s", "us"]) | ||
@pytest.mark.parametrize("sign", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are related tests for this pls move.
+1, -1]) | ||
@pytest.mark.parametrize("num", [ | ||
0.001, 1, 10]) | ||
def test_string_with_unit(num, sign, unit, redundant_unit, expectation): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs equivalent handling for the to_timedelta
path (which is separate)
can you rebase and fixup |
can you rebase and update according to comments |
closing as stale, but would take if rebased / updated |
@jreback sorry for abandoning this, but I will start working back again to finish this PR, especially given that everyone has spent time to review it. Should I start another one or can I reply to the comments on this one? |
@Sup3rGeo : You can keep pushing to this one. I'll reopen. |
881c122
to
45dfd7c
Compare
Two points I would like to discuss: 1 - The behaviour so far would raise on 2 - If I raise on any string number without
|
You mean the current behaviour in master?
Yes, we used such a construct in one of the tests (in "pandas\tests\util\test_hashing.py:23: in TestHashing"), but you can correct this in the test. That said, we should probably deprecate it first instead of simply changing. |
@Sup3rGeo something else, there seem to be a bunch of unrelated changes in the diff. I suppose due something that went wrong in the rebase / update.
to see if that fixes it? |
Hi!
Sorry but what do you mean by deprecating it instead of changing?
Did it. Did it help? I think I screwed up with the rebase indeed. I could try starting fresh with a new branch if nothing helps. |
e92da2d
to
4a6ed2e
Compare
4a6ed2e
to
85f04c9
Compare
Apparently not .. I also tried to rebase, and removed the commits that were not yours. That seemed to have fixed some of the problems, but not all (there are still changes in So maybe starting from a fresh branch and copying over your changes there might be the easiest to do.
So currently |
Closing in favor of #23025 |
I couldn't get tests to run on my machine so I will see if it passes on the CI servers.
git diff upstream/master -u -- "*.py" | flake8 --diff