-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix #17965 datetime64 comparison #18188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
def test_comparison_to_datetime64(self): | ||
# GH issue #17965 | ||
df = DataFrame({'dates': date_range('2000-01-01', '2000-01-05')}) | ||
date = df['dates'].unique()[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the right test? date
should be a datetime.date
, and I think this is a datetime64[ns]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
date is numpy.datetime64('2000-01-01T00:00:00.000000000')
, which is consistent with the output of pd.date_range().values
, which are datetime64[ns]
too. Do you suggest this should be an array of datetime.date
objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#17965 is about comparing with a single datetime.date
object:
In [20]: pd.Timestamp('20130101').to_pydatetime().date()>=df.A
ValueError: cannot set a Timestamp with a non-timestamp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paramaterize this test with date, datetime, np.datetime64, and Timestamp
.
check for < and ==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True! I was not actually fixing the Issue, but rather this comment: #17965 (comment) - but it actually applies to both.
Python is inconsistent here: python Thu Nov 9 05:53:21 2017
Python 3.6.1 (default, Apr 4 2017, 09:40:21)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime(2017, 1, 1) == datetime.date(2017, 1, 1)
False
>>> datetime.datetime(2017, 1, 1) >= datetime.date(2017, 1, 1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't compare datetime.datetime to datetime.date Strange.... We should be a bit clearer on our policy if we're differing from Python. Are we saying that |
I don't think your comparison is correct - semantically the test is comparing two |
python is inconcistent here, |
doc/source/whatsnew/v0.21.1.txt
Outdated
@@ -61,6 +61,7 @@ Bug Fixes | |||
- Bug in :class:`DatetimeIndex` subtracting datetimelike from DatetimeIndex could fail to overflow (:issue:`18020`) | |||
- Bug in ``pd.Series.rolling.skew()`` and ``rolling.kurt()`` with all equal values has floating issue (:issue:`18044`) | |||
- Bug in ``pd.DataFrameGroupBy.count()`` when counting over a datetimelike column (:issue:`13393`) | |||
- Bug in ``_libs.index.convert_scalar()`` when comparing datetimes to np.datetime64 (:issue:`17965`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a user useful note
pandas/_libs/index.pyx
Outdated
@@ -549,6 +549,8 @@ cpdef convert_scalar(ndarray arr, object value): | |||
if arr.descr.type_num == NPY_DATETIME: | |||
if isinstance(value, np.ndarray): | |||
pass | |||
if isinstance(value, np.datetime64): | |||
return Timestamp(value).value | |||
elif isinstance(value, datetime): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compbine np.datetime64 and datetime into the same clause
def test_comparison_to_datetime64(self): | ||
# GH issue #17965 | ||
df = DataFrame({'dates': date_range('2000-01-01', '2000-01-05')}) | ||
date = df['dates'].unique()[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paramaterize this test with date, datetime, np.datetime64, and Timestamp
.
check for < and ==
# GH issue #17965 | ||
df = DataFrame({'dates': date_range('2000-01-01', '2000-01-05')}) | ||
date = df['dates'].unique()[0] | ||
tm.assert_frame_equal(df.loc[df['dates'] == date], df.iloc[:1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use
result =
expected =
tm.assert_frame_equal(result, expected)
@@ -330,3 +330,9 @@ def test_loc_datetime_length_one(self): | |||
|
|||
result = df.loc['2016-10-01T00:00:00':] | |||
tm.assert_frame_equal(result, df) | |||
|
|||
def test_comparison_to_datetime64(self): | |||
# GH issue #17965 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use the frame from the original issue (which uses a NaT) as well.
@watercrossing the comparison is a non-converted testing against a column of |
Reiterating suggestion to coerce datetime.date to Period with freq=D |
not in this PR maybe at some point |
8b01fe3
to
ddfe7dc
Compare
doc/source/whatsnew/v0.21.1.txt
Outdated
@@ -61,6 +61,7 @@ Bug Fixes | |||
- Bug in :class:`DatetimeIndex` subtracting datetimelike from DatetimeIndex could fail to overflow (:issue:`18020`) | |||
- Bug in ``pd.Series.rolling.skew()`` and ``rolling.kurt()`` with all equal values has floating issue (:issue:`18044`) | |||
- Bug in ``pd.DataFrameGroupBy.count()`` when counting over a datetimelike column (:issue:`13393`) | |||
- Bug in ``_libs.index.convert_scalar()``: Comparison between ``datetime64[ns]`` series and ``datetimelike`` (:issue:`17965`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't show internal functions here, rather the end-user effect, something like
a boolean comparison with a datetime.date and a datetime64[ns] dtype Series
comparisonTS = Timestamp('20130101') | ||
|
||
@pytest.mark.parametrize('datetimelike,opExp', product( | ||
[comparisonTS, comparisonTS.to_pydatetime(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
directly construct these, e.g. Timestamp('20130101'), datetime.datetime(2013, 1, 1)
etc
@pytest.mark.parametrize('datetimelike,opExp', product( | ||
[comparisonTS, comparisonTS.to_pydatetime(), | ||
comparisonTS.to_pydatetime().date(), comparisonTS.to_datetime64()], | ||
[(op.lt, [True, False, False, False]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't pass opExp
, rather pass op
(e.g. the op.lt) and the expected
(the boolean result)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do, but i am not sure something like:
@pytest.mark.parametrize('datetimelike,op,expected',
[(x, y[0], y[1]) for x, y in product(
[pd.Timestamp('20130101'), datetime.datetime(2013, 1, 1),
datetime.date(2013, 1, 1), np.datetime64('2013-01-01', 'ns')],
[(op.lt, [True, False, False, False]),
(op.le, [True, True, False, False]),
(op.eq, [False, True, False, False]),
(op.gt, [False, False, False, True])])])
is more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry didn't see the product. instead use multiple paramatrize to achieve product.
eg..
@pytest.mark.parametrize(.....)
@pytest.mark.parametrize('datetimelike'
rebase as well |
ddfe7dc
to
ebe805b
Compare
Codecov Report
@@ Coverage Diff @@
## master #18188 +/- ##
==========================================
- Coverage 91.42% 91.38% -0.05%
==========================================
Files 163 163
Lines 50068 50068
==========================================
- Hits 45777 45756 -21
- Misses 4291 4312 +21
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18188 +/- ##
==========================================
+ Coverage 91.38% 91.38% +<.01%
==========================================
Files 164 164
Lines 49884 49884
==========================================
+ Hits 45584 45586 +2
+ Misses 4300 4298 -2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small doc-change. ping on green.
doc/source/whatsnew/v0.21.1.txt
Outdated
@@ -62,6 +62,7 @@ Bug Fixes | |||
- Bug in ``pd.Series.rolling.skew()`` and ``rolling.kurt()`` with all equal values has floating issue (:issue:`18044`) | |||
- Bug in ``pd.DataFrameGroupBy.count()`` when counting over a datetimelike column (:issue:`13393`) | |||
- Bug in ``pd.concat`` when empty and non-empty DataFrames or Series are concatenated (:issue:`18178` :issue:`18187`) | |||
- Bug in a boolean comparison with a ``datetime.datetime``, ``datetime.date``, ``np.datetime64`` and a ``datetime64[ns]`` dtype Series (:issue:`17965`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just list datetime.datetime
(as that's the actual bug)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to the indexing section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Done.
ok ping on green. |
8b98651
to
d9f12f4
Compare
@jreback: I have changed the whatsnew and rebased. |
@watercrossing let us know if / when you see all the CI workers finish and we'll merge. |
@TomAugspurger optimist! |
@TomAugspurger / @jreback: Ping! |
thanks @watercrossing |
…pandas-dev#18188) (cherry picked from commit 77f10f0)
git diff upstream/master -u -- "*.py" | flake8 --diff