-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix datetimes.should_cache() error for deque (GH 29403) #29846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @nathalier! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-12-14 16:01:44 UTC |
) | ||
def test_no_slicing_errors_in_should_cache(listlike): | ||
# GH 29403 | ||
assert tools.should_cache(listlike) is True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, could you use the example in the original issues as the test? #29403 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mroeschke For this particular bug the example from the original issue could easily stop being indicative if the empirically chosen constants for caching are changed. Let's say if it increased from 50 to 55 or 70 or 100 then the original issue example won't run this code pass, whereas these tests will show that something changed as they would fail. Also they are faster than the original example. And as we definitely know what was the reason of the failure I think it's quite reasonable to use small and fast tests for should_cache() instead.
The original example can be added to ensure the deque works for datetimes. Would be grateful if you could advise to which test file I should add that. Can it be pandas/tests/arrays/test_datetimes.py ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mroeschke For this particular bug the example from the original issue could easily stop being indicative if the empirically chosen constants for caching are changed. Let's say if it increased from 50 to 55 or 70 or 100 then the original issue example won't run this code pass, whereas these tests will show that something changed as they would fail. Also they are faster than the original example. And as we definitely know what was the reason of the failure I think it's quite reasonable to use small and fast tests for should_cache() instead.
The original example can be added to ensure deque works for timeseries. Would be grateful if you could advise to which test file I should add that. Can it be pandas/tests/arrays/test_datetimes.py ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about the threshold; we can keep the test you have and add the example in the original issue. I think the pandas/tests/arrays/test_datetimes.py
is a good place for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the current location is good for this test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mroeschke I've found a very relevant test test_to_datetime_cache()
where I included deque in the list of constructors. Actually, with this test, tests I added before for the bug are not much needed any more. What do you think - should I drop them (test_no_slicing_errors_in_should_cache()
and test_to_datetime_from_deque()
) now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay keeping both tests. These tests run relatively quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, these do look quite duplicative, though these are hard-coded to the results which is fine.
can you co-located these tests near the above one.
pls run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a whatsnew note (bug fixes / datetimes), otherwise lgtm. ping on green.
b0a9658
to
ca98e77
Compare
@jbrockmendel Thanks a lot for help! |
@jreback Now it's green, but possibly some of added tests are redundant. |
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -679,6 +679,8 @@ Datetimelike | |||
- Bug in :meth:`Series.var` failing to raise ``TypeError`` when called with ``timedelta64[ns]`` dtype (:issue:`28289`) | |||
- Bug in :meth:`DatetimeIndex.strftime` and :meth:`Series.dt.strftime` where ``NaT`` was converted to the string ``'NaT'`` instead of ``np.nan`` (:issue:`29578`) | |||
- Bug in :attr:`Timestamp.resolution` being a property instead of a class attribute (:issue:`29910`) | |||
- Bug in :func:`pandas.core.tools.datetimes.should_cache` raising ``TypeError`` when trying to get a slice of a deque (:issue:`29403`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you reword to make more user facing? i.e. to_datetime
failing for deque
s when using the cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint. Reworded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments, lgtm. ping on green.
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -679,6 +679,7 @@ Datetimelike | |||
- Bug in :meth:`Series.var` failing to raise ``TypeError`` when called with ``timedelta64[ns]`` dtype (:issue:`28289`) | |||
- Bug in :meth:`DatetimeIndex.strftime` and :meth:`Series.dt.strftime` where ``NaT`` was converted to the string ``'NaT'`` instead of ``np.nan`` (:issue:`29578`) | |||
- Bug in :attr:`Timestamp.resolution` being a property instead of a class attribute (:issue:`29910`) | |||
- Bug in :func:`pandas.core.tools.datetimes.should_cache` failing for `deques` with ``TypeError`` when using the cache (:issue:`29403`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change this to to_datetime
which is the user visibile function
when using cache=True
(the default)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Tests moved, whatsnew reworded. All tests have passed.
) | ||
def test_no_slicing_errors_in_should_cache(listlike): | ||
# GH 29403 | ||
assert tools.should_cache(listlike) is True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, these do look quite duplicative, though these are hard-coded to the results which is fine.
can you co-located these tests near the above one.
itertools.islice() should be used to get slice of a deque. itertools.islice() also can be used (and is efficient) for other collections. So unique(arg[:check_count]) was replaced with set(islice(arg, check_count))
thanks @nathalier |
itertools.islice()
should be used to get slice of a deque.itertools.islice()
also can be used (and is efficient) for other collections. Sounique(arg[:check_count])
was replaced withset(islice(arg, check_count))
which is also much faster on test data.black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff