-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: DataFrame.replace with out of bound datetime causing RecursionError #22108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: DataFrame.replace with out of bound datetime causing RecursionError #22108
Conversation
Codecov Report
@@ Coverage Diff @@
## master #22108 +/- ##
==========================================
+ Coverage 92.06% 92.07% +0.01%
==========================================
Files 170 170
Lines 50705 50693 -12
==========================================
- Hits 46680 46675 -5
+ Misses 4025 4018 -7
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. can you add a whatsnew, bug fix in 0.24.0, reshaping
pandas/core/dtypes/cast.py
Outdated
values = lib.maybe_convert_objects(values, | ||
convert_datetime=datetime) | ||
except ValueError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment on when the valueerror is hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commented, used OutOfBoundsDatetime to be more clear on the error
datetime(2018, 7, 28), | ||
datetime(2018, 5, 28)])}), | ||
datetime(2018, 5, 28), datetime(2018, 7, 28), | ||
DataFrame({'datetime64': Index([datetime(2018, 7, 28)] * 3)})), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a datetime64 w/tz here as well (if its an easy add)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! 👍
@jreback items actioned. additional feedback welcome. |
thanks @minggli keep me coming! |
Thanks for reviewing! |
git diff upstream/master -u -- "*.py" | flake8 --diff
By default, keyword convert=True for replace method for Block and ObjectBlock. When trying to replace dataframe, blocks are operated on separately, during which conversion happens if convert=True. OutOfBoundsDatetime(inherited from ValueError) raised by lib.maybe_convert_objects caused Block.replace, ObjectBlock.replace to form infinite recursive loop.
As mentioned in #20380, setting convert=False (private and not available to public) solves the issue at hand, but there appear to be other uses cases that expect it to be True by default. Right now, simply wrap a try...except block around the lib.maybe_convert_objects in cast module. Downside is it catches all ValueError in maybe_convert_objects silently.