-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix TypeError for invalid integer dates %Y%m%d with errors='ignore' #26585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: fix TypeError for invalid integer dates %Y%m%d with errors='ignore' #26585
Conversation
Hello @nathalier! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-06-01 20:03:56 UTC |
Codecov Report
@@ Coverage Diff @@
## master #26585 +/- ##
==========================================
- Coverage 91.84% 91.83% -0.01%
==========================================
Files 174 174
Lines 50644 50644
==========================================
- Hits 46516 46511 -5
- Misses 4128 4133 +5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #26585 +/- ##
==========================================
- Coverage 91.84% 91.83% -0.01%
==========================================
Files 174 174
Lines 50644 50644
==========================================
- Hits 46515 46511 -4
- Misses 4129 4133 +4
Continue to review full report at Codecov.
|
38a65a6
to
1e97e8b
Compare
@@ -140,13 +140,13 @@ def array_strptime(object[:] values, object fmt, | |||
iresult[i] = NPY_NAT | |||
continue | |||
raise ValueError("time data %r does not match " | |||
"format %r (match)" % (values[i], fmt)) | |||
"format %r (match)" % (val, fmt)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind converting these to use format
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just tried to do it but a couple of tests failed. I don't know the reason yet.
Minor comment, needs a whatsnew, but generally LGTM. |
480df1b
to
458b5cb
Compare
doc/source/whatsnew/v0.25.0.rst
Outdated
@@ -427,6 +427,7 @@ Datetimelike | |||
- Bug in :class:`Series` and :class:`DataFrame` repr where ``np.datetime64('NaT')`` and ``np.timedelta64('NaT')`` with ``dtype=object`` would be represented as ``NaN`` (:issue:`25445`) | |||
- Bug in :func:`to_datetime` which does not replace the invalid argument with ``NaT`` when error is set to coerce (:issue:`26122`) | |||
- Bug in adding :class:`DateOffset` with nonzero month to :class:`DatetimeIndex` would raise ``ValueError`` (:issue:`26258`) | |||
- Bug in :func:`to_datetime` which raises a ``TypeError`` when called with a too long for a given format integer date with ``errors='ignore'`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you say this is for %Y%m%d
format
also not clear about 'too long', maybe length of the integer is != 8
[20129930, 20129930], | ||
[20120011, 20120011], | ||
[2012010101, 2012010101]]) | ||
def test_int_to_datetime_format_YYYYMMDD_typeerror(self, int_date, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you try with a too short as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jreback,
Thanks for feedback.
I reworded whatsnew message and added a couple of short dates.
Also I found result can be unpredictable for some short dates (either in integer or string format) with length > 4.
pd.to_datetime(12912, format="%Y%m%d", errors='ignore')
Out[4]: 12912
pd.to_datetime(21213, format="%Y%m%d", errors='ignore')
Out[5]: datetime.datetime(2, 12, 13, 0, 0)
Nevertheless it's not connected to this TypeError.
I'll add these examples to corresponding issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is weird, these should for sure fail, but maybe going down an odd path. ok can address in another issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
458b5cb
to
d93d835
Compare
…re' (# GH 26583) array_strptime returned TypeError when trying to slice 'too long' integer for the given format %Y%m%d (for example 2121010101). After parsing date in the first 8 symbols it tried to return the remaining symbols in ValueError message as a slice of integer which in turn caused TypeError. Converted to string value is now used to make slice for that ValueError message. In case of 20209911, it tried to parse 20209911 to datetime(2020, 9, 9) and had 11 unparsed.
d93d835
to
11eddeb
Compare
@nathalier you are adding the short digits to an issue? (not this one that is being closed, correct)? if so we can merge this. |
Thanks @nathalier ! |
git diff upstream/master -u -- "*.py" | flake8 --diff
array_strptime returned TypeError when trying to slice 'too long' integer for the given format %Y%m%d (for example 2121010101).
After parsing date in the first 8 symbols it tried to return the remaining symbols in ValueError message as a slice of integer which in turn caused TypeError.
Converted to string value is now used to make slice for that ValueError message.
In case of 20209911, it tried to parse 20209911 to datetime(2020, 9, 9) and had 11 left unparsed.