BUG: Constrain date parsing from strings a little bit more #4601 #4863

danbirken · 2013-09-17T19:23:39Z

Currently dateutil will parse almost any string into a datetime. This
change adds a filter in front of dateutil that will prevent it from
parsing certain strings that don't look like datetimes:

Strings that parse to float values that are less than 1000
Certain special one character strings (this was already in there,
this just moves that code)

Additionally, this filters out datetimes that are out of range for the
datetime64[ns] type. Currently any out-of-range datetimes will just
overflow and be mapped to some random time within the bounds of
datetime64[ns].

danbirken · 2013-09-17T19:46:26Z

Yeah that looks like a much better place. Will do.

jreback · 2013-09-17T19:57:40Z

what does this do with an out-of-bounds date? raise? (or depends on coerce)?

jreback · 2013-09-17T19:58:29Z

i believe this should close #4065 as well?

jreback · 2013-09-17T22:35:19Z

@danbirken I realize there is a pandas.core.datetools (which is not that big)...should prob go there

danbirken · 2013-09-18T01:33:15Z

It treats an out of bounds date as an invalid date string, so it will either coerce it or not convert the array to datetimes. It doesn't throw an error.

As for the location, both of those other places were causing some circular dependency issues, and on second thought I don't think either of them are any improvement over just putting them in tslib.pyx (which is a file full of timestamp related functions anyway). I also think the tests being in test_tslib.py is an improvement over the previous change.

#4065 is similar, but this change wont fix it. This only fixes the specific case of parsing timestamps from strings (like csv parsing) -- previously out-of-bounds timestamps would just be mapped randomly within the appropriate bounds, whereas now they will be seen as invalid. I think fixing that one should be pretty easy though, I can do that after this one.

jreback · 2013-09-18T12:02:06Z

gr8 on both points

danbirken · 2013-09-19T01:18:51Z

Rebasing so it can be merged cleanly. Let me know if there is anything else I should add to get this merged.

jreback · 2013-09-19T02:04:20Z

can make tr function names a little shorter

maybe call the dunction parse_datetime_string?

and I think u can add kw argument, maybe validate to parse_datetime_string to call the verify bounds function (make this function name shorter too)

danbirken · 2013-09-20T21:21:47Z

Shortened the function names and added the kw param.

jtratner · 2013-09-20T21:33:57Z

Can you make sure all your test function names + open parens are under 80 characters in line length?

jtratner · 2013-09-20T21:35:25Z

If you feel the test needs a longer explanation, you can put it in comments below the function name.

…#4601 Currently dateutil will parse almost any string into a datetime. This change adds a filter in front of dateutil that will prevent it from parsing certain strings that don't look like datetimes: 1) Strings that parse to float values that are less than 1000 2) Certain special one character strings (this was already in there, this just moves that code) Additionally, this filters out datetimes that are out of range for the datetime64[ns] type. Currently any out-of-range datetimes will just overflow and be mapped to some random time within the bounds of datetime64[ns].

danbirken · 2013-09-20T21:51:59Z

Sorry I am used to 100 column style rules, my bad. Fixed everything to pep8 80 cols.

jtratner · 2013-09-20T22:19:33Z

totally reasonable - 100 character makes sense too, we just use 80 everywhere. Thanks for making the change! I'll leave it to @jreback for deciding to merge this.

jreback · 2013-09-20T22:34:26Z

@danbirken thanks for the fixes!

jreback merged commit f738712 into pandas-dev:master Sep 20, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: Constrain date parsing from strings a little bit more #4601 #4863

BUG: Constrain date parsing from strings a little bit more #4601 #4863

danbirken commented Sep 17, 2013

danbirken commented Sep 17, 2013

jreback commented Sep 17, 2013

jreback commented Sep 17, 2013

jreback commented Sep 17, 2013

danbirken commented Sep 18, 2013

jreback commented Sep 18, 2013

danbirken commented Sep 19, 2013

jreback commented Sep 19, 2013

danbirken commented Sep 20, 2013

jtratner commented Sep 20, 2013

jtratner commented Sep 20, 2013

danbirken commented Sep 20, 2013

jtratner commented Sep 20, 2013

jreback commented Sep 20, 2013

BUG: Constrain date parsing from strings a little bit more #4601 #4863

BUG: Constrain date parsing from strings a little bit more #4601 #4863

Conversation

danbirken commented Sep 17, 2013

danbirken commented Sep 17, 2013

jreback commented Sep 17, 2013

jreback commented Sep 17, 2013

jreback commented Sep 17, 2013

danbirken commented Sep 18, 2013

jreback commented Sep 18, 2013

danbirken commented Sep 19, 2013

jreback commented Sep 19, 2013

danbirken commented Sep 20, 2013

jtratner commented Sep 20, 2013

jtratner commented Sep 20, 2013

danbirken commented Sep 20, 2013

jtratner commented Sep 20, 2013

jreback commented Sep 20, 2013