Skip to content

ERR: between_time now checks for argument types #11832

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

rockg
Copy link
Contributor

@rockg rockg commented Dec 12, 2015

Closes #11818

@jreback
Copy link
Contributor

jreback commented Dec 13, 2015

do you think passing non-time like strings are also an error?

e.g.
2015-01-01 09:00 would be valid (previously and after this one). We could restrict to Timedelta parsables I think (not exactly the same thing but would work).

@jreback jreback added Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas labels Dec 13, 2015
@jreback jreback added this to the 0.18.0 milestone Dec 13, 2015
@rockg
Copy link
Contributor Author

rockg commented Dec 13, 2015

Unfortunately, I don't know how to check that. How it currently works is the dateutil parser is used to parse the time string but a datetime.datetime is returned regardless if it's a time string or date string.

>>> from dateutil.parser import parse
>>> parse("2015-12-13 10:10")
datetime.datetime(2015, 12, 13, 10, 10)
>>> parse("10:10")
datetime.datetime(2015, 12, 13, 10, 10)
>>> 

@rockg
Copy link
Contributor Author

rockg commented Dec 13, 2015

I see what your idea is but I'm just not sure that the Timedelta and dateutil parser are completely the same.

>>> Timedelta("10:10")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/tslib.pyx", line 2277, in pandas.tslib.Timedelta.__new__ (pandas/tslib.c:41085)
    value = np.timedelta64(parse_timedelta_string(value, False))
  File "pandas/tslib.pyx", line 2862, in pandas.tslib.parse_timedelta_string (pandas/tslib.c:49601)
    raise ValueError("expected hh:mm:ss format")
ValueError: expected hh:mm:ss format

@jreback
Copy link
Contributor

jreback commented Dec 13, 2015

the timedelta parser is a bit strick for this, can prob make a mini one by doing something like:

try:
    return time.strptime(s, '%H:%M:%S')
except:
    try:
        return time.strptime(s, '%H:%M')
    except:
        # rise here

@rockg
Copy link
Contributor Author

rockg commented Dec 13, 2015

Seems like we should also do a milliseconds parse for safe measure.

>>> parse("10:10:10.123")
datetime.datetime(2015, 12, 13, 10, 10, 10, 123000)

@jreback
Copy link
Contributor

jreback commented Dec 13, 2015

don't use dateutil.parse we want a time only

@rockg
Copy link
Contributor Author

rockg commented Dec 13, 2015

My only comment was that it's currently supported so we need that additionally.

@jreback
Copy link
Contributor

jreback commented Dec 13, 2015

easiest might be to loop thru a list of allowed formats until u match (or exhaust the list and raise)

using strptime is strict (which is good)

@rockg rockg force-pushed the between_time-exception branch 2 times, most recently from 09bfda3 to 2fef431 Compare December 20, 2015 12:37
@rockg
Copy link
Contributor Author

rockg commented Dec 20, 2015

I'm confused as to why this fails on 2.6. I created a 2.6 environment and the test works fine there.

Python 2.6.9 |Continuum Analytics, Inc.| (unknown, Aug 21 2014, 18:28:52) 
...
In [1]: from datetime import datetime

In [2]: datetime.strptime("2:00am", "%I:%M%p").time()
Out[2]: datetime.time(2, 0)

@rockg rockg force-pushed the between_time-exception branch from 2fef431 to b42e96b Compare December 20, 2015 14:43
@rockg
Copy link
Contributor Author

rockg commented Dec 20, 2015

Ah, it's because that test uses the am/pm from it_IT locale. Skipped if there is a locale present.

@@ -1769,13 +1782,34 @@ def indexer_between_time(self, start_time, end_time, include_start=True,
-------
values_between_time : TimeSeries
"""
from dateutil.parser import parse
def _parse_time_string(time_object):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u factor this routine out and out it in tseries/tools.py (eg next to the datetime parsing ones)

thrn just call it from here (u can also move the these to the corresponding testing module)

@rockg rockg force-pushed the between_time-exception branch from b42e96b to 4009405 Compare December 22, 2015 01:59
arg : compat.string_types
format : str, list-like, default None
Format used to convert arg into a time object. If None, default formats
are used. To add an additional format use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we allowing 'adding' time formats? your list seems like enough, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More for backwards compatibility. Our formats are a subset of dateutil's so want to ensure that the code still works for those that may have used a different format historically.

@rockg rockg force-pushed the between_time-exception branch 2 times, most recently from de83d7c to c764bbd Compare December 30, 2015 14:24
return _convert_listlike(arg, format)

return _convert_listlike(np.array([arg]), format)[0]
except:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't really like these blanket excepts. What hits this? why is it necessary. Rather catch these things inside _convert_listlike on a specific basis

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to cover the "ignore" case for casting errors in which case arg is returned. I did it here as things are cast to a numpy array and so your original argument really isn't returned. However, for now I can remove that casting and then check for "ignore" when the casting fails and return the unmolested arg at that point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or I can just catch the ValueError and return arg at that point which might make more sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first soln is better - then if an uncaught error it will show up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way it is currently written it will raise any exceptions that occur within _convert_listlike so you won't have any uncaught errors.

@rockg rockg force-pushed the between_time-exception branch 2 times, most recently from 4b26ac9 to e480401 Compare December 31, 2015 01:36
@rockg
Copy link
Contributor Author

rockg commented Jan 2, 2016

Hopefully we are all good here.

@@ -205,7 +205,8 @@ other anchored offsets like ``MonthBegin`` and ``YearBegin``.
Other API Changes
^^^^^^^^^^^^^^^^^


- DataFrame between_time now only parses a fixed set of time strings. Parsing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use double backticks DataFrame.between_time (though this also applies to Series, yes? I would put a mini-example here as well.

start_time : datetime.time or string
end_time : datetime.time or string
start_time, end_time : datetime.time or string
Time or string in appropriate format (e.g., "%H:%M", "%I%M%p")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datetime.time or string in appropriate time-like format (and include the formats you have above)

@jreback
Copy link
Contributor

jreback commented Jan 2, 2016

small comments.

pls run: git diff master | flake8 --diff and fix the PEP8 issues (this is going to become standard shortly).

this looks really good. I think we could expose this via pd.to_time. Though I we do need better time support. Pls create an issue so we can consider this in the future.

ping when green.

@rockg rockg force-pushed the between_time-exception branch 2 times, most recently from 33cc8f2 to eb827c4 Compare January 3, 2016 02:33
@rockg rockg force-pushed the between_time-exception branch from eb827c4 to eadc308 Compare January 3, 2016 03:26
@rockg
Copy link
Contributor Author

rockg commented Jan 3, 2016

@jreback Green. I feel more comfortable leaving the ability to add another format given that we do not know how much people were relying on dateutil's parser. It's two lines of code and innocuous.

@jreback
Copy link
Contributor

jreback commented Jan 3, 2016

merged via 9c71dbf

thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants