Skip to content

BUG: Enforce parse_dates as bool when scalar #12915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.18.1.txt
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,7 @@ API changes

- ``Series.apply`` for category dtype now applies passed function to each ``.categories`` (not ``.codes``), and returns "category" dtype if possible (:issue:`12473`)

- ``read_csv`` will now raise a ``TypeError`` if ``parse_dates`` is neither a boolean, list, or dictionary (:issue:`5636`)

- The default for ``.query()/.eval()`` is now ``engine=None``, which will use ``numexpr`` if it's installed; otherwise it will fallback to the ``python`` engine. This mimics the pre-0.18.1 behavior if ``numexpr`` is installed (and which Previously, if numexpr was not installed, ``.query()/.eval()`` would raise). (:issue:`12749`)

Expand Down
24 changes: 23 additions & 1 deletion pandas/io/parsers.py
Original file line number Diff line number Diff line change
Expand Up @@ -825,6 +825,27 @@ def _validate_usecols_arg(usecols):
return usecols


def _validate_parse_dates_arg(parse_dates):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are adding new code but not checking for existing code, e.g. see like 298. Further let's go ahead and validate all of the cases (and test whatever is not tested), e.g. list/dict are valid (but need mainly to tests what is not accepted)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. can accept list-of-(lists, ints, strings), or dict.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing code and logic does not validate IINM. For example, line 298 is about overriding parse_dates if it is a boolean and date_parser is passed.

Copy link
Member Author

@gfyoung gfyoung Apr 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a separate note, to what extent should this validation go? Is it just type checking? Or are we also validating element types? Are mixed types allowed? I presume a mixture of int and string is not, but what about int and list for example?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you are check scalars, just check that it is a list or dict, that would be good enough for now. prob don't have a tests with passing something that is not a list/doc/scalar

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"""
Check whether or not the 'parse_dates' parameter
is a non-boolean scalar. Raises a ValueError if
that is the case.
"""
msg = ("Only booleans, lists, and "
"dictionaries are accepted "
"for the 'parse_dates' parameter")

if parse_dates is not None:
if lib.isscalar(parse_dates):
if not lib.is_bool(parse_dates):
raise TypeError(msg)

elif not isinstance(parse_dates, (list, dict)):
raise TypeError(msg)

return parse_dates


class ParserBase(object):

def __init__(self, kwds):
Expand All @@ -836,7 +857,8 @@ def __init__(self, kwds):
self.index_names = None
self.col_names = None

self.parse_dates = kwds.pop('parse_dates', False)
self.parse_dates = _validate_parse_dates_arg(
kwds.pop('parse_dates', False))
self.date_parser = kwds.pop('date_parser', None)
self.dayfirst = kwds.pop('dayfirst', False)
self.keep_date_col = kwds.pop('keep_date_col', False)
Expand Down
Loading