Skip to content

Improve date parsing #722

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 20, 2020
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion jsonschema/_format.py
Original file line number Diff line number Diff line change
Expand Up @@ -323,7 +323,10 @@ def is_regex(instance):
def is_date(instance):
if not isinstance(instance, str):
return True
return datetime.datetime.strptime(instance, "%Y-%m-%d")
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but might as well -- it'll save doing the "wrong" thing on each call on Py36 if we do this only once rather than each time we validate.

I.e. if we move this into a try/except, and do:

if hasattr(datetime.date, "fromisoformat"):
     _is_date = datetime.date.fromisoformat
else:
    def _is_date(instance):
        return datetime.datetime.strptime(instance, "%Y-%m-%d")

def is_date(...):
    return _is_date(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully got what you were after here, but happy to take another crack at it if I messed up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have been less lazy in writing it out :) -- so the point is (which is true of your commit here too) -- as is, every time the is_date validation is applied, it checks to see which implementation should be used. So on Py36, that means you first try fromisoformat, then that fails, then you try the old way, but then you "forget" about the fact that it didn't work and the next time the function is called, you'll again try the fromisoformat, again find it doesn't work, etc.

Instead, what I was trying to say was to move the check outside of the is_date function entirely. In other words, globally at the top level in the module:

if hasattr(datetime.date, "fromisoformat"):
     _is_date = datetime.date.fromisoformat
else:
    def _is_date(instance):
        return datetime.datetime.strptime(instance, "%Y-%m-%d")

@_checks_drafts(draft3="date", draft7="date", raises=ValueError)
def is_date(instance):
    if not isinstance(instance, str):
        return True
    return _is_date(instance)

Now, we only check once on import which implementation should be used, and then if you call that function 1000 times, we've already figured out the right implementation ahead of time.

Does that make more sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, got it. If I'd looked around a bit more in that file I'd have seen you doing that kind of thing in other spots. This one should be resolved now.

return datetime.date.fromisoformat(instance)
except AttributeError:
return datetime.datetime.strptime(instance, "%Y-%m-%d")


@_checks_drafts(draft3="time", raises=ValueError)
Expand Down
42 changes: 22 additions & 20 deletions jsonschema/tests/test_jsonschema_test_suite.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,24 @@ def narrow_unicode_build(test): # pragma: no cover
return


if sys.version_info < (3, 7):
message = "Not running tests which require Python 3.7 or greater"

def python_lt_37_build(test):
return skip(
message=message,
subject="date",
description="invalidates non-padded month dates",
)(test) or skip(
message=message,
subject="date",
description="invalidates non-padded day dates",
)(test)
else:
def python_lt_37_build(test):
return


TestDraft3 = DRAFT3.to_unittest_testcase(
DRAFT3.tests(),
DRAFT3.format_tests(),
Expand All @@ -97,18 +115,9 @@ def narrow_unicode_build(test): # pragma: no cover
format_checker=draft3_format_checker,
skip=lambda test: (
narrow_unicode_build(test)
or python_lt_37_build(test)
or missing_format(draft3_format_checker)(test)
or complex_email_validation(test)
or skip(
message=bug(685),
subject="date",
description="invalidates non-padded month dates",
)(test)
or skip(
message=bug(685),
subject="date",
description="invalidates non-padded day dates",
)(test)
or skip(
message="Upstream bug in strict_rfc3339",
subject="date-time",
Expand Down Expand Up @@ -158,6 +167,7 @@ def narrow_unicode_build(test): # pragma: no cover
format_checker=draft4_format_checker,
skip=lambda test: (
narrow_unicode_build(test)
or python_lt_37_build(test)
or missing_format(draft4_format_checker)(test)
or complex_email_validation(test)
or skip(
Expand Down Expand Up @@ -237,6 +247,7 @@ def narrow_unicode_build(test): # pragma: no cover
format_checker=draft6_format_checker,
skip=lambda test: (
narrow_unicode_build(test)
or python_lt_37_build(test)
or missing_format(draft6_format_checker)(test)
or complex_email_validation(test)
or skip(
Expand Down Expand Up @@ -337,6 +348,7 @@ def narrow_unicode_build(test): # pragma: no cover
format_checker=draft7_format_checker,
skip=lambda test: (
narrow_unicode_build(test)
or python_lt_37_build(test)
or missing_format(draft7_format_checker)(test)
or complex_email_validation(test)
or skip(
Expand Down Expand Up @@ -368,16 +380,6 @@ def narrow_unicode_build(test): # pragma: no cover
subject="refRemote",
case_description="base URI change - change folder in subschema",
)(test)
or skip(
message=bug(685),
subject="date",
description="invalidates non-padded month dates",
)(test)
or skip(
message=bug(685),
subject="date",
description="invalidates non-padded day dates",
)(test)
or skip(
message="Upstream bug in strict_rfc3339",
subject="date-time",
Expand Down