Skip to content

ENH: infer Timestamp unit in non-iso paths #51039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 1, 2023

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@MarcoGorelli MarcoGorelli self-requested a review January 28, 2023 12:37
@MarcoGorelli MarcoGorelli added the Non-Nano datetime64/timedelta64 with non-nanosecond resolution label Jan 30, 2023
Comment on lines 37 to 43
def test_constructor_str_infer_reso(self):
# non-iso8601 path
ts = Timestamp("2016-01-01 1:30:01 PM")
assert ts.unit == "s"

ts = Timestamp("2016 June 3 15:25:01.345")
assert ts.unit == "ms"
Copy link
Member

@MarcoGorelli MarcoGorelli Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need 3 tests here, one for each path inside parse_datetime_string:

  • dt = _parse_delimited_date(date_string, dayfirst, out_bestunit)
  • _parse_dateabbr_string(
  • dt = dateutil_parse(date_string, default=_DEFAULT_DATETIME,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, updated

# non-iso8601 path

# _parse_delimited_date path
ts = Timestamp("2023/01/30")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this hits the iso path, you'll need

ts = Timestamp("01/30/2023")

also, would prefer to parametrise these - you can put _parse_delimited_date path (and the others) as id for the different cases

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to use the suggested string. didnt parametrize - on the margin id like to slow the growth of the pytest overhead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just out of interest, why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest collects something like 200k tests and last time i checked that added up to more than 1GB before we even start executing any tests. Similarly autouse fixtures get called for each test which means that much more overhead at runtime (i haven't measured this so assume it is a small-but-adds-up amount).

Doing a little bit more inside the tests and a little bit less outside can trim that footprint and overhead. Adding a couple cases here isn't a big deal, but on the margin i try to get in the habit of leaning against parametrization when it isn't a compelling use case.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @jbrockmendel

@MarcoGorelli MarcoGorelli added this to the 2.0 milestone Jan 31, 2023
@MarcoGorelli MarcoGorelli merged commit aedf135 into pandas-dev:main Feb 1, 2023
@jbrockmendel jbrockmendel deleted the enh-infer-ts branch February 1, 2023 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Non-Nano datetime64/timedelta64 with non-nanosecond resolution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants