Skip to content

API: make Timestamp/Timedelta _as_unit public as_unit #48819

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Nov 10, 2022

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

round_ok : bool, default True
If False and the conversion requires rounding, raise.

Returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an Examples section too? (Hoping to have all public APIs have examples in the near future)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would make sense, but first i think we should determine if we want unit to somehow show up in the repr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, frequency didn't end up in the repr before right? (just as a comparison note)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 1.5.x freq is in the repr when it is non-None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be helpful then to have the unit resolution show up in the repr then (guess it would be an API change)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yah that's on my todo list, just not a priority ATM

@mroeschke mroeschke added Timedelta Timedelta data type Non-Nano datetime64/timedelta64 with non-nanosecond resolution Timestamp pd.Timestamp and associated methods labels Sep 28, 2022
@jbrockmendel
Copy link
Member Author

@jreback mentioned this earlier this week

@jbrockmendel
Copy link
Member Author

Following today's call, I'm leaning towards changing "reso" to "creso" for the cdef methods and "unit" to "reso" for the python methods.

@jbrockmendel
Copy link
Member Author

After spending more time on this, im again really reticent to use "reso" and "as_reso" for the public keyword/attribute/method referring to the unit abbreviation (s, ms, us, ns). "unit" is just used everywhere else. DatetimeTZDtype refers to it as .unit. numpy docs call it unit (though it isnt a keyword in the dt64/td64 constructors). IIUC pyarrow dtypes have a .unit attribute for this.

I think the public-facing keyword/attribute/method should be unit/unit/as_unit. The argument against has been that in the constructors this means "unit" plays two roles: 1) how to interpret int inputs, 2) how to represent the data internally. Increasingly I think that if a user passes ints and unit="s", it makes more sense to keep this as second-resolution than cast to nanosecond (and should be more performant). Moreover, doing this would avoid some overflow cases.

That leaves cases where a user passes a unit that is supported for input but not for storage, i.e. Y, M, D, h, m. For these we'd treat the inputs the same and use the closest-supported (i.e. second) for storage.

Thoughts?

@mroeschke
Copy link
Member

I think the public-facing keyword/attribute/method should be unit/unit/as_unit

Yeah I would be okay if the "problematic" situation of to_datetime/to_timedelta/Timestamp/TImedelta(ints, unit="s") to interpret [ints] as epoch seconds and having the resulting unit be seconds, so having the public naming be unit would be okay to me.

@jbrockmendel jbrockmendel marked this pull request as draft October 18, 2022 16:15
@jbrockmendel jbrockmendel marked this pull request as ready for review November 2, 2022 19:04
@jbrockmendel
Copy link
Member Author

all good here? we have the option of changing unit->reso later, but for now id like to get started on some follow-ups to this.

@mroeschke mroeschke added this to the 2.0 milestone Nov 10, 2022
Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I am still partial to unit over reso (for now). Good to have examples in as_unit once the unit-in-the-repr question is resolved.

@mroeschke mroeschke merged commit f3c46cd into pandas-dev:main Nov 10, 2022
@mroeschke
Copy link
Member

Thanks @jbrockmendel

codamuse pushed a commit to codamuse/pandas that referenced this pull request Nov 12, 2022
* API: make Timestamp/Timedelta _as_unit public as_unit

* update test

* update test

* update tests

* fix pyi typo

* fixup

* fixup
mliu08 pushed a commit to mliu08/pandas that referenced this pull request Nov 27, 2022
* API: make Timestamp/Timedelta _as_unit public as_unit

* update test

* update test

* update tests

* fix pyi typo

* fixup

* fixup
davetapley added a commit to JEFuller/pandas-stubs that referenced this pull request Nov 10, 2023
davetapley added a commit to JEFuller/pandas-stubs that referenced this pull request Nov 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Non-Nano datetime64/timedelta64 with non-nanosecond resolution Timedelta Timedelta data type Timestamp pd.Timestamp and associated methods
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants