-
Notifications
You must be signed in to change notification settings - Fork 21
Add namespace.date #289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Add namespace.date #289
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
eb03b45
drive by: put __all__ in alphabetical order
MarcoGorelli 686a89a
add to namespace, type return as scalar
MarcoGorelli 5c1bf67
Merge branch 'main' into namespace-date
MarcoGorelli 556376a
Merge remote-tracking branch 'upstream/main' into namespace-date
MarcoGorelli 82309fe
Merge branch 'namespace-date' of github.com:MarcoGorelli/dataframe-ap…
MarcoGorelli 99d3ce9
remove outdated type ignore
MarcoGorelli 079d320
note range of support
MarcoGorelli 6898510
Update spec/API_specification/dataframe_api/__init__.py
MarcoGorelli File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to define rules for this:
datetime.date
doesn't allow for negative years. Pandas, PyArrow, cudf, duckdb, etc. all support negative dates. Unsure if Polars supports negative dates, my attempts at constructing / casting into a negative date Series / Scalar have failed thus far.datetime.date
has a max year of 9999 which doesn't support the full range of years that can be supported with anint32
number of days since epoch, which is what PyArrow, cudf, duckdb, and others support. Again, unsure what Polars supports here.Do we wish to follow the constraints of the standard library
datetime.date
or do something else?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be supported in polars, could you show what you tried that failed please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these throw, in either
__repr__
or in trying to take a scalar element via__getitem__(0)
likely from trying to use Pythondatetime.datetime
ordatetime.date
objects:Additionally, trying a few datetime operations seems to silently yield incorrect results:
PyArrow similarly fails in
__repr__
from trying to usedatetime.date
objects:But when calculating things, pyarrow seems to be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those throw when you try to bring unsupported things into Python land
but they're valid so long as you stay within Polars:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example with using
.dt.year()
returns incorrect results:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, that looks like a bug, thanks! pola-rs/polars#11991
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that should throw an "out-bounds" error, as it's outside the bounds supported by the underlying
Chrono
library in Rust (which are approx. -280,000 to 280,000)But negative dates are supported - one reason
pl.date
exists (other than to support expressions) is that you pass things to it which the stdlibdate
doesn't support, like negative dates and nanosecondsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, well either way it looks like there's different restrictions across different libraries and lots of bugs across the board with regards to
__repr__
and__getitem__
.It looks like Polars limits dates to greater than or equal to
-262145-01-01
and less than or equal to262143-12-31
based on its Rust implementation. Other libraries generally use a 32-bit signed integer of days since epoch which yields greater than or equal to-5877641-06-23
and less than or equal to5881580-07-11
.How do you propose we move forward here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion, but I'd be OK with just saying that the full 32-bit signed integer range of days is supported and just noting that Polars isn't 100% compliant in this regard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me. I'll start raising some issues upstream across libraries regarding the
__repr__
issues I uncovered here.