Step by step guide to understand protocol discussion, and decisions to be made #31

Closed

datapythonista
Member

This document complements #30 and provides detailed information on the different options for the dataframe exchange protocol and the decisions that need to be made.

Since the group has different backgrounds, the aim is for everybody reading this document to be on the same page about the topics under discussion.

Publishing as a PR, so people can:

  • Disagree with the topics presented or the options offered, propose clarifications...
  • Comment with opinions on specific points

At the end of the document I summarize some of the decisions that IMHO need to be made, and that we can discuss in the call on Thursday. Feel free to propose other points.

- Lack of Python object support (e.g. strings as Python objects in pandas)
- Booleans represented in a single bit, and some implementations may prefer a one byte representation
- No support for bfloat type (not sure if any dataframe implementation uses them currently)
- Limited number of units for timestamps and durations (nano/micro/milli-seconds, and seconds are
Member

Arrow supports "extension types" (consisting of one of its native types + metadata on how to interpret it), so as long as the physical storage is supported by Arrow (which I think is the case for all those examples), if needed, all those types could be supported as extension types defined by the dataframe protocol.
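To make the "native type + metadata" idea concrete, here is a minimal, library-free sketch of how a one-byte boolean (one of the cases listed above) could be described that way. It deliberately does not use Arrow's actual API: `ExtensionType`, `ExtensionColumn`, `bool8_column`, and the name `example.bool8` are all hypothetical, and the standard-library `array` module stands in for a native uint8 buffer.

```python
from array import array
from dataclasses import dataclass, field


@dataclass
class ExtensionType:
    """Hypothetical stand-in: a native storage type plus interpretation metadata."""
    name: str              # hypothetical extension name
    storage_typecode: str  # physical storage, expressed as an `array` typecode
    metadata: dict = field(default_factory=dict)


@dataclass
class ExtensionColumn:
    """A column stored in a native layout and interpreted through its type."""
    type: ExtensionType
    storage: array

    def to_pylist(self):
        # The metadata, not the physical storage, decides the logical values.
        if self.type.metadata.get("logical") == "bool":
            return [bool(v) for v in self.storage]
        return list(self.storage)


# One-byte booleans: the storage is plain uint8, the logical meaning is boolean.
BOOL8 = ExtensionType("example.bool8", "B", {"logical": "bool"})


def bool8_column(values):
    """Build a one-byte-per-value boolean column from any iterable of truthy values."""
    storage = array(BOOL8.storage_typecode, (int(bool(v)) for v in values))
    return ExtensionColumn(BOOL8, storage)


col = bool8_column([True, False, True])
print(col.to_pylist(), col.storage.itemsize)
```

A consumer that does not recognize the extension name could still read the column as ordinary uint8 values, which is the property that makes this approach attractive for an exchange protocol.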

@rgommers closed this Nov 14, 2020
@rgommers deleted the branch master November 14, 2020 17:30

3 participants