Add 2d constructor #208

Merged
merged 4 commits into from
Aug 1, 2023
14 changes: 14 additions & 0 deletions spec/API_specification/dataframe_api/__init__.py
@@ -96,6 +96,20 @@ def dataframe_from_dict(data: Mapping[str, Column[Any]]) -> DataFrame:
    """
    ...

def dataframe_from_2d_array(array: Any) -> DataFrame:
Member

This should have *, names: list[str] I think, with the length of the list matching the number of columns?

Also a dtype keyword, because we don't really want to get in the business of inferring it (going from array.dtype to a dtype in this namespace is nontrivial; much easier for the caller who has the array namespace at hand).
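
A minimal sketch of the signature suggested above, assuming keyword-only `names` and `dtype` parameters. The `DataFrame` placeholder class, the `Any` type for `dtype`, the string dtype value, and the validation logic are illustrative assumptions, not part of the spec:

```python
from types import SimpleNamespace
from typing import Any


class DataFrame:
    """Placeholder for the spec's DataFrame (hypothetical, for this sketch only)."""

    def __init__(self, array: Any, names: list[str], dtype: Any) -> None:
        self.array = array
        self.names = names
        self.dtype = dtype


def dataframe_from_2d_array(array: Any, *, names: list[str], dtype: Any) -> DataFrame:
    # Assumption: the array exposes a `shape` tuple, as array API arrays do.
    if len(array.shape) != 2:
        raise ValueError(f"expected a 2D array, got {len(array.shape)} dimension(s)")
    n_columns = array.shape[1]
    if len(names) != n_columns:
        raise ValueError(f"got {len(names)} names for {n_columns} columns")
    # The caller supplies `dtype`, so nothing is inferred from `array.dtype`.
    return DataFrame(array, names, dtype)


# Stand-in object with only a `shape` attribute; a real caller would pass an array.
fake_array = SimpleNamespace(shape=(3, 2))
# "int64" is a placeholder value; the spec's actual dtype objects are not modeled here.
df = dataframe_from_2d_array(fake_array, names=["a", "b"], dtype="int64")
```

Making both parameters keyword-only keeps call sites explicit and leaves dtype selection with the caller, who has the array namespace at hand.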

    """
    Construct DataFrame from 2D array.

    See `column_from_sequece` for related 1D function.
Member

typo in sequence

Member

More importantly, I foresee a column_from_1d_array request, because a 1-D array is not a sequence.

Contributor Author

I wasn't aware, thanks

just for my understanding, what makes it not a sequence? from the docs https://docs.python.org/3/glossary.html#term-sequence :

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence

, don't arrays meet the requirement?

Member

See data-apis/array-api#481 for len(). A second issue is that iterating over it is going to return 0-D arrays of the dtype of the input array, not Python scalars.
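
A toy illustration of both points, using only the standard library. `ToyArray` is a hypothetical stand-in, not a real array library class; it assumes, as the comment above implies, that `__len__` is not guaranteed and that indexing a 1-D array yields a 0-D array:

```python
from collections.abc import Sequence


class ToyArray:
    """Hypothetical stand-in for an array object; not a real library class."""

    def __init__(self, values, shape):
        self._values = list(values)
        self.shape = shape  # () for 0-D, (n,) for 1-D

    def __getitem__(self, index):
        # Indexing a 1-D array yields a 0-D array, not a Python scalar.
        if not self.shape:
            raise IndexError("cannot index a 0-D array")
        if not 0 <= index < self.shape[0]:
            raise IndexError(index)
        return ToyArray([self._values[index]], shape=())

    # Deliberately no __len__: assumed here not to be guaranteed.


arr = ToyArray([1, 2, 3], shape=(3,))

is_sequence = isinstance(arr, Sequence)  # not registered as a Sequence
try:
    len(arr)
    has_len = True
except TypeError:
    has_len = False

# Iteration falls back to calling __getitem__ with 0, 1, 2, ... until IndexError.
elements = list(arr)
all_python_ints = all(isinstance(x, int) for x in elements)
```

Here `is_sequence`, `has_len`, and `all_python_ints` all come out false: the object can be indexed and iterated, yet it does not behave like a Python sequence of scalars.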

Collaborator

Are arrays iterable per the array API? I would strongly prefer to differentiate array handling from iterable handling, since arrays can often be handled zero-copy, or at least far more efficiently than iterables.
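
To make the zero-copy distinction concrete, here is a small standard-library sketch. It uses `array.array` and `memoryview` purely as stand-ins for a buffer-backed array; it is not how any dataframe library implements its constructors:

```python
from array import array

buf = array("d", [1.0, 2.0, 3.0])

view = memoryview(buf)  # shares the underlying buffer: no data is copied
copied = list(buf)      # iterable path: every element is copied into a new list

buf[0] = 99.0           # mutate the original storage

view_sees_change = view[0] == 99.0    # the view reflects the mutation
copy_sees_change = copied[0] == 99.0  # the copy still holds the old value
```

The view observes the mutation because it shares memory with `buf`, while the list built by iteration does not. A generic iterable path can only do the latter.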

Contributor Author

sure, I've added column from 1d too


    Only Array-API-compliant 2D arrays are supported.

    Returns
    -------
    DataFrame
    """
    ...

class null:
    """
    A `null` object to represent missing data.