-
Notifications
You must be signed in to change notification settings - Fork 21
Add a to_array_obj
method to Column
#163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,54 @@ class Column: | |
constructor functions or an already-created dataframe object retrieved via | ||
|
||
""" | ||
def to_array_obj(self) -> object: | ||
""" | ||
Obtain an object that can be used as input to ``asarray`` or ``from_dlpack`` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These are very different where I don't think using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mostly, yes. The main pain point there is that the array API standard doesn't mandate that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So we can just drop |
||
|
||
The returned object must only be used for one thing: calling the ``asarray`` | ||
or ``from_dlpack`` functions of a library implementing the array API | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 for only |
||
standard, or equivalent ``asarray``/``from_dlpack`` functions to the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 |
||
one in that standard. In practice that means that the returned object | ||
may either already *be* an array type of a specific array library, or | ||
it may implement one or more of the following methods or behaviors: | ||
|
||
- ``__dlpack__`` | ||
- the Python buffer protocol | ||
- ``__array__`` | ||
- ``__array_interface__`` | ||
- ``__cuda_array_interface__`` | ||
- the Python ``Sequence`` interface | ||
- any other method that is known to work with an ``asarray`` function | ||
in a library of interest | ||
Comment on lines
+37
to
+38
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the goal of this function to work generally with multiple libraries There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'd say neither is 100% accurate. It's more the former, however it does make sense to add library-specific methods - and in fact, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess my fear is that say some dataframe library There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think that fear is justified, because:
|
||
|
||
Importantly, the returned object must only implement/expose those | ||
methods that work. An ``asarray`` function may try to use any of the | ||
above methods, and the order in which it does so is not guaranteed. | ||
Hence it is expected that if a method is present, it works correctly | ||
and does not raise an exception (e.g., because of an unsupported dtype | ||
or device). | ||
rgommers marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
.. admonition:: Tip | ||
|
||
One way to expose methods only if they will work for the dtype, | ||
device, and other characteristics of the column is to hide access to | ||
the methods dynamically, so ``hasattr`` does the right thing. For | ||
example:: | ||
|
||
def __dir__(self): | ||
methods = dir(self.__class__) | ||
attrs = list(self.__dict__.keys()) | ||
keys = methods + attrs | ||
if not self.dtype in _dlpack_supported_dtypes: | ||
keys.remove("__dlpack__") | ||
|
||
return keys | ||
|
||
def __dlpack__(self): | ||
... | ||
|
||
""" | ||
|
||
@classmethod | ||
def from_sequence(cls, sequence: Sequence[object], dtype: dtype) -> Column: | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to the PR bu the sentence seems to be missing something?