| 1 | +# Python builtin types and duck typing |
| 2 | + |
| 3 | +Use of Python's builtin types - `bool`, `int`, `float`, `str`, `dict`, `list`, |
| 4 | +`tuple`, `datetime.datetime`, etc. - is often natural and convenient. However, |
| 5 | +it is also potentially problematic when writing performant dataframe |
| 6 | +library code or supporting devices other than the CPU. |
| 7 | + |
| 8 | +This standard specifies the use of Python types in quite a few places, and uses |
| 9 | +them as type annotations. As a concrete example, consider the `mean` method and |
| 10 | +the `float` it is documented to return, in combination with the `__gt__` method |
| 11 | +(i.e., the `>` operator) on the dataframe: |
| 12 | + |
| 13 | +```python |
| 14 | +class DataFrame: |
| 15 | +    def __gt__(self, other: DataFrame | Scalar) -> DataFrame: |
| 16 | +        ... |
| 17 | +    def get_column_by_name(self, name: str, /) -> Column: |
| 18 | +        ... |
| 19 | + |
| 20 | +class Column: |
| 21 | +    def mean(self, skip_nulls: bool = True) -> float: |
| 22 | +        ... |
| 23 | + |
| 24 | +larger = df2 > df1.get_column_by_name('foo').mean() |
| 25 | +``` |
| 26 | + |
| 27 | +For a GPU dataframe library, it is desirable for all data to reside on the GPU, |
| 28 | +and not incur a performance penalty from synchronizing instances of Python |
| 29 | +builtin types to CPU. In the above example, the `.mean()` call returns a |
| 30 | +`float`. It is likely beneficial, though, to implement this as a library-specific |
| 31 | +scalar object which duck types with `float`. This means that it should (a) have |
| 32 | +the same semantics as a builtin `float` when used within a library, and (b) |
| 33 | +support usage as a `float` outside of the library (i.e., implement |
| 34 | +`__float__`). Duck typing is usually not perfect: for example, `isinstance` |
| 35 | +usage on the float-like duck type will behave differently. Such explicit "type |
| 36 | +of object" checks don't have to be supported. |
| 37 | + |
| 38 | +The following design rule applies everywhere builtin Python types are used |
| 39 | +within this API standard: _where a Python builtin type is specified, an |
| 40 | +implementation may always replace it with an equivalent library-specific type |
| 41 | +that duck types with the Python builtin type._ |
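
To make the design rule concrete, here is a minimal sketch of what a float-like duck type could look like. The class name `DuckFloat` and its internals are hypothetical and not part of the standard; a plain Python `float` stands in for what a GPU library would keep as a device-side value. Only a few operators are shown, whereas a real implementation would cover the full numeric protocol.

```python
from __future__ import annotations

import math


class DuckFloat:
    """Hypothetical library-specific scalar that duck types with `float`."""

    def __init__(self, value: float) -> None:
        # Assumption: a real GPU library would hold a device-side handle here.
        self._value = value

    def __float__(self) -> float:
        # Explicit conversion is where a device-to-host copy would happen.
        return float(self._value)

    @staticmethod
    def _raw(other: object):
        # Unwrap another DuckFloat or accept a plain Python number.
        if isinstance(other, DuckFloat):
            return other._value
        if isinstance(other, (int, float)):
            return other
        return None

    def __add__(self, other: object):
        raw = self._raw(other)
        if raw is None:
            return NotImplemented
        # The result stays inside the library; nothing is converted to float.
        return DuckFloat(self._value + raw)

    __radd__ = __add__

    def __gt__(self, other: object):
        raw = self._raw(other)
        if raw is None:
            return NotImplemented
        return self._value > raw

    def __lt__(self, other: object):
        raw = self._raw(other)
        if raw is None:
            return NotImplemented
        return self._value < raw


mean = DuckFloat(2.5)
total = 1 + mean                     # still a DuckFloat, via __radd__
root = math.sqrt(DuckFloat(4.0))     # works outside the library via __float__
is_builtin = isinstance(mean, float) # False: explicit type checks differ
```

Inside the library, `total` never leaves the duck type, which matches requirement (a); `float(total)` and `math.sqrt(...)` rely only on `__float__`, which matches requirement (b). The `isinstance` result illustrates the kind of explicit type check that does not have to be supported.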