Return type of pow #174

MarcoGorelli · 2023-05-24T11:20:44Z

Seems like there are some inconsistencies:

In [6]: pd.Series([1,2,3])**2
Out[6]:
0    1
1    4
2    9
dtype: int64

In [7]: pd.Series([1,2,3])**-1
---------------------------------------------------------------------------
ValueError: Integers to negative integer powers are not allowed.

In [8]: pl.Series([1,2,3]).pow(2)
Out[8]:
shape: (3,)
Series: '' [f64]
[
        1.0
        4.0
        9.0
]

In [9]: pl.Series([1,2,3]).pow(-1)
Out[9]:
shape: (3,)
Series: '' [f64]
[
        1.0
        0.5
        0.333333
]

polars always returns floats, whereas pandas returns integers or floats depending on the inputs, and may raise depending on the value of the exponent

What do we want to do here?

rgommers · 2023-05-24T14:47:36Z

I think we should refer to the array API standard's description for pow. It's admittedly a bit hairy, but for any numerical behavior like this I think array libraries have thought about this a lot harder than dataframe libraries, and we should not reinvent this particular wheel.

For the examples given, that spec says, for col with integer dtype:

col**2 should give an integer dtype result
col**(-1) is implementation-defined and may not be allowed

In this particular case I think the Polars choice isn't completely unreasonable, because it's the other choice that could be made to extrapolate Python's builtin behavior for scalars to a column:

>>> 2**2
4
>>> 2**-1
0.5
>>> type(2**2)
<class 'int'>
>>> type(2**-1)
<class 'float'>

But it's the opposite choice made by all array libraries and by Pandas, and makes it harder to work with lower-precision dtypes when everything ends up being float64. So it's not an ideal choice either and I'm hoping it can still be reversed.
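To make the scalar extrapolation concrete, here is a small plain-Python sketch (the lists are made-up data, not taken from any library). Python picks `int` or `float` per operation based on the exponent's value, so applying `**` element by element can give mixed result types, whereas a typed column has to settle on a single dtype:

```python
# Hypothetical data: one column of bases and one column of exponents.
bases = [2, 2, 2]
exponents = [2, 0, -1]

# Python decides int vs. float separately for each operation,
# based on the *value* of the exponent.
results = [b ** e for b, e in zip(bases, exponents)]
result_types = [type(r).__name__ for r in results]

print(results)
print(result_types)
```

A column cannot hold mixed types like this, so a library either always promotes to float (the Polars choice shown here) or keeps integers and rejects negative exponents (the array-library choice).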

>>> import polars as pl
>>> col = pl.Series([1, 2, 3], dtype=pl.Int32)
>>> col**2
shape: (3,)
Series: '' [f64]
[
        1.0
        4.0
        9.0
]
>>> col.pow(2).dtype
Float64
>>> (col**2).dtype
Float64

I don't see any documented casting rules in the Polars docs, although it seems dtype-preserving in general with pow being an exception, and there is manual casting support.

I haven't checked all the other dataframe libraries yet, would be good to check that first.

If it's not possible to make the pow behavior uniform, I think the other option is to recommend the array API standard behavior but not make it mandatory.

MarcoGorelli · 2023-05-24T15:35:43Z

> If it's not possible to make the pow behavior uniform

We can always work around this in the standard, no big deal - following the Array API looks good to me

jorisvandenbossche · 2023-05-25T13:29:31Z

As another data point: the pyarrow.compute power kernel for integer data and an integer exponent preserves the integer dtype for positive integer exponents, and raises an error for negative integer exponents. So that should be compatible with the Array API specification.

MarcoGorelli · 2023-05-26T10:44:04Z

thanks @jorisvandenbossche ! I like that, I'd suggest standardising to that

MarcoGorelli · 2023-06-20T17:20:57Z

in the last call we went for following the pyarrow behaviour
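As a rough illustration of that rule, here is a minimal plain-Python sketch. `int_pow` is a hypothetical helper written for this comment, not a function from pyarrow, pandas, or Polars, and the error message is an assumption; a real implementation would decide based on dtypes rather than Python lists.

```python
from typing import List


def int_pow(values: List[int], exponent: int) -> List[int]:
    """Hypothetical helper: integer ** integer stays integer.

    A negative integer exponent raises instead of silently
    switching the result to floats.
    """
    if exponent < 0:
        raise ValueError("integers to negative integer powers are not allowed")
    return [v ** exponent for v in values]


# Integer exponent that is not negative: the integer type is preserved.
squares = int_pow([1, 2, 3], 2)

# Negative integer exponent: an error rather than a float result.
try:
    int_pow([1, 2, 3], -1)
    negative_outcome = "no error"
except ValueError as exc:
    negative_outcome = f"ValueError: {exc}"

print(squares)
print(negative_outcome)
```

Under this rule, anyone who wants reciprocals casts the column to a float dtype first, so the float result is an explicit choice.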

MarcoGorelli mentioned this issue May 26, 2023

pow: return Int64 for Int64 ** Int64, and raise if exponent is negative integer? pola-rs/polars#9051

Closed

2 tasks

MarcoGorelli mentioned this issue Jun 20, 2023

Note __pow__ return type #182

Merged

MarcoGorelli closed this as completed in #182 Jun 29, 2023
