Binary operations between Numpy array and Pandas series convert to array dtype before executing #8674

8one6 · 2014-10-29T16:29:47Z

It looks like when you do a binary operation between a numpy array and a pandas Series, the Series is cast to the same dtype as the numpy array. When the array holds ints and the Series holds floats, this can give odd results for division in Python 2.x. This surprised me because, element-wise, a float/int computation would upcast the int operand to a float. But not here in this vectorized case:

import numpy as np
import pandas as pd

j = np.array([0] * 5)
k = np.random.randn(5)

print j / k
print (j / pd.Series(k)).values
print (pd.Series(j) / k).values
print (pd.Series(j) / pd.Series(k)).values

####output####

[-0.  0. -0. -0.  0.]
[ inf  inf  inf  inf  inf]
[-0.  0. -0. -0.  0.]
[-0.  0. -0. -0.  0.]

So in the two "pure" cases (array and array, series and series) the ints are upcast to floats and we get 0's everywhere. In the 2nd mixed case (3rd case overall), everything gets converted to floats to match the array and everything goes fine. But in the 1st mixed case (2nd case overall) everything gets converted to ints to match the array and so you get what was, at least to me, an unexpected infinite result.
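For anyone checking this on a current install, here is a minimal sketch of the same comparison written for Python 3, where `/` is always true division. It assumes a reasonably recent numpy and pandas (`default_rng` and `to_numpy` are not available in the 2014 versions), and the seeded generator is only there to make the run repeatable. It is a rough check, not the test that was eventually added to pandas.

```python
import numpy as np
import pandas as pd

# Same shapes as the report: an integer array of zeros and a float array.
j = np.array([0] * 5)
k = np.random.default_rng(0).standard_normal(5)  # seeded for repeatability

# In Python 3, "/" is true division, so array / array serves as the reference.
expected = j / k

results = {
    "array / series": (j / pd.Series(k)).to_numpy(),
    "series / array": (pd.Series(j) / k).to_numpy(),
    "series / series": (pd.Series(j) / pd.Series(k)).to_numpy(),
}

for label, values in results.items():
    # Every combination should give (signed) zeros, never inf.
    assert np.isfinite(values).all(), label
    assert np.array_equal(values, expected), label
```

If the mixed `array / series` case regressed, the first assertion would fail with that label.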


jreback · 2014-10-29T20:48:47Z

actually this is a bug
this line: https://github.com/pydata/pandas/blob/master/pandas/core/common.py#L1351 is wrong
it should be startswith __r

reverse ops are passed to pandas (so in this example it is passed as (Series(j), k))
with the name being __rtruediv__
this is done so pandas handles these kinds of things

so zeros can be filled correctly (as numpy does integer division and things like that, so you get weird results at times)

so if you would like to do a PR with your test cases and make that small change, it should work
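To illustrate the reverse-op mechanism described above, here is a small sketch in plain Python. The `Wrapped` class and its `_arith` helper are hypothetical toys, not pandas code; the only parts taken from the comment are the method name `__rtruediv__` and the idea of detecting a reversed op with `startswith("__r")`.

```python
class Wrapped:
    """Toy container (hypothetical, not pandas internals)."""

    def __init__(self, values):
        self.values = list(values)

    def _arith(self, other, name):
        # A reversed op means `other` was the LEFT operand of the expression.
        reversed_op = name.startswith("__r")
        out = []
        for v in self.values:
            left, right = (other, v) if reversed_op else (v, other)
            out.append(float(left) / float(right))
        return name, out

    def __truediv__(self, other):
        return self._arith(other, "__truediv__")

    def __rtruediv__(self, other):
        # Python calls this when the left operand returns NotImplemented.
        return self._arith(other, "__rtruediv__")


# int does not know how to divide by Wrapped, so Python falls back
# to Wrapped.__rtruediv__ with the int as `other`.
rname, rout = 1 / Wrapped([2.0, 4.0])
fname, fout = Wrapped([2.0, 4.0]) / 2

print(rname, rout)
print(fname, fout)
```

If the name check missed the reversed case, the operands would be swapped and `1 / Wrapped(...)` would compute `v / 1` instead of `1 / v`.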

jreback added the IO CSV, Bug, Dtype Conversions, and Numeric Operations labels and removed the IO CSV label Oct 29, 2014

jreback added this to the 0.15.1 milestone Oct 29, 2014

jreback modified the milestones: 0.16.0, 0.15.2 Nov 30, 2014

jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015

mroeschke added a commit to mroeschke/pandas that referenced this issue Dec 30, 2016

TST: Series Division with zeros numpy array (pandas-dev#8674)

7a614c4

mroeschke mentioned this issue Dec 30, 2016

TST: Series Division with zeros numpy array (#8674) #15013

Merged

3 tasks

jorisvandenbossche modified the milestones: 0.20.0, Next Major Release Dec 30, 2016

jorisvandenbossche closed this as completed in #15013 Dec 30, 2016

jorisvandenbossche pushed a commit that referenced this issue Dec 30, 2016

TST: Series Division with zeros numpy array (#8674) (#15013)

d2a4b33
