Binary operations between Numpy array and Pandas series convert to array dtype before executing #8674

Closed
8one6 opened this issue Oct 29, 2014 · 1 comment · Fixed by #15013
Labels: Bug; Dtype Conversions (Unexpected or buggy dtype conversions); Numeric Operations (Arithmetic, Comparison, and Logical operations)

8one6 commented Oct 29, 2014

It looks like when you do a binary operation between a NumPy array and a pandas Series, the Series is cast to the same dtype as the array. When the array holds ints and the Series holds floats, this can give odd results for division in Python 2.x. This surprised me because, element-wise, a mixed int/float computation would upcast the int operand to a float. But not here in this vectorized case:

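import numpy as np
import pandas as pd

j = np.array([0] * 5)    # integer zeros
k = np.random.randn(5)   # random floats

# Parenthesized print calls parse the same way in Python 2 and Python 3.
print(j / k)                                 # array / array
print((j / pd.Series(k)).values)             # array / Series
print((pd.Series(j) / k).values)             # Series / array
print((pd.Series(j) / pd.Series(k)).values)  # Series / Series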

####output####

[-0.  0. -0. -0.  0.]
[ inf  inf  inf  inf  inf]
[-0.  0. -0. -0.  0.]
[-0.  0. -0. -0.  0.]

So in the two "pure" cases (array and array, Series and Series) the ints are upcast to floats and we get zeros everywhere. In the 2nd mixed case (3rd case overall), everything gets converted to floats to match the array, and everything goes fine. But in the 1st mixed case (2nd case overall), everything gets converted to ints to match the array, so you get what was, at least to me, an unexpected infinite result.
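For comparison, here is a minimal sketch of the same four expressions written as a consistency check. It assumes Python 3 (where `/` is always true division) and a pandas release that includes the fix referenced at the top of this issue. The fixed non-zero values in `k` and the names `results` and `all_zero` are illustrative choices, not part of the original report.

```python
import numpy as np
import pandas as pd

j = np.array([0] * 5)                         # integer zeros, as in the report
k = np.array([-1.5, 0.5, -2.0, -0.25, 3.0])   # fixed non-zero floats (assumption)

# The same four operand combinations as in the report.
results = {
    "array / array": j / k,
    "array / Series": (j / pd.Series(k)).values,
    "Series / array": (pd.Series(j) / k).values,
    "Series / Series": (pd.Series(j) / pd.Series(k)).values,
}

# 0 divided by a non-zero float is +0.0 or -0.0; both compare equal to 0.
all_zero = {name: bool(np.all(np.asarray(r) == 0)) for name, r in results.items()}

for name, ok in all_zero.items():
    print(name, "-> all zeros" if ok else "-> unexpected values")
```

If the "array / Series" entry is reported as unexpected, the installed pandas version probably still shows the behaviour described above.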

Contributor

jreback commented Oct 29, 2014

Actually, this is a bug.
This line: https://github.com/pydata/pandas/blob/master/pandas/core/common.py#L1351 is wrong;
it should be startswith __r.

Reverse ops are passed to pandas (so in this example it is passed as (Series(j), k))
with the name being __rtruediv__.
This is done so that pandas handles these kinds of things,

so zeros can be filled correctly (numpy does integer division and things like that, so you get weird results at times).

So if you would like to do a PR with your test cases and make that small change, it should work.
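To make the point about the name check concrete, here is a toy sketch of how a zero-filling step can go wrong when a reflected op is not recognised. It is not the pandas code behind the link: `mark_zero_division` and its `prefix` parameter are hypothetical, and the `"r"` prefix is only a guess at the kind of mistake being described.

```python
import numpy as np

def mark_zero_division(result, left, right, name, prefix="__r"):
    """Toy stand-in (not pandas code): write inf wherever the denominator is 0.

    For a reflected op such as __rtruediv__, the real expression is
    right / left, so the operands are swapped before looking for zeros.
    """
    if name.startswith(prefix):
        left, right = right, left
    out = np.asarray(result, dtype=float).copy()
    out[np.asarray(right) == 0] = np.inf
    return out

j = np.array([0] * 5)                         # integer zeros (the numerator)
k = np.array([-1.5, 0.5, -2.0, -0.25, 3.0])   # non-zero floats (the denominator)
result = j / k                                # the true answer: all zeros

# j / Series(k) reaches the Series as a reflected op: self=k, other=j.
fixed = mark_zero_division(result, k, j, "__rtruediv__", prefix="__r")
buggy = mark_zero_division(result, k, j, "__rtruediv__", prefix="r")

print("prefix '__r':", fixed)   # operands swapped, k has no zeros, nothing filled
print("prefix 'r':  ", buggy)   # no swap, j is treated as the denominator
```

Because `"__rtruediv__"` begins with two underscores, a check against the prefix `"r"` never matches, so the operands are not swapped and the all-zero numerator is treated as the denominator. That would yield the all-inf row from the report.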

jreback added the IO CSV (read_csv, to_csv), Bug, Dtype Conversions (Unexpected or buggy dtype conversions), and Numeric Operations (Arithmetic, Comparison, and Logical operations) labels and removed the IO CSV (read_csv, to_csv) label Oct 29, 2014
jreback added this to the 0.15.1 milestone Oct 29, 2014
jreback modified the milestones: 0.16.0, 0.15.2 Nov 30, 2014
jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
mroeschke added a commit to mroeschke/pandas that referenced this issue Dec 30, 2016
jorisvandenbossche modified the milestones: 0.20.0, Next Major Release Dec 30, 2016

3 participants