
Cannot set value correctly #15413


Closed
fduxiao opened this issue Feb 15, 2017 · 7 comments · Fixed by #45154
Labels
Bug, Indexing (Related to indexing on series/frames, not to indexes themselves)

Comments

@fduxiao

fduxiao commented Feb 15, 2017

Code Sample, a copy-pastable example if possible

# It's pretty simple, but it doesn't work.
# Here self.stocks is a pd.Series attribute of a class.
# I printed money / price * (1 - r) and confirmed that it is not zero.
# However, self.stocks[ticker] never changes, which confuses me.
self.stocks[ticker] += money / price * (1 - r)

Problem description

See the comments above. It only happens with a certain set of data; it works in other situations. When I tried changing the value of self.stocks[name] first, something even stranger happened.

# This works:
self.stocks[ticker] = 6
self.stocks[ticker] += money / price * (1 - r)
# A different value is printed after the line above.
# But if I change it to
if self.stocks[ticker] == 0:
    self.stocks[ticker] = 0
self.stocks[ticker] += money / price * (1 - r)
# the value stays unchanged, which drove me mad.

Expected Output

self.stocks[ticker] should be changed

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: zh_CN.UTF-8
LOCALE: zh_CN.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 32.2.0
Cython: None
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.2
matplotlib: 2.0.0
openpyxl: 2.4.2
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.2
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None

@jreback
Contributor

jreback commented Feb 15, 2017

you will need to show a copy-pastable example.

jorisvandenbossche added the Needs Info label (Clarification about behavior needed to assess issue) on Feb 16, 2017
@fduxiao
Author

fduxiao commented Feb 17, 2017

Sorry for the slow reply. My code is pretty long. Unzip the attachment, cd into the directory, and run ./process.py (with python3). The problem occurs at line 194 of backtesting.py when the parameter n_days is 5 at line 34 of process.py. That is the code I showed above.
mycode.zip

@jreback
Contributor

jreback commented Feb 17, 2017

@fduxiao that's not acceptable. Please show a copy-pastable example. If you think this is a bug, then it will need to be narrowed down to a test case.

@fduxiao
Author

fduxiao commented Feb 18, 2017

When I tried to simplify the code, I found the cause: I had added a float to an element of a Series with dtype int64 and the result had been an int64, which caused this situation. Sorry for the mistake. Is there a good way to avoid it? It's easy to forget the . after a number.
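One way to sidestep the problem, sketched here under assumptions: give the Series a float dtype before doing any in-place arithmetic on it, either at construction time or with astype. The ticker labels and the values of money, price, and r below are hypothetical placeholders, not taken from the attached code.

```python
import pandas as pd

# Hypothetical ticker labels, for illustration only.
tickers = ["AAA", "BBB"]

# Initialising with the integer 0 gives an int64 Series, which is where
# the surprise in this issue comes from.
int_stocks = pd.Series(0, index=tickers)

# Option 1: initialise with a float literal so the dtype is float64 from the start.
float_stocks = pd.Series(0.0, index=tickers)

# Option 2: convert an existing integer Series explicitly.
converted = int_stocks.astype("float64")

# Hypothetical values standing in for money, price and r from the report.
money, price, r = 1000.0, 30.0, 0.001

float_stocks["AAA"] += money / price * (1 - r)
converted["AAA"] += money / price * (1 - r)

print(float_stocks)
print(converted)
```

Because the dtype is already float64, the fractional part is kept regardless of which indexer performs the update.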

@jorisvandenbossche
Member

jorisvandenbossche commented Feb 18, 2017

@fduxiao Based on what you describe of "added a float to an element of a Series with dtype int64 and the result had been an int64", I tried the following as a simple example:

In [28]: s = pd.Series([1,2,3])

In [29]: s[s==2] += 0.5

In [30]: s
Out[30]: 
0    1.0
1    2.5
2    3.0
dtype: float64

In [31]: s = pd.Series([1,2,3])

In [32]: s[1] += 0.5

In [33]: s
Out[33]: 
0    1
1    2
2    3
dtype: int64

In [36]: s = pd.Series([1,2,3])

In [37]: s.loc[1] += 0.5

In [38]: s
Out[38]: 
0    1.0
1    2.5
2    3.0
dtype: float64

In [39]: s = pd.Series([1,2,3])

In [40]: s.iloc[1] += 0.5

In [41]: s
Out[41]: 
0    1.0
1    2.5
2    3.0
dtype: float64

So indeed, scalar indexing with __getitem__ ([]) seems to be the exception compared to boolean getitem indexing and loc/iloc. That certainly looks like a bug to me, or at least a very confusing inconsistency.
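The four cases above can be re-run as a script to see how a given pandas installation behaves. This is only a diagnostic sketch: the helper names (via_getitem, via_mask, via_loc, via_iloc, probe_inplace_add) are made up for this example, and no expected output is given because the outcome depends on the pandas version. Some versions may warn or raise instead of upcasting, so warnings are silenced and exceptions are recorded rather than propagated.

```python
import warnings

import pandas as pd


def via_getitem(s):
    s[1] += 0.5          # scalar indexing with []


def via_mask(s):
    s[s == 2] += 0.5     # boolean-mask indexing with []


def via_loc(s):
    s.loc[1] += 0.5      # label-based indexing


def via_iloc(s):
    s.iloc[1] += 0.5     # position-based indexing


def probe_inplace_add():
    """Run each update on a fresh int64 Series and record what happened."""
    report = {}
    for update in (via_getitem, via_mask, via_loc, via_iloc):
        s = pd.Series([1, 2, 3])
        try:
            with warnings.catch_warnings():
                # Hide version-specific warnings so only the outcome is reported.
                warnings.simplefilter("ignore")
                update(s)
            report[update.__name__] = (str(s.dtype), s.tolist())
        except Exception as exc:
            # Record the exception type instead of stopping the comparison.
            report[update.__name__] = type(exc).__name__
    return report


if __name__ == "__main__":
    print("pandas", pd.__version__)
    for name, outcome in probe_inplace_add().items():
        print(f"{name}: {outcome}")
```

If all four entries report the same dtype and values, the installed version treats the indexers consistently.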

jorisvandenbossche added the Bug and Indexing (Related to indexing on series/frames, not to indexes themselves) labels and removed the Needs Info (Clarification about behavior needed to assess issue) label on Feb 18, 2017
@jorisvandenbossche
Member

This probably has some roots in numpy behaviour:

In [43]: a = np.array([1,2,3])

In [44]: a[1] += 0.5

In [45]: a
Out[45]: array([1, 2, 3])

In [46]: a = np.array([1,2,3])

In [47]: a[a==2] += 0.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-47-b1927385ce52> in <module>()
----> 1 a[a==2] += 0.5

TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

In [48]: a += 0.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-48-633f1f7b6cb9> in <module>()
----> 1 a += 0.5

TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

NumPy takes the route of "an in-place operation should not alter the dtype" (which is certainly also sensible): it seems to cast the right-hand side for scalar indexing, but raises an error in the other cases.
pandas seems to take the other route, changing the dtype if necessary in general (as in most of the examples above), but not consistently.
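A short sketch of the NumPy side, with the explicit conversion that avoids both the silent truncation and the error. The variable names are chosen for this example only.

```python
import numpy as np

a = np.array([1, 2, 3])

# Scalar path: 2 + 0.5 is computed as a float, then cast back to the
# array's integer dtype on assignment, so the fraction is dropped.
a[1] += 0.5
scalar_result = a.tolist()

# Whole-array path: the in-place ufunc refuses to write float64 results
# into an integer array under the default 'same_kind' casting rule.
try:
    a += 0.5
    raised = False
except TypeError:
    raised = True

# Converting to float first makes the intent explicit and keeps the fraction.
b = a.astype(np.float64)
b[1] += 0.5

print(scalar_result, raised, b.tolist())
```

Converting up front with astype states the dtype change explicitly instead of relying on how each indexing path casts.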

@OmerJog

OmerJog commented Nov 29, 2018

are there any plans to fix this?
