In the example below, Out[13] differs from Out[11], even though you would expect In [12], which only reads a column, to have no effect on the data.
Python 2.7.8 (default, Oct 19 2014, 16:02:00)
Type "copyright", "credits" or "license" for more information.
IPython 2.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: d = {'COLA': {0: 0, 1: 0, 2: 0},
...: 'COLB': {0: 'a', 1: 'a', 2: 'a'},
...: 'COLC': {0: 0, 1: 0, 2: 1}}
In [4]: df = pd.DataFrame(d)
In [5]: df
Out[5]:
COLA COLB COLC
0 0 a 0
1 0 a 0
2 0 a 1
In [6]: df = df.groupby(['COLA','COLB'])['COLC']\
...: .agg({'Zeros': lambda x: 0,
...: 'Averages': lambda x: 100.*x.mean(),
...: 'Weird_stuff': np.size})\
...: .unstack()
In [7]: df
Out[7]:
Averages Weird_stuff Zeros
COLB a a a
COLA
0 33.333333 3 0
In [8]: df.columns
Out[8]:
MultiIndex(levels=[[u'Averages', u'Weird_stuff', u'Zeros'], [u'a']],
labels=[[0, 1, 2], [0, 0, 0]],
names=[None, u'COLB'])
In [9]: df.index
Out[9]: Int64Index([0], dtype='int64')
In [10]: df['Weird_stuff'] = df['Weird_stuff'].apply(lambda x: 1000000., axis=1)
/usr/local/lib/python2.7/site-packages/numpy/lib/function_base.py:3612: FutureWarning: in the future negative indices will not be ignored by `numpy.delete`.
"`numpy.delete`.", FutureWarning)
In [11]: df
Out[11]:
Averages Weird_stuff Zeros
COLB a a a
COLA
0 33.333333 1000000 0
In [12]: df['Zeros']
Out[12]:
COLB a
COLA
0 0
In [13]: df # Compare the ('Weird_stuff', 'a') value below with the one in Out[11] above.
Out[13]:
Averages Weird_stuff Zeros
COLB a a a
COLA
0 33.333333 3 0
In [14]: pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Darwin
OS-release: 14.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: pl_PL.UTF-8
pandas: 0.15.1
nose: 1.3.0
Cython: 0.20.2
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
dateutil: 2.2
pytz: 2014.7
bottleneck: 0.8.0
tables: 3.1.0
numexpr: 2.3.1
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.3.2
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9
apiclient: 1.2
rpy2: None
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None
kubaczer changed the title from "BUG: Accessing DataFrame column *seems* to modify its content" to "BUG: Accessing DataFrame multi-index column *seems* to modify its content" on Nov 18, 2014
I think this is a manifestation of the same issue as in #8844
Essentially, when you unstack, the levels of the multi-index (the columns here) are not changed; only the labels are. (The levels which don't appear are not included in the labels, so it looks right.)
So far so good. However, a further operation can cause this to break. The solution is essentially to make a deep copy of the index during the unstack operation.
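To illustrate the levels-versus-labels distinction, here is a minimal sketch on a made-up frame (not the one from the report). It assumes a recent pandas, where the per-level integer positions are exposed as `codes` (pandas 0.15 called them `labels`) and `MultiIndex.remove_unused_levels()` is available. It does not reproduce the bug itself; it only shows that subsetting a MultiIndex keeps the full set of levels while the labels shrink.

```python
import pandas as pd

# Hypothetical two-level column index, for illustration only.
columns = pd.MultiIndex.from_product(
    [["x", "y"], ["a", "b"]], names=["outer", "COLB"]
)
df = pd.DataFrame([[1, 2, 3, 4]], columns=columns)

# Keep only the columns whose COLB value is 'a'.
mask = df.columns.get_level_values("COLB") == "a"
sub = df.loc[:, mask]

# The level still lists 'b', although no column refers to it any more.
unused_kept = list(sub.columns.levels[1])

# The integer labels (codes) only point at position 0, i.e. 'a'.
codes_for_colb = sub.columns.codes[1].tolist()

# Rebuilding the index drops the level values that nothing points to.
clean = sub.columns.remove_unused_levels()
levels_after_cleanup = list(clean.levels[1])

print(unused_kept, codes_for_colb, levels_after_cleanup)
```

Whether rebuilding or deep-copying the index in user code avoids the behaviour reported above is not something this sketch demonstrates.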