.ix strange bug for float index #780

xdong · 2012-02-14T07:23:50Z

In [1]: import pandas; from numpy.random import randn

In [2]: index = [52195.504153, 52196.303147, 52198.369883]

In [3]: a = pandas.DataFrame(randn(3, 2), index)

In [4]: a
Out[4]:
                     0         1
52195.504153  1.367681  0.243237
52196.303147 -0.745796 -1.054106
52198.369883 -1.462461 -0.683286

In [5]: a.ix[52195.:52196.]
Out[5]:
Empty DataFrame
Columns: array([0, 1])
Index: array([], dtype=object)

In [6]: a.ix[52195.1:52196.5]
Out[6]:
Empty DataFrame
Columns: array([0, 1])
Index: array([], dtype=object)

In [7]: a.ix[52195.1:52196.6]
Out[7]:
                     0         1
52195.504153  1.367681  0.243237
52196.303147 -0.745796 -1.054106
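
For comparison, here is a minimal sketch of the label-based behaviour the report expects, written against a current pandas release where `.ix` no longer exists and `.loc` is the label-based indexer. The use of `.loc` and of `numpy.random.randn` is an assumption of this sketch, not part of the original session. On a sorted float index, slice bounds that are not themselves labels should select every label that falls between them.

```python
import numpy as np
import pandas as pd

index = [52195.504153, 52196.303147, 52198.369883]
a = pd.DataFrame(np.random.randn(3, 2), index=index)

# .loc slices by label and includes both endpoints; on a sorted float
# index the bounds do not have to be existing labels.
first = a.loc[52195.0:52196.0]
second = a.loc[52195.1:52196.5]
third = a.loc[52195.1:52196.6]

print(list(first.index))
print(list(second.index))
print(list(third.index))
```

Under these assumptions none of the three slices should be empty, which is the outcome the report was looking for.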

xdong · 2012-02-15T18:19:38Z

Thanks for the quick fix. I was going to comment on another issue with float-based slicing, but I saw that you had it fixed in commit bc1932f.

Now float-based slicing works as expected when the floats are whole numbers. For example, df.ix[2.0:5.0] is considered label-based as promised in the documentation. However, if I mix integer and float then:

df.ix[2:5.0] is interpreted as integer-based;
df.ix[2:5.1] is interpreted as label-based.

I am worried that this may introduce subtle bugs (admittedly, it's bad practice to mix integers and floats).
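
One way to sidestep the ambiguity, sketched here for a current pandas release rather than the 2012 `.ix` code path, is to state the intent explicitly: `.loc` for labels and `.iloc` for positions. The frame `df` and its index values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with whole-number float labels 1.0 .. 6.0.
df = pd.DataFrame(np.arange(12).reshape(6, 2),
                  index=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Label-based: both bounds are treated as labels, whether they are
# written as int or float, and the end label is included.
by_label_mixed = df.loc[2:5.0]
by_label_float = df.loc[2:5.1]

# Position-based: rows 2, 3 and 4 counted from zero, end excluded.
by_position = df.iloc[2:5]

print(list(by_label_mixed.index))
print(list(by_label_float.index))
print(list(by_position.index))
```

With the indexer chosen explicitly, the type of the slice bounds no longer decides which kind of lookup takes place.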

adamklein · 2012-02-15T18:57:58Z

There is definitely still some weirdness in slicing. It's been a game of whack-a-mole.

Part of the complexity is that slicing is context-dependent on what's in the index.

I believe the slicing you point out will be consistent as long as the index type doesn't change ... what I mean is:

In [76]: x = Index([1.5, 2, 3, 4, 5])

In [77]: df = DataFrame(rand(5,5), index=x)

In [78]: df.ix[1.5:4]
Out[78]: 
     0        1        2         3       4     
1.5  0.06102  0.25070  0.009453  0.6829  0.6631
2    0.81916  0.95604  0.397659  0.7903  0.3951
3    0.30179  0.64651  0.701975  0.3746  0.6955
4    0.13221  0.04839  0.788082  0.3093  0.1095

In [79]: df.ix[4:5]
Out[79]: 
   0       1       2        3       4      
5  0.5301  0.8182  0.05318  0.8247  0.01699

In [80]: df.ix[1.5:4].index
Out[80]: Index([1.5, 2, 3, 4], dtype=object)

In [81]: df.ix[4:5].index
Out[81]: Int64Index([5])

It is a bit surprising that, depending on where in the index you slice, you get integer-based or label-based behavior.

I think that maybe the index shouldn't change types when it is subsetted (i.e., if it's not an Int64Index, it should never become one when sliced).

Furthermore, from the docs: "Therefore, advanced indexing with .ix will always attempt label-based indexing, before falling back on integer-based indexing."

This doesn't seem to be true per the last output, so it may need fixing here.
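
The property suggested above, that a subset of an index keeps the parent's type, can be checked directly. This is a small sketch assuming a current pandas release. The variable names are illustrative, and the `dtype=object` variant is only there to mimic the object-typed index in the session above.

```python
import pandas as pd

float_idx = pd.Index([1.5, 2, 3, 4, 5])                 # inferred as float64
object_idx = pd.Index([1.5, 2, 3, 4, 5], dtype=object)  # mixed Python objects

# Positional subsets that happen to contain only whole numbers.
float_tail = float_idx[3:]
object_tail = object_idx[3:]

# The subset keeps the dtype of the index it was taken from.
print(float_tail.dtype == float_idx.dtype)
print(object_tail.dtype == object_idx.dtype)
```

If a subset ever came back with an integer dtype here, slicing it again could switch between label-based and integer-based semantics, which is the inconsistency described in this comment.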

adamklein closed this as completed in 35fcc17 Feb 14, 2012

adamklein reopened this Feb 15, 2012

wesm added a commit that referenced this issue Feb 18, 2012

BUG: fix mixed float/int indexing slicing problem described in #780

0b52b1a

wesm closed this as completed Feb 18, 2012

wesm mentioned this issue Feb 18, 2012

Question: disable integer indexing altogether for object indexing CONTAINING integers? #798

Closed

dan-nadler pushed a commit to dan-nadler/pandas that referenced this issue Sep 23, 2019

Merge pull request pandas-dev#780 from shashank88/zero_column_fix

4393954

Fix pandas-dev#777: Handle zero columns when converting to unicode
