Duplicate index with FloatIndex giving 'ValueError: Length mismatch' #7143

jorisvandenbossche · 2014-05-16T13:13:37Z

Not certain if this is a bug or defined behaviour (but then the error message is not clear in any case).

In 0.13.1:

In [28]: df = pd.DataFrame(np.random.randn(9).reshape(3,3), index=[0.1,0.2,0.2],
 columns=['a','b','c'])
In [29]: df
Out[29]:
            a         b         c
0.1  1.711117  1.218853 -1.322363
0.2  0.956266  0.230374 -1.005935
0.2 -0.137729 -0.993931 -0.902793

In [30]: df.ix[0.2,'a']
Out[30]: array([ 0.95626607, -0.13772877])

In [31]: df.ix[0.2]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
ValueError: Length mismatch: Expected axis has 0 elements, new values have 2 elements

In master, both df.loc[0.2] and df.loc[0.2, 'a'] give this error message, while with an integer index this works.
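For comparison, here is a minimal sketch of the same lookups written against a recent pandas release. That is an assumption: it is not the 0.13.1 or master build discussed in this report, and the data uses fixed values instead of np.random.randn so the result is reproducible. On such a release both lookups are expected to succeed and return every row labelled 0.2.

```python
import numpy as np
import pandas as pd

# Same layout as the report: a float index with a duplicated label (0.2).
# Values are fixed (0.0 .. 8.0) rather than random; that is an illustrative choice.
df = pd.DataFrame(
    np.arange(9.0).reshape(3, 3),
    index=[0.1, 0.2, 0.2],
    columns=["a", "b", "c"],
)

rows = df.loc[0.2]       # all rows labelled 0.2, all columns
col = df.loc[0.2, "a"]   # column 'a' for the rows labelled 0.2

print(type(rows).__name__, rows.shape)
print(type(col).__name__, col.tolist())
```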


jreback · 2014-05-16T13:27:39Z

hmm...would say that's a bug (also your first example should return a Series).

jorisvandenbossche · 2014-05-16T13:30:11Z

about the Series, that is indeed how it is with integer indices
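A small sketch of the integer-index behaviour referred to here, assuming a recent pandas release (not the versions under discussion) and a made-up frame named df_int: selecting a duplicated integer label together with a single column comes back as a Series with one entry per matching row.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a duplicated integer label (2).
df_int = pd.DataFrame(
    np.arange(9.0).reshape(3, 3),
    index=[1, 2, 2],
    columns=["a", "b", "c"],
)

picked = df_int.loc[2, "a"]  # one value per row labelled 2

print(type(picked).__name__)
print(picked.index.tolist(), picked.tolist())
```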

cpcloud · 2014-05-16T15:08:30Z

hm duplicate fun i'll take this ... @jreback is this a blocker?

cpcloud · 2014-05-16T15:08:42Z

oh nvm i c u marked as 0.14.1

jreback · 2014-05-16T15:16:26Z

sure

cpcloud · 2014-05-17T03:52:47Z

this has been an enlightening experience. @jreback you are a hero for squashing all of these duplicate indexing bugs.

couple of things:

- i've fixed the ValueError bug, just needed an annoying copypaste of the IndexEngine._maybe_get_bool_indexer with a type change and a subtle slicing KeyError because of duplicates. This should be fused-typed at some point to eliminate the copypasting between the different engine types on this method.
- doesn't look like ix really ever returned a Series for duplicate indices, and that's because it calls self.obj.get_value(*key) which is designed for single element access and directly pulls the ndarray from the underlying index engine.

BUT

if you remove that line then _getitem_tuple does upcasting and the aptly named TestDataFrame.test_single_element_ix_dont_upcast breaks 😞

i see two ways to deal with this:

- break the test (bad)
- don't upcast (deeper into the rabbit hole)
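To make the upcasting concern concrete, here is a rough illustration. It assumes a recent pandas release and uses .loc instead of ix; the frame and the names mixed, via_row and direct are made up for the example, and this is not the code path inside _getitem_tuple. Pulling a whole row out of a frame with mixed integer and float columns produces a float Series, so an integer read through that row is upcast, while a direct single-element lookup keeps the integer dtype.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
mixed = pd.DataFrame({"i": [1, 2], "f": [1.5, 2.5]}, index=["x", "y"])

# Going through the row first: the row Series has a single dtype (float64),
# so the integer value is upcast to a float.
via_row = mixed.loc["x"]["i"]

# Direct single-element access: the value keeps the column's integer dtype.
direct = mixed.loc["x", "i"]

print(mixed.loc["x"].dtype, type(via_row).__name__)
print(type(direct).__name__)
```

This is the kind of difference a test such as test_single_element_ix_dont_upcast would be guarding against: single-element access should not silently change the value's type.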

I'll submit a PR for the ValueError and open a separate issue for the ix insanity.

jreback added Bug labels May 16, 2014

jreback added this to the 0.14.1 milestone May 16, 2014

cpcloud self-assigned this May 16, 2014

This was referenced May 17, 2014

BUG: allow dup indexing with Float64Index #7149

Merged

ix should return a Series on duplicate selection #7150

Closed

cpcloud closed this as completed in #7149 Jun 1, 2014

wesm unassigned cpcloud Oct 12, 2016
