BUG: duplicate indexing on non-integer index with positional indexers failing in py3 #13427

Closed
alexanderwhatley opened this issue Jun 11, 2016 · 6 comments · Fixed by #41482
Labels: good first issue, Needs Tests

@alexanderwhatley

Code Sample, a copy-pastable example if possible

df = pd.DataFrame({"a" : [0,1,2], "b" : [1,2,3]})
df[["a", "a"]].apply(lambda x: x[0] + x[1], axis = 1)

Expected Output

0 0
1 2
2 4
dtype: int64

output of pd.show_versions()

Traceback (most recent call last):
File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 1980, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas\index.pyx", line 103, in pandas.index.IndexEngine.get_value (pandas\index.c:3332)
File "pandas\index.pyx", line 111, in pandas.index.IndexEngine.get_value (pandas\index.c:3035)
File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3955)
File "pandas\index.pyx", line 169, in pandas.index.IndexEngine._get_loc_duplicates (pandas\index.c:4236)
TypeError: unorderable types: str() > int()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4061, in apply
return self._apply_standard(f, axis, reduce=reduce)
File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4157, in _apply_standard
results[i] = func(v)
File "", line 1, in
File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\series.py", line 583, in __getitem__
result = self.index.get_value(self, key)
File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 2000, in get_value
raise IndexError(key)
IndexError: (0, 'occurred at index 0')

This is the error I get when running on Python 3.5.1/Pandas 0.18.1 on Windows. I get the expected output on Linux running Python 2.7.11/Pandas 0.18.1.

@alexanderwhatley
Author

Just as an update - it happens to work on Python 2.7.11/Pandas 0.18.1 on Windows as well.

@jreback
Contributor

jreback commented Jun 12, 2016

The root example is here:

This works in py2

In [6]: s = Series([0,0],['a','a'])

In [7]: s[0]
Out[7]: 0

In [8]: s[1]
Out[8]: 0

We explicitly catch the unorderable-types exception and raise an error to break out of the indexing code, but I'm not sure what happens after that. This should work.
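For comparison, here is a minimal sketch of the same root case using explicit indexers instead of plain `s[0]`. It assumes only the `Series` from the example above. The variable names are illustrative, and this is not the code path that fails in the report.

```python
import pandas as pd

# Series with a duplicated, non-integer index, as in the root example.
s = pd.Series([0, 0], index=["a", "a"])

# .iloc is purely positional, so it never consults the string labels.
first = s.iloc[0]
second = s.iloc[1]

# Label-based lookup on a duplicated label returns every matching row.
both = s.loc["a"]

print(first, second, len(both))
```

Because `.iloc` and `.loc` each state their intent, neither depends on how plain `[]` resolves an integer key against a string index.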

@jreback
Contributor

jreback commented Jun 12, 2016

The code you are writing is fragile anyhow: __getitem__ depends on the index, which doesn't work here because the labels are strings. Using .iloc makes the positional intent explicit. Of course, you already know that you shouldn't be using .apply in this case (or, in reality, in MOST cases); there is really no need.

In [1]: df = pd.DataFrame({"a" : [0,1,2], "b" : [1,2,3]})

In [2]: df[['a','a']].apply(lambda x: x.iloc[0] + x.iloc[1], axis=1)
Out[2]: 
0    0
1    2
2    4
dtype: int64
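As a rough illustration of the "no need for .apply" point, the same sum can be computed column-wise. This is only a sketch under the assumption that the goal is to add the two selected columns row by row. The names `dup` and `result` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3]})

# Select the duplicated column twice, as in the report.
dup = df[["a", "a"]]

# Positional column selection replaces the row-wise apply.
result = dup.iloc[:, 0] + dup.iloc[:, 1]

print(result.tolist())
```

This avoids calling a Python function once per row and never performs a label lookup on the duplicated column name.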

@jreback added the Bug, Indexing, Reshaping, and 2/3 Compat labels on Jun 12, 2016
@jreback added this to the Next Major Release milestone on Jun 12, 2016
@jreback changed the title from "Error message thrown in Python 3.5.1/Pandas 0.18.1 on Windows, but not on Python 2.7.11/Pandas 0.18.1 on Linux" to "BUG: duplicate indexing on non-integer index with positional indexers failing in py3" on Jun 12, 2016
@topper-123
Contributor

Removing the 2/3 Compat label, as we've dropped Python2 support and the example still fails on master.

@simonjayhawkins
Member

this is now fixed on master

>>> import pandas as pd
>>>
>>> pd.__version__
'1.1.0.dev0+1045.g30724b9c6'
>>>
>>> df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3]})
>>>
>>> df[["a", "a"]].apply(lambda x: x[0] + x[1], axis=1)
0    0
1    2
2    4
dtype: int64

@simonjayhawkins added the good first issue and Needs Tests labels and removed the Bug, Indexing, and Reshaping labels on Mar 31, 2020
@simonjayhawkins
Member

this is now fixed on master

fixed in #29700

7eb0db3 is the first new commit
commit 7eb0db3
Author: jbrockmendel
Date: Mon Nov 25 15:45:27 2019 -0800

BUG: Index.get_loc raising incorrect error, closes #29189 (#29700)
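Since the issue carries the Needs Tests label, a regression test could look roughly like the sketch below. The test name is hypothetical and this is not the test added by the fixing PR. It uses `.iloc` rather than the `x[0]` form from the report, on the assumption that plain integer keys on a non-integer index behave differently across pandas versions.

```python
import pandas as pd
import pandas.testing as tm


def test_apply_positional_on_duplicate_columns():
    # Hypothetical test name; a sketch, not the upstream test.
    df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3]})

    # Row-wise apply over a frame whose column labels are duplicated strings.
    result = df[["a", "a"]].apply(lambda x: x.iloc[0] + x.iloc[1], axis=1)

    # Expected output from the original report.
    expected = pd.Series([0, 2, 4])
    tm.assert_series_equal(result, expected)


test_apply_positional_on_duplicate_columns()
```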

@mroeschke changed the milestone from Contributions Welcome to 1.3 on May 15, 2021