Commit 768bf49

shawnheide authored and jreback committed
BUG: fixes 13822, incorrect KeyError string with non-unique columns w…
closes #13822

Author: Shawn Heide <[email protected]>

Closes #13845 from shawnheide/BUG_13822 and squashes the following commits:

ae56be0 [Shawn Heide] BUG: fixes 13822, incorrect KeyError string with non-unique columns when missing column is accessed
1 parent 2f8fea7 commit 768bf49

3 files changed: +13 -1 lines changed

doc/source/whatsnew/v0.19.0.txt

+1
@@ -863,3 +863,4 @@ Bug Fixes

 - Bug in ``.to_excel()`` when DataFrame contains a MultiIndex which contains a label with a NaN value (:issue:`13511`)
 - Bug in ``pd.read_csv`` in Python 2.x with non-UTF8 encoded, multi-character separated data (:issue:`3404`)
+- Bug in ``Index`` raises ``KeyError`` displaying incorrect column when column is not in the df and columns contains duplicate values (:issue:`13822`)

pandas/core/indexing.py

+3 -1
@@ -1217,7 +1217,9 @@ def _convert_to_indexer(self, obj, axis=0, is_setter=False):
                 else:
                     (indexer,
                      missing) = labels.get_indexer_non_unique(objarr)
-                    check = indexer
+                    # 'indexer' has dupes, create 'check' using 'missing'
+                    check = np.zeros_like(objarr)
+                    check[missing] = -1

                 mask = check == -1
                 if mask.any():
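
The hunk above can be illustrated with a small sketch of why `check = indexer` pointed at the wrong label. It assumes a recent pandas and NumPy, reuses the variable names from the diff (`labels`, `objarr`, `indexer`, `missing`, `check`), and substitutes an explicit integer array for `np.zeros_like(objarr)` to keep the dtype obvious; `old_reported` and `new_reported` are illustrative names, not part of pandas.

```python
import numpy as np
import pandas as pd

labels = pd.Index(["x", "x", "z"])                 # non-unique column labels
objarr = np.array(["x", "y", "z"], dtype=object)   # requested keys; "y" is missing

# For a non-unique index, `indexer` holds one entry per match, so the
# duplicated "x" contributes two positions and `indexer` is longer than
# `objarr`. `missing` holds positions in `objarr` that were not found.
indexer, missing = labels.get_indexer_non_unique(objarr)

# Old logic (sketched): positions of -1 in `indexer` were treated as
# positions in `objarr`, which are shifted by the extra duplicate entry.
old_positions = np.flatnonzero(indexer == -1)
old_reported = objarr[old_positions]

# Patched logic: build `check` with the same length as `objarr` and mark
# only the positions listed in `missing`.
check = np.zeros(len(objarr), dtype=np.intp)
check[missing] = -1
new_reported = objarr[check == -1]

print(indexer, missing)            # expanded indexer and missing positions
print(old_reported, new_reported)  # misreported label vs. correct label
```

Because `missing` is already expressed in terms of positions in `objarr`, the mask built from it lines up with the requested keys regardless of how many duplicates the index contains.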

pandas/tests/indexing/test_indexing.py

+9
@@ -1332,6 +1332,15 @@ def f():
         self.assertEqual(result, 3)
         self.assertRaises(ValueError, lambda: df.at['a', 0])

+        # GH 13822, incorrect error string with non-unique columns when missing
+        # column is accessed
+        df = DataFrame({'x': [1.], 'y': [2.], 'z': [3.]})
+        df.columns = ['x', 'x', 'z']
+
+        # Check that we get the correct value in the KeyError
+        self.assertRaisesRegexp(KeyError, "\['y'\] not in index",
+                                lambda: df[['x', 'y', 'z']])
+
     def test_loc_getitem_label_slice(self):

         # label slices (with ints)
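
The regression test added above can also be tried outside the test suite. The following is a rough standalone sketch, assuming a recent pandas; it uses a plain `try`/`except` instead of the `assertRaisesRegexp` helper from the commit, and `message` is an illustrative variable name.

```python
import pandas as pd

# Same setup as the test: three columns, then relabel so "x" is duplicated
# and "y" no longer exists.
df = pd.DataFrame({"x": [1.0], "y": [2.0], "z": [3.0]})
df.columns = ["x", "x", "z"]

try:
    df[["x", "y", "z"]]
except KeyError as exc:
    # The first argument of the KeyError carries the error string.
    message = str(exc.args[0])
else:
    message = None

print(message)
```

With the fix in place, the message should name the missing label `'y'` rather than the existing label `'z'`.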
