Commit f9694d2

Saravia Rajal authored and committed
BUG: cant modify df with duplicate index (#17105)
1 parent b02c69a commit f9694d2

3 files changed: +19 -1 lines changed

doc/source/whatsnew/v0.23.0.txt

+1
@@ -1243,6 +1243,7 @@ Indexing
 - Bug in ``Series.is_unique`` where extraneous output in stderr is shown if Series contains objects with ``__ne__`` defined (:issue:`20661`)
 - Bug in ``.loc`` assignment with a single-element list-like incorrectly assigns as a list (:issue:`19474`)
 - Bug in partial string indexing on a ``Series/DataFrame`` with a monotonic decreasing ``DatetimeIndex`` (:issue:`19362`)
+- Fixed to allow modifying ``DataFrame`` with duplicate Index (:issue:`17105`)

 MultiIndex
 ^^^^^^^^^^

pandas/core/indexing.py

+1 -1
@@ -1319,7 +1319,7 @@ def _convert_to_indexer(self, obj, axis=None, is_setter=False):
 (indexer,
  missing) = labels.get_indexer_non_unique(objarr)
 # 'indexer' has dupes, create 'check' using 'missing'
-check = np.zeros_like(objarr)
+check = np.zeros(len(objarr))
 check[missing] = -1

 mask = check == -1
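
The one-line change above swaps a buffer that inherits the dtype of `objarr` for a plain float buffer. A minimal sketch of that difference follows, using NumPy only. The arrays `objarr` and `missing` here are hypothetical stand-ins, not values taken from pandas internals, and the sketch does not reproduce the original failure; it only shows which dtype each call produces.

```python
import numpy as np

# Hypothetical stand-in for `objarr`: datetime labels, one of them duplicated.
objarr = np.array(["2017-01-05", "2017-01-05", "2017-01-06"],
                  dtype="datetime64[ns]")
# Hypothetical stand-in for `missing`: positions in `objarr` with no match.
missing = np.array([2])

# Old call: zeros_like copies the dtype of its argument,
# so the scratch buffer holds datetimes rather than numbers.
old_check = np.zeros_like(objarr)
print(old_check.dtype)  # datetime64[ns]

# New call: a float64 buffer whose dtype does not depend on the labels.
check = np.zeros(len(objarr))
check[missing] = -1     # flag the positions that were not found
mask = check == -1
print(mask)             # [False False  True]
```

Because the float buffer is independent of the label dtype, the `-1` sentinel and the `check == -1` comparison behave the same way for datetime, string, or numeric labels.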

pandas/tests/indexing/test_indexing.py

+17
@@ -1018,3 +1018,20 @@ def test_validate_indices_high():
 def test_validate_indices_empty():
     with tm.assert_raises_regex(IndexError, "indices are out"):
         validate_indices(np.array([0, 1]), 0)
+
+
+def test_modify_with_duplicate_index():
+    trange = pd.date_range(start=pd.Timestamp(year=2017, month=1, day=1),
+                           end=pd.Timestamp(year=2017, month=1, day=5))
+
+    # insert a duplicate element to the index
+    trange = trange.insert(loc=5, item=pd.Timestamp(year=2017, month=1, day=5))
+
+    df = pd.DataFrame(0, index=trange, columns=["A", "B"])
+    bool_idx = np.array([False, False, False, False, False, True])
+
+    # modify the value for the duplicate index entry
+    df.loc[trange[bool_idx], "A"] = 7
+
+    assert df['A'][4] == 7
+    assert df['A'][5] == 7
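
For readers who want to try the fixed behaviour outside the test suite, here is a smaller standalone sketch of the same scenario. It assumes a pandas version that includes this fix. The three-row index and the name `idx` are illustrative choices, not part of the commit, and the values are read back by position with `.tolist()` instead of integer `[]` lookups on the Series.

```python
import pandas as pd

# Illustrative index in which the label 2017-01-05 appears twice.
idx = pd.DatetimeIndex(["2017-01-04", "2017-01-05", "2017-01-05"])
df = pd.DataFrame(0, index=idx, columns=["A", "B"])

# Select by a list-like holding the duplicated label and assign.
# Every row that carries that label should be updated.
df.loc[idx[[2]], "A"] = 7

print(df["A"].tolist())  # [0, 7, 7]
print(df["B"].tolist())  # [0, 0, 0]
```

The assignment goes through label-based selection, so both rows labelled 2017-01-05 receive the value, matching what the test asserts for positions 4 and 5.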
