BUG: cant modify df with duplicate index (#17105) #20939

fersarr · 2018-05-03T08:28:15Z

closes indexing.py: "'bool' object has no attribtute 'any'" with duplicate time index #17105
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

This fix allows modifying DataFrames that have duplicate elements in the index. Previously, this failed with

AttributeError: 'bool' object has no attribute 'any'

See #17105 for a code snippet.

Replacing zeros_like(objarray) with zeros() because zeros_like() returns an array of zeros with the same dtype as objarray, which is unnecessary here. We only need the zeros, not the dtype, so that the later comparison against -1 yields an array rather than a single boolean:

With a datetime input, zeros_like() keeps the datetime64 dtype, and the comparison returns a single boolean:

>>> myarr_fromindex = np.zeros_like(pd.DatetimeIndex([2,3]))
>>> myarr_fromindex
array(['1970-01-01T00:00:00.000000000', '1970-01-01T00:00:00.000000000'],
      dtype='datetime64[ns]')
>>> type(myarr_fromindex)
<type 'numpy.ndarray'>
>>> myarr_fromindex == -1
False

With a numeric input, zeros_like() returns integers, and the comparison returns an array:

>>> myarr_fromarr = np.zeros_like([2,3])
>>> myarr_fromarr
array([0, 0])
>>> type(myarr_fromarr)
<type 'numpy.ndarray'>
>>> myarr_fromarr == -1
array([False, False])
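The difference described above can be sketched without pandas. This is a minimal illustration, not the code in indexing.py: the name `objarr` and the explicit integer dtype passed to `np.zeros` are assumptions made for the example. The result of comparing the datetime zeros against -1 may differ between NumPy versions, so the sketch only relies on the integer case.

```python
import numpy as np

# Hypothetical stand-in for the indexer's input array (name is illustrative).
objarr = np.array(["2017-01-01", "2017-01-02"], dtype="datetime64[ns]")

# zeros_like copies the dtype, so the "zeros" are datetime64 epoch values.
like = np.zeros_like(objarr)

# zeros with an integer dtype (an assumption here) yields plain integers
# of the same length, independent of the input's dtype.
plain = np.zeros(len(objarr), dtype=int)

# Comparing integers against -1 is elementwise and returns a boolean array,
# so calling .any() on the result is safe.
missing = plain == -1
has_missing = missing.any()
```

Because `plain` never inherits a datetime dtype, the `== -1` check always produces an array, which is the property the description relies on.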

codecov · 2018-05-03T09:13:38Z

Codecov Report

❗ No coverage uploaded for pull request base (master@620784f).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master   #20939   +/-   ##
=========================================
  Coverage          ?   91.81%           
=========================================
  Files             ?      153           
  Lines             ?    49479           
  Branches          ?        0           
=========================================
  Hits              ?    45428           
  Misses            ?     4051           
  Partials          ?        0

Flag	Coverage Δ
#multiple	`90.2% <100%> (?)`
#single	`41.85% <0%> (?)`

Impacted Files	Coverage Δ
pandas/core/indexing.py	`93.55% <100%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 620784f...cf5ec7d.

jreback · 2018-05-03T10:23:47Z

doc/source/whatsnew/v0.23.0.txt

@@ -1243,6 +1243,7 @@ Indexing
 - Bug in ``Series.is_unique`` where extraneous output in stderr is shown if Series contains objects with ``__ne__`` defined (:issue:`20661`)
 - Bug in ``.loc`` assignment with a single-element list-like incorrectly assigns as a list (:issue:`19474`)
 - Bug in partial string indexing on a ``Series/DataFrame`` with a monotonic decreasing ``DatetimeIndex`` (:issue:`19362`)
+- Fixed to allow modifying ``DataFrame`` with duplicate Index (:issue:`17105`)


Bug in performing in-place operations on a DataFrame with a duplicate Index.

jreback · 2018-05-03T10:24:16Z

pandas/tests/indexing/test_indexing.py

+
+
+def test_modify_with_duplicate_index():
+    trange = pd.date_range(start=pd.Timestamp(year=2017, month=1, day=1),


can you add the issue number here. move the test to test_loc.py (same dir)

jreback · 2018-05-03T10:24:50Z

pandas/tests/indexing/test_indexing.py

+
+    # modify the value for the duplicate index entry
+    df.loc[trange[bool_idx], "A"] = 7
+


use assert_frame_equal; construct the expected frame and compare

jreback · 2018-05-03T10:25:56Z

pandas/tests/indexing/test_indexing.py

+    df = pd.DataFrame(0, index=trange, columns=["A", "B"])
+    bool_idx = np.array([False, False, False, False, False, True])
+
+    # modify the value for the duplicate index entry


this is fine as a test, but please add another case like the original issue (e.g. +=)
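The two requests above (compare against an expected frame, and cover the in-place case) could look roughly like the following. This is a sketch, not the test that was merged: the test body is truncated in this thread, so the way the duplicate timestamp is created, the end date, and the variable names `expected` and `df_inplace` are assumptions.

```python
import numpy as np
import pandas as pd

# Assumed setup: five daily timestamps, with the last one repeated once.
trange = pd.date_range(start="2017-01-01", end="2017-01-05")
trange = trange.insert(len(trange), pd.Timestamp("2017-01-05"))

# Select only the final (duplicated) label.
bool_idx = np.array([False, False, False, False, False, True])

# Label-based selection hits both rows that share the duplicated timestamp.
expected = pd.DataFrame(
    {"A": [0, 0, 0, 0, 7, 7], "B": [0, 0, 0, 0, 0, 0]}, index=trange
)

# Case 1: plain assignment through .loc
df = pd.DataFrame(0, index=trange, columns=["A", "B"])
df.loc[trange[bool_idx], "A"] = 7
pd.testing.assert_frame_equal(df, expected)

# Case 2: in-place operation through .loc, as in the original issue
df_inplace = pd.DataFrame(0, index=trange, columns=["A", "B"])
df_inplace.loc[trange[bool_idx], "A"] += 7
pd.testing.assert_frame_equal(df_inplace, expected)
```

Building the expected frame explicitly makes the intended behavior visible: both rows carrying the duplicated label are updated, and column B is untouched.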

fersarr · 2018-05-03T15:21:53Z

thanks! added the requested changes, hopefully it's better now :)

jreback · 2018-05-04T10:08:24Z

moved the test around, will merge on green.

jreback · 2018-05-08T00:23:39Z

thanks @fersarr

jreback requested changes May 3, 2018

jreback added the Bug and Indexing (related to indexing on series/frames, not to indexes themselves) labels May 3, 2018

BUG: cant modify df with duplicate index (pandas-dev#17105)

094761f

fersarr force-pushed the master branch from f9694d2 to 094761f Compare May 3, 2018 15:20

jreback added 3 commits May 4, 2018 06:03

Merge branch 'master' into PR_TOOL_MERGE_PR_20939

38b9ed4

clean up tests

0a72675

move

cf62ea3

jreback added this to the 0.23.0 milestone May 4, 2018

jreback approved these changes May 4, 2018

Merge branch 'master' into PR_TOOL_MERGE_PR_20939

cf5ec7d

jreback merged commit d15c104 into pandas-dev:master May 8, 2018

jreback added a commit to jreback/pandas that referenced this pull request May 8, 2018

COMPAT: 32-bit indexing compat

2bb5985

xref pandas-dev#20939

jreback added a commit to jreback/pandas that referenced this pull request May 8, 2018

COMPAT: 32-bit indexing compat

09c1381

xref pandas-dev#20939

jreback mentioned this pull request May 8, 2018

COMPAT: 32-bit indexing compat #20977

Merged

jreback added a commit that referenced this pull request May 8, 2018

COMPAT: 32-bit indexing compat (#20977)

3dd90a2

xref #20939
