Skip to content

BUG: np array indexer modifed in iloc #21867

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jul 14, 2018

Conversation

ydovzhenko
Copy link
Contributor

@@ -2586,6 +2587,7 @@ def maybe_convert_indices(indices, n):
IndexError : one of the converted indices either exceeded the number
of elements (specified by `n`) OR was still negative.
"""
indices = deepcopy(indices)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done down on line 2601? Avoid a copy if at all possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok moving deepcopy to 2601 makes sense; will do. Can't see a way to avoid copying altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to just use .copy() instead of importing deepcopy, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, true. Will change to .copy()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can get rid of this now

@jbrockmendel
Copy link
Member

jbrockmendel commented Jul 11, 2018

Pls implement a test for this.

@codecov
Copy link

codecov bot commented Jul 11, 2018

Codecov Report

Merging #21867 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21867      +/-   ##
==========================================
+ Coverage   91.91%   91.91%   +<.01%     
==========================================
  Files         164      164              
  Lines       49992    49993       +1     
==========================================
+ Hits        45951    45952       +1     
  Misses       4041     4041
Flag Coverage Δ
#multiple 90.3% <100%> (ø) ⬆️
#single 42.16% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexing.py 93.73% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 365eac4...4ce3a47. Read the comment docs.

@pep8speaks
Copy link

pep8speaks commented Jul 11, 2018

Hello @ydovzhenko! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on July 13, 2018 at 19:38 Hours UTC

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a whatsnew for 0.24.0 bug fixes in indexing.

@@ -2596,6 +2598,8 @@ def maybe_convert_indices(indices, n):

mask = indices < 0
if mask.any():
# indices is np.array, so we don't use deepcopy
indices = np.copy(indices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use indices.copy()

@@ -2596,6 +2598,8 @@ def maybe_convert_indices(indices, n):

mask = indices < 0
if mask.any():
# indices is np.array, so we don't use deepcopy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need the comment

@@ -126,6 +126,24 @@ def test_iloc_getitem_neg_int(self):
typs=['labels', 'mixed', 'ts', 'floats', 'empty'],
fails=IndexError)

def test_iloc_array_not_mutating_negative_indices(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number here as a comment

@jreback jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves labels Jul 12, 2018
@@ -126,6 +126,24 @@ def test_iloc_getitem_neg_int(self):
typs=['labels', 'mixed', 'ts', 'floats', 'empty'],
fails=IndexError)

def test_iloc_array_not_mutating_negative_indices(self):

# Test for issue 21867. Assert that iloc does not modify index array if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can simplify this to just GH 21867

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean change the function name or the comment? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment

'C': [106, 107, 108]},
index=[1, 2, 3])
df.iloc[array_with_neg_numbers]
assert np.all(array_with_neg_numbers == array_copy)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use tm.assert_numpy_array_equal instead both here and below

@@ -2586,6 +2587,7 @@ def maybe_convert_indices(indices, n):
IndexError : one of the converted indices either exceeded the number
of elements (specified by `n`) OR was still negative.
"""
indices = deepcopy(indices)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can get rid of this now

@@ -1,4 +1,5 @@
# pylint: disable=W0223
from copy import deepcopy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor comments otherwise lgtm

@@ -363,8 +363,7 @@ Indexing
- ``DataFrame.__getitem__`` now accepts dictionaries and dictionary keys as list-likes of labels, consistently with ``Series.__getitem__`` (:issue:`21294`)
- Fixed ``DataFrame[np.nan]`` when columns are non-unique (:issue:`21428`)
- Bug when indexing :class:`DatetimeIndex` with nanosecond resolution dates and timezones (:issue:`11679`)

-
- Bug when indexing with a numpy array containing negative values, fixed so the array is not mutated (:issue:`21867`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be sure to capitalize NumPy? Can also reword to Bug where indexing with a Numpy array containing negative values would mutate the indexer

@@ -126,6 +126,23 @@ def test_iloc_getitem_neg_int(self):
typs=['labels', 'mixed', 'ts', 'floats', 'empty'],
fails=IndexError)

def test_iloc_array_not_mutating_negative_indices(self):

# GH 21867.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit but don't need the period


df.iloc[:, array_with_neg_numbers]
tm.assert_numpy_array_equal(array_with_neg_numbers, array_copy)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should only need one blank line here (think this will fail LINTing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find LINTing in the contributing guide. Can I LINT myself and how exactly do you guys do that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http://pandas-docs.github.io/pandas-docs-travis/contributing.html#python-pep8 has the quick version, but that doesn't check everything.

The actual command run during CI is LINT=1 ci/lint.sh, but that takes longer and may have additional dependencies.

https://travis-ci.org/pandas-dev/pandas/jobs/403382274#L2928 is the line causing this PR to fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake8 pandas/tests/indexing/test_iloc.py should work for you.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outside of the LINTing error this lgtm

@jreback jreback added this to the 0.24.0 milestone Jul 14, 2018
@jreback jreback merged commit dcd9e6c into pandas-dev:master Jul 14, 2018
@jreback
Copy link
Contributor

jreback commented Jul 14, 2018

thanks @ydovzhenko

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

np array indexer modified in iloc
6 participants