Skip to content

MAINT: Remove np.array_equal calls in tests #18047

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 2, 2017

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Oct 31, 2017

Preferable to use tm.assert_numpy_array_equal instead.

@gfyoung gfyoung added the Testing pandas testing functions or related to the test suite label Oct 31, 2017
@gfyoung gfyoung added this to the 0.21.1 milestone Oct 31, 2017
@gfyoung
Copy link
Member Author

gfyoung commented Oct 31, 2017

A couple of questions going forward:

  1. There are several instances where they used np.array_equal to check non-array inequality. Would it be useful to create a method called tm.assert_numpy_array_not_equal for this case?

  2. tm.assert_numpy_array_equal enforces that we must be comparing two numpy arrays, which makes it slightly cumbersome at times to compare two arrays (e.g. they are lists). Perhaps we could take a page out of numpy book and create a function called tm.assert_array_equal ?

@gfyoung gfyoung force-pushed the numpy-array-equal-remove branch from c5b4ada to 4d21ad1 Compare October 31, 2017 06:26
@jreback
Copy link
Contributor

jreback commented Oct 31, 2017

There are several instances where they used np.array_equal to check non-array inequality. Would it be useful to create a method called tm.assert_numpy_array_not_equal for this case?

this has come up a couple of times, sure I guess that would be ok

tm.assert_numpy_array_equal enforces that we must be comparing two numpy arrays, which makes it slightly cumbersome at times to compare two arrays (e.g. they are lists). Perhaps we could take a page out of numpy book and create a function called tm.assert_array_equal ?

no, list comparison is directly handled by assert

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to add np.array_equal to the black-list in ci/lint.sh (near the end), we check for using black-listed functions and fail the build.

@@ -22,7 +22,10 @@

def eq_gen_range(kwargs, expected):
rng = generate_range(**kwargs)
assert (np.array_equal(list(rng), expected))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this function and just directly compare these (only 3 uses)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

expected = {0: ['a', 'd', 'g', 'l'],
1: ['b', 'e', 'h', 'm'],
2: ['c', 'f', 'i', 'n']}
expected = {0: np.array(['a', 'd', 'g', 'l'], dtype=object),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assert can do this; don't convert to a numpy array just to compare

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output of Textreader.read is an array, so would need to convert.

@@ -225,7 +225,7 @@ def test_escapechar(self):
reader = TextReader(StringIO(data), delimiter=',', header=None,
escapechar='\\')
result = reader.read()
expected = {0: ['"hello world"'] * 3}
expected = {0: np.array(['"hello world"'] * 3, dtype=object)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@gfyoung
Copy link
Member Author

gfyoung commented Oct 31, 2017

need to add np.array_equal to the black-list in ci/lint.sh (near the end)

There are many instances of this method being called in the source code...I think it shouldn't be allowed in tests BUT be careful with source code though.

@jreback
Copy link
Contributor

jreback commented Oct 31, 2017

There are many instances of this method being called in the source code...I think it shouldn't be allowed in tests BUT be careful with source code though.

that line in lint should be

grep -r -E --include '*.py' --exclude testing.py '(numpy|np)\.testing' pandas\tests
(and have to add in the check for array_equal in the regex)

@jreback
Copy link
Contributor

jreback commented Oct 31, 2017

e.g.

(pandas) bash-3.2$   grep -r -E --include '*.py' --exclude testing.py '(numpy|np)\.testing|(numpy|np)\.array_equal' pandas/tests
pandas/tests/dtypes/test_cast.py:        assert (np.array_equal(result, arr))
pandas/tests/dtypes/test_cast.py:        assert (np.array_equal(result, expected))

@gfyoung gfyoung force-pushed the numpy-array-equal-remove branch from 4d21ad1 to c7d2e38 Compare November 1, 2017 04:42
@codecov
Copy link

codecov bot commented Nov 1, 2017

Codecov Report

Merging #18047 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18047      +/-   ##
==========================================
- Coverage   91.25%   91.23%   -0.02%     
==========================================
  Files         163      163              
  Lines       50115    50114       -1     
==========================================
- Hits        45730    45720      -10     
- Misses       4385     4394       +9
Flag Coverage Δ
#multiple 89.04% <ø> (-0.01%) ⬇️
#single 40.24% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️
pandas/tslib.py 100% <0%> (ø) ⬆️
pandas/compat/pickle_compat.py 75.6% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 881ee30...c7d2e38. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 1, 2017

Codecov Report

Merging #18047 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18047      +/-   ##
==========================================
- Coverage   91.25%   91.23%   -0.03%     
==========================================
  Files         163      163              
  Lines       50120    50115       -5     
==========================================
- Hits        45737    45721      -16     
- Misses       4383     4394      +11
Flag Coverage Δ
#multiple 89.04% <ø> (-0.01%) ⬇️
#single 40.23% <ø> (-0.15%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/io/html.py 85.17% <0%> (-0.81%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️
pandas/core/groupby.py 92.03% <0%> (-0.01%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 15fa4bd...902f4e9. Read the comment docs.

@gfyoung gfyoung force-pushed the numpy-array-equal-remove branch from c7d2e38 to 9adf991 Compare November 1, 2017 09:27
@jreback jreback modified the milestones: 0.21.1, 0.22.0 Nov 1, 2017
@jreback
Copy link
Contributor

jreback commented Nov 1, 2017

@gfyoung can you do this so we can ensure these don't appear going forward

@gfyoung
Copy link
Member Author

gfyoung commented Nov 1, 2017

@jreback : There are several instances that I did not remove yet because I want to do a more thorough refactoring in light of my comment above

I'll do that in a follow-up PR if that works.

@gfyoung gfyoung force-pushed the numpy-array-equal-remove branch from 9adf991 to 902f4e9 Compare November 1, 2017 17:11
@jreback jreback merged commit 194dbff into pandas-dev:master Nov 2, 2017
@jreback
Copy link
Contributor

jreback commented Nov 2, 2017

that sounds good @gfyoung thanks!

@gfyoung gfyoung deleted the numpy-array-equal-remove branch November 2, 2017 16:06
@gfyoung gfyoung modified the milestones: 0.22.0, 0.21.1 Nov 3, 2017
jreback added a commit to jreback/pandas that referenced this pull request Nov 3, 2017
jreback added a commit that referenced this pull request Nov 3, 2017
ghost pushed a commit to reef-technologies/pandas that referenced this pull request Nov 3, 2017
ghost pushed a commit to reef-technologies/pandas that referenced this pull request Nov 3, 2017
1kastner pushed a commit to 1kastner/pandas that referenced this pull request Nov 5, 2017
1kastner pushed a commit to 1kastner/pandas that referenced this pull request Nov 5, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Dec 8, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Dec 8, 2017
TomAugspurger pushed a commit that referenced this pull request Dec 11, 2017
TomAugspurger pushed a commit that referenced this pull request Dec 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants