TST: Remove even more uses np.array_equal in tests #18087

gfyoung · 2017-11-03T05:37:40Z

Implement assert_not as a way to check that assertions should fail (for methods more sophisticated than a simple bare assert). Also takes the opportunity to remove several more np.array_equal assertions.
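For readers who have not seen the diff, a helper of this kind could look roughly like the sketch below. This is an illustration only, not the code from this PR: the body, the error message, and the use of `np.testing.assert_array_equal` in place of pandas' in-house `assert_*` functions are all assumptions.

```python
import numpy as np


def assert_not(assert_func, *args, **kwargs):
    """Pass only if ``assert_func(*args, **kwargs)`` raises AssertionError.

    Hypothetical helper for illustration; not the implementation from this PR.
    """
    try:
        assert_func(*args, **kwargs)
    except AssertionError:
        return  # the wrapped assertion failed, which is the expected outcome
    # Raised outside the try block so it is not swallowed by the except clause.
    raise AssertionError(
        "{name} passed but was expected to fail".format(
            name=getattr(assert_func, "__name__", repr(assert_func))
        )
    )


# The arrays differ, so the wrapped assertion fails and assert_not passes.
assert_not(np.testing.assert_array_equal, np.array([1, 2]), np.array([1, 3]))
```

Because the wrapped assertion is passed in as a callable, the same wrapper works for any `assert_*` function without adding a parameter to each of them.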

Follow-up to #18047

codecov · 2017-11-03T07:15:58Z

Codecov Report

Merging #18087 into master will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18087      +/-   ##
==========================================
- Coverage   91.27%   91.23%   -0.05%     
==========================================
  Files         163      163              
  Lines       50120    50120              
==========================================
- Hits        45749    45728      -21     
- Misses       4371     4392      +21

Flag	Coverage Δ
#multiple	`89.04% <ø> (-0.03%)`	⬇️
#single	`40.32% <ø> (-0.06%)`	⬇️

Impacted Files	Coverage Δ
pandas/util/testing.py	`100% <ø> (ø)`	⬆️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/plotting/_converter.py	`63.38% <0%> (-1.82%)`	⬇️
pandas/core/frame.py	`97.75% <0%> (-0.1%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b4375bd...ac69b46.

codecov · 2017-11-03T07:16:06Z

Codecov Report

Merging #18087 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18087      +/-   ##
==========================================
- Coverage    91.4%   91.38%   -0.02%     
==========================================
  Files         164      164              
  Lines       49880    49880              
==========================================
- Hits        45592    45583       -9     
- Misses       4288     4297       +9

Flag	Coverage Δ
#multiple	`89.19% <ø> (ø)`	⬆️
#single	`39.42% <ø> (-0.07%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/core/frame.py	`97.8% <0%> (-0.1%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 148ed63...85ba491.

jreback · 2017-11-03T23:13:42Z

pandas/tests/frame/test_operators.py

@@ -237,7 +237,7 @@ def test_modulo(self):
        s = p[0]
        res = s % p
        res2 = p % s
-        assert not np.array_equal(res.fillna(0), res2.fillna(0))


so another way to do this, and maybe a cleaner one, is to have a parameter compare='equal'|'not_equal'
The main issue I have with this method is that the error messages are not as good (though maybe it doesn't matter)

gfyoung · 2017-11-03

IIUC, wouldn't I have to propagate that parameter across every single assert_* function? My change allows any assert_* function to be passed in.

Also, I'm not sure I follow what error message you would expect. If the assert_* function passes when it isn't supposed to, you can't specify what was "different" in this case.

gfyoung · 2017-11-04

@jreback : I explained above why I'm not really a fan of passing a compare parameter to all of these assert_* functions. If you have any other thoughts on this point, let me know. Otherwise, I think this should be good to go.

gfyoung · 2017-11-05

@jreback : Any thoughts?

jreback · 2017-11-06T13:47:39Z

cc @jorisvandenbossche @TomAugspurger

gfyoung · 2017-11-10T17:04:57Z

@jorisvandenbossche @TomAugspurger : Any thoughts?

TomAugspurger · 2017-11-13T14:58:01Z

Yeah, this seems good.

jorisvandenbossche · 2017-11-13T15:21:19Z

Is there a reason we can't use assert not s.equals(s2) ?

gfyoung · 2017-11-13T16:43:51Z

@jorisvandenbossche : Hmm...that's a good point. How comprehensive is it vs. tm.assert_series_equal ?

gfyoung · 2017-11-13T16:44:36Z

In any case, I'm still in favor of using the method I have developed in cases where our equality checking isn't as strict (e.g. check_dtype=False)

jorisvandenbossche · 2017-11-13T17:36:55Z

equals only checks whether the data are equal, not the attributes (e.g. not the Series name), so it is less strict than assert_series_equal.
But I don't think that this is a problem in this case. If a different name is the reason you want assert_series_equal to fail, you shouldn't use this, but just explicitly assert that the names are not equal.

So personally I don't find that an argument to have assert_not(assert_series_equal(..)) (but there might be other arguments)
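The difference in strictness can be seen in a small sketch. It assumes the public `pd.testing.assert_series_equal` (the thread refers to it as `tm.assert_series_equal`) and made-up Series values.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], name="a")
s2 = pd.Series([1, 2, 3], name="b")

# Series.equals compares the data, so the differing name does not matter here.
data_equal = s1.equals(s2)

# assert_series_equal also checks the name by default, so it raises.
try:
    pd.testing.assert_series_equal(s1, s2)
    strict_equal = True
except AssertionError:
    strict_equal = False

# The explicit form: state exactly which attribute is expected to differ.
assert s1.name != s2.name
```

Here `data_equal` ends up true while `strict_equal` ends up false, and the final assertion documents the reason directly instead of hiding it behind a negated equality check.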

gfyoung · 2017-11-13T21:24:35Z

@jorisvandenbossche : Fair point. This function is meant to be a catch-all for any of our in-house assert_* functions, and initially it was meant to address our not having a counterpart for assert_numpy_array_equal.

That being said, assert_not(assert_series_equal(...)) could indeed be simplified as you suggested, and I can make that change later today.

jreback · 2017-11-14T13:28:04Z

lgtm. @jorisvandenbossche ?

jorisvandenbossche · 2017-11-14T14:50:34Z

I am still not sure adding an assert_not to our own testing addons is worth it for the current single use case (and even for that one, I personally find assert not np.array_equal(df.values, b_c) much more natural to read than tm.assert_not(tm.assert_numpy_array_equal, df.values, b_c))

gfyoung · 2017-11-14T16:19:23Z

@jorisvandenbossche : The whole point is not to use np.array_equal in the first place, as our testing methods with assert_numpy_array_equal are richer. This was a more generalized solution rather than writing a one-off solution of assert_not_numpy_array_equal.

In any case, it just so happened that Series equality checking was sufficient in many of the cases where we were using np.array_equal, but this construct that I created can be used for those cases (albeit not so frequent) where we might want to check that a certain assert should fail.

gfyoung · 2017-11-14T16:24:44Z

@jorisvandenbossche : If you haven't done so already, have a look at #18047, where this whole discussion started around array_equal.

jorisvandenbossche · 2017-11-14T17:11:56Z

I didn't see that, but that issue does not seem to give a reasoning for the need to remove array_equal?
And if that is the goal, I would use np.allclose

gfyoung · 2017-11-14T19:12:14Z

> I didn't see that, but that issue does not seem to give a reasoning for the need to remove array_equal?

I think the idea (and @jreback feel free to jump in) is that we were avoiding explicitly using the numpy API for our testing purposes and only using in-house comparisons during testing. That's the emphasis that I have seen (and received feedback on over the course of previous PRs).

As written, np.array_equal works to get the tests passing, but we would rather check with a different function than that.

jorisvandenbossche · 2017-11-14T20:21:37Z

I am not sure if that is the general direction, eg with the move to pytest we actually moved a bit more away from our in-house testing functionality (and you did some of those PRs to replace in-house methods with standard pytest functionality).
So I don't see why we are adding here new in-house functionality to maintain while there is a perfectly sensible solution with standard tools of numpy and pytest (unless there is a clear benefit of an in-house method, which is sometimes the case, but I don't see that here)

gfyoung · 2017-11-14T20:33:49Z

I'm fine with incorporating pytest tools in place of in-house ones. However, numpy tools in place of in-house ones, not so much. More often than not, we can use something better than array_equal when testing, as you can see in this PR and the one preceding it.

So if you have a suggestion for writing this in pytest idiom, I'm all for it. However, if we can't, then I'd rather do this in-house than rely on numpy tools.

jorisvandenbossche · 2017-11-14T20:41:51Z

> More often than not, we can use something better than array_equal when testing, as you can see in this PR and the one preceding it.

Can you then be more specific? What is 'bad' about array_equal in this specific use case?

> So if you have a suggestion for writing this in pytest idiom, I'm all for it. However, if we can't, then I'd rather do this in-house than rely on numpy tools.

I think assert not np.array_equal(..) or assert not np.allclose(..) is a perfect pytest idiom

gfyoung · 2017-11-14T21:10:08Z

@jorisvandenbossche : In this specific case, I'll give it to you that we actually don't need this paradigm at all, as the shapes of the arrays are completely different (just ran through the test manually), making this array_equal check kind of ridiculous.

So in that case, no need to implement assert_not for this PR, and we can table that discussion for another time if need be.

That being said, array_equal, similar to Series.equals, only checks values and shape, but not dtype, whereas assert_numpy_array_equal does check dtype.

jorisvandenbossche · 2017-11-14T21:16:10Z

> That being said, array_equal, similar to Series.equals, only checks values and shape, but not dtype, whereas assert_numpy_array_equal does check dtype.

Yes, but as I said above, IMO if the reason that you want assert_series/numpy_array_equal to fail is not because the values differ but because the dtype (or name, ..) differs, I find assert_not(assert_...()) very obscure and I think we should explicitly test that attribute.

jorisvandenbossche · 2017-11-14T21:17:56Z

So my reasoning: either the values differ -> simply use .equals / array_equal, or the values are equal but the dtype, name, ... differ -> test that explicitly
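The two cases in this reasoning can be sketched with plain NumPy and bare asserts. The array contents are made up for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.0, 3.0])
other = np.array([1, 2, 4])

# Case 1: the values differ, so a plain negated comparison is enough.
assert not np.array_equal(ints, other)

# Case 2: the values match and only the dtype differs.
# np.array_equal ignores dtype, so test that attribute explicitly instead.
assert np.array_equal(ints, floats)
assert ints.dtype != floats.dtype
```

Written this way, a failing test points at the attribute that was expected to differ rather than at a generic "assertion unexpectedly passed".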

gfyoung · 2017-11-15T17:19:25Z

@jorisvandenbossche : I just renamed the PR to just remove instances of "array_equal" from tests. Everything is green, so PTAL.

jreback · 2017-11-16T00:23:57Z

thanks @gfyoung

jreback · 2017-11-16T00:24:55Z

just for edification: we don't like np.array_equal because it doesn't work with NaNs; this is why we have array_equivalent, which you can use as well in tests.

gfyoung · 2017-11-16T04:57:11Z

> this is why we have array_equivalent

All this time, and I didn't know that existed. 😄 Should mention that from now on in case we have this situation down the road.

jorisvandenbossche · 2017-11-16T08:12:39Z

Thanks @gfyoung

> we don't like np.array_equal because it doesn't work with NaNs; this is why we have array_equivalent, which you can use as well in tests.

and on the numpy side, I think allclose is the 'replacement' for array_equal that also can handle NaNs
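A short NumPy-only sketch of the NaN behaviour under discussion. Note that `np.allclose` treats NaNs as equal only when `equal_nan=True` is passed. pandas' own `array_equivalent` is not shown here, and the array values are made up.

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, np.nan, 3.0])

# NaN != NaN elementwise, so array_equal reports a mismatch by default.
plain = np.array_equal(a, b)

# allclose treats NaNs in matching positions as equal only when asked to.
close_default = np.allclose(a, b)
close_nan = np.allclose(a, b, equal_nan=True)

# np.testing.assert_array_equal treats NaNs in matching positions as equal,
# so this call does not raise.
np.testing.assert_array_equal(a, b)
```

With these inputs `plain` and `close_default` are false and `close_nan` is true, which is why a bare `np.array_equal` can give misleading results on data containing missing values.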

gfyoung added the Testing pandas testing functions or related to the test suite label Nov 3, 2017

gfyoung added this to the 0.21.1 milestone Nov 3, 2017

jreback reviewed Nov 3, 2017

gfyoung force-pushed the not-numpy-array-equal branch from ac69b46 to f49afb9 Compare November 14, 2017 07:32

jreback modified the milestones: 0.21.1, 0.22.0 Nov 14, 2017

TST: Remove even more uses np.array_equal in tests

85ba491

gfyoung force-pushed the not-numpy-array-equal branch from f49afb9 to 85ba491 Compare November 15, 2017 05:18

gfyoung changed the title ~~TST: Implement assert_not in testing.py~~ TST: Remove even more uses np.array_equal in tests Nov 15, 2017

jreback merged commit 2a6023e into pandas-dev:master Nov 16, 2017

gfyoung deleted the not-numpy-array-equal branch November 16, 2017 04:55

gfyoung added a commit to forking-repos/pandas that referenced this pull request Nov 16, 2017

MAINT: Blacklist np.array_equal in tests

9cf1c97

Follow-up to pandas-devgh-18087.

gfyoung mentioned this pull request Nov 16, 2017

MAINT: Blacklist np.array_equal in tests #18318

Merged

gfyoung added a commit to forking-repos/pandas that referenced this pull request Nov 16, 2017

MAINT: Blacklist np.array_equal in tests

d3b84a2

Follow-up to pandas-devgh-18087.

gfyoung added a commit that referenced this pull request Nov 16, 2017

MAINT: Blacklist np.array_equal in tests (#18318)

a26b676

Follow-up to gh-18087. The remaining test that uses the function call uses functionality that doesn't exist anymore in pandas.