Use fixtures in pandas/tests/base #32046

SaturnFromTitan · 2020-02-17T00:50:11Z

tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff

Note: The diff is a bit inflated. Most changes are just indentation because for loops are replaced by a parametrized fixture.
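
As a rough illustration of why the diff is mostly de-indentation, here is a minimal sketch of the same check written first as a for loop and then against a parametrized fixture. `SAMPLE_OBJS`, `sample_obj`, and both test names are hypothetical and are not the fixtures this PR adds to pandas/conftest.py.

```python
import pytest

# Hypothetical stand-ins for the objects the real tests iterate over.
SAMPLE_OBJS = [[1, 2, 3], ["a", "b"], []]


# Before: one test loops over every object, so the body sits one level
# deeper and the first failing object hides the remaining cases.
def test_len_matches_iteration_loop():
    for obj in SAMPLE_OBJS:
        assert len(obj) == sum(1 for _ in obj)


# After: pytest runs the test once per param and reports each case separately.
@pytest.fixture(params=SAMPLE_OBJS, ids=["ints", "strs", "empty"])
def sample_obj(request):
    return request.param


def test_len_matches_iteration(sample_obj):
    # Same assertion as above, de-indented by one level.
    assert len(sample_obj) == sum(1 for _ in sample_obj)
```

The assertion itself is unchanged between the two versions; only the loop line disappears and the body moves left, which is the pattern that inflates the diff.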

pandas/conftest.py

WillAyd

looks generally good (I think the naming of the fixture is a little strange, though admittedly don't have anything better to offer atm)

A few questions

pandas/tests/base/test_ops.py

SaturnFromTitan · 2020-02-20T21:06:51Z

I think I addressed all your comments here @WillAyd

WillAyd · 2020-02-20T22:13:24Z

pandas/tests/indexes/common.py

        first = indices[2:]
        second = indices[:4]
-        answer = indices[4:]
+        if isinstance(indices, CategoricalIndex) or indices.is_boolean():


Is this change related?

Yes, it is. To make the indices fixture more useful, I increased its length from 2 to 10 entries, which changes the expected value here. In the same way, I could provide a proper "expected" for CategoricalIndex instead of just silently skipping it.

jbrockmendel · 2020-02-21T17:08:07Z

pandas/tests/base/test_ops.py

+    def test_none_comparison(self, series_with_differing_index):
+        o = series_with_differing_index
+        if isinstance(o.index, IntervalIndex):
+            pytest.skip("IntervalIndex is immutable")


what does immutability have to do with this? all indexes should be immutable

I'll have to check. I think the IntervalIndex doesn't implement .loc or something like that.

Just a hunch: I'd imagine it's like the datetime64 case, where it raises for inequalities (in which case it can be added to the special case for datetime64 below).

No, it already fails earlier, on line 149, at o[0] = np.nan. The assignment fails with

AttributeError: 'pandas._libs.interval.IntervalTree' object has no attribute 'get_loc'

I added a comment and adjusted the skip message. Please let me know if you have a better idea

pandas/tests/base/test_ops.py

jbrockmendel · 2020-02-21T17:17:54Z

@SaturnFromTitan if it's easy for you, can you point out the sections of the diff that are effectively just de-indentation? That should make it easier to focus the review.

SaturnFromTitan · 2020-02-23T12:46:30Z

pandas/tests/base/test_ops.py

@@ -137,227 +137,238 @@ def setup_method(self, method):
        self.is_valid_objs = self.objs
        self.not_valid_objs = []

-    def test_none_comparison(self):
+    def test_none_comparison(self, f8series_any_simple_index):


Fixed indentation and added skips

SaturnFromTitan · 2020-02-23T12:48:45Z

pandas/tests/base/test_ops.py


-            assert o.nunique() == len(np.unique(o.values))
+        # dropna=True would break for MultiIndex
+        assert o.nunique(dropna=False) == len(np.unique(o.values))


added the dropna=False and a comment explaining why

SaturnFromTitan · 2020-02-23T12:50:50Z

pandas/tests/base/test_ops.py


-    def test_ndarray_compat_properties(self):
+    def test_ndarray_compat_properties(self, index_or_series_obj):


Just fixed indentation

SaturnFromTitan · 2020-02-23T12:51:23Z

pandas/tests/base/test_ops.py

-
-            expected_s = Series(
-                range(10, 0, -1), index=expected_index, dtype="int64", name="a"
+    def test_value_counts_unique_nunique(self, index_or_series_obj):


Fixed indentation and added xfail. All other changes are marked with comments

SaturnFromTitan · 2020-02-23T12:52:06Z

pandas/tests/base/test_ops.py

+        assert o.dtype == orig.dtype
+
+        expected_s = Series(
+            range(len(orig), 0, -1), index=expected_index, dtype="int64"


Made the length of range dynamic as not all values of index_or_series_obj share the same length

SaturnFromTitan · 2020-02-23T12:52:51Z

pandas/tests/base/test_ops.py

-            elif needs_i8_conversion(o):
-                values[0:2] = iNaT
-                values = o._shallow_copy(values)
+    def test_value_counts_unique_nunique_null(self, null_obj, index_or_series_obj):


Fixed indentation and added skips. All other changes are marked with comments

SaturnFromTitan · 2020-02-23T12:55:45Z

pandas/tests/base/test_ops.py

+        expected_data = list(range(num_values, 2, -1))
+        expected_data_na = expected_data.copy()
+        if expected_data_na:
+            expected_data_na.append(3)


length of expected_data is now dynamic

using append instead of inline if/else statement
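
A minimal sketch of the construction quoted above, wrapped in a hypothetical helper `expected_counts` so it can be run for different lengths. The reading of the trailing 3 as the count of the null bucket (the first two values, repeated once and twice, then replaced by nulls) is an inference from the surrounding diff, not something the thread states.

```python
def expected_counts(num_values):
    """Mirror the expected-data construction from the quoted diff lines."""
    # Counts for the non-null values: num_values, num_values - 1, ..., 3.
    expected_data = list(range(num_values, 2, -1))
    expected_data_na = expected_data.copy()
    # Only append the extra count when there is any data at all, so an
    # empty fixture value yields two empty lists.
    if expected_data_na:
        expected_data_na.append(3)
    return expected_data, expected_data_na


print(expected_counts(10))
print(expected_counts(0))
```

For a length of 10 this gives eight counts from 10 down to 3, plus one extra 3 in the second list; for a length of 0 both lists stay empty.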

SaturnFromTitan · 2020-02-23T12:57:21Z

pandas/tests/base/test_ops.py

+            assert result.dtype == orig.dtype
+
+        assert o.nunique() == max(0, num_values - 2)
+        assert o.nunique(dropna=False) == max(0, num_values - 1)


comparing against dynamic value depending on the length of the fixture value

using max(0, ...) so tests don't break with the empty index
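
To see why the clamp matters, here is a small pure-Python model of the situation, not the pandas test itself: take `num_values` distinct values, replace the first two with a null marker, and count the unique values with and without the null. The helper `unique_counts` and the use of `None` as the null marker are assumptions made for illustration.

```python
def unique_counts(num_values):
    """Return (unique non-null count, unique count including null)."""
    values = list(range(num_values))
    # Replace up to the first two entries with a null marker.
    values[0:2] = [None] * min(2, num_values)
    non_null = {v for v in values if v is not None}
    return len(non_null), len(set(values))


# Lengths chosen for illustration; the empty case is the one the clamp protects.
for n in (0, 2, 5, 10):
    assert unique_counts(n) == (max(0, n - 2), max(0, n - 1))

# Without max(0, ...) the empty case would expect negative counts.
print(unique_counts(0), (0 - 2, 0 - 1))
```

For the empty input the unclamped expressions would be -2 and -1, which no count can ever equal, so the assertions would fail for the empty index.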

SaturnFromTitan · 2020-02-23T12:58:24Z

I addressed all your comments @jbrockmendel
Also, I added comments for all real changes in the tests to aid the review

jreback

looks good, some comments. ping on green.

jreback · 2020-02-23T15:07:27Z

pandas/tests/base/test_ops.py

+        values = o._ndarray_values
+        num_values = len(orig)
+
+        if not allow_na_ops(o):


@WillAyd don't we have a way to avoid using skips inside the test itself and rather do this externally? I don't recall exactly

I think you can only do this via the pytest.mark.skip decorator. But this approach doesn't work here, because we need a condition on the parametrized fixture value. A quick Google search also didn't bring up anything better
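
A minimal sketch of the constraint described in this reply, using a hypothetical fixture `sample_values` rather than the PR's real fixtures. The skip condition depends on the fixture value, which only exists once the test is running, so the sketch calls `pytest.skip` inside the test body instead of decorating the test.

```python
import pytest


@pytest.fixture(params=[[1, 2, 3], []], ids=["non_empty", "empty"])
def sample_values(request):
    # Hypothetical fixture used only for this illustration.
    return request.param


def test_first_element_is_smallest(sample_values):
    # A @pytest.mark.skip decorator would skip every parametrization;
    # here only the parametrizations that meet the condition are skipped.
    if not sample_values:
        pytest.skip("empty input has no first element")
    assert sample_values[0] == min(sample_values)
```

With this layout pytest reports the "empty" case as skipped and the "non_empty" case as passed, which matches the in-test skips used in the diff above.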

jreback · 2020-02-23T15:07:59Z

pandas/tests/base/test_ops.py

+            expected_index.name = None
+            o = o.repeat(range(1, len(o) + 1))
+            o.name = "a"
+


future PR, let's break these huge tests up

SaturnFromTitan · 2020-02-23T17:00:11Z

@jreback CI is green and I addressed all your comments

jreback

going to merge, but there are a number of weird constructs / duplications here, pls address these as a followup (or open an issue)

jreback · 2020-02-23T17:03:14Z

pandas/tests/base/test_ops.py

+
+        # special assign to the numpy array
+        if is_datetime64tz_dtype(obj):
+            if isinstance(obj, DatetimeIndex):


this can be simplified; the else path should work for all

xref: #32205

jreback · 2020-02-23T17:03:32Z

pandas/tests/base/test_ops.py

-            if isinstance(o, (DatetimeIndex, PeriodIndex)):
-                expected_index = o.copy()
-                expected_index.name = None
+        elif needs_i8_conversion(obj):


this is also duplicative here

jreback · 2020-02-23T17:04:41Z

pandas/tests/base/test_ops.py

+            obj = klass(values.repeat(range(1, len(obj) + 1)))
+            obj.name = "a"
+        else:
+            if isinstance(obj, DatetimeIndex):


this can't possibly be true (as the first if already catches this)

jreback · 2020-02-23T17:06:58Z

thanks. ideally very targeted PRs are best as we can merge quickly.

SaturnFromTitan · 2020-02-23T17:16:49Z

That's the plan, @jreback. I wanted to focus on fixturizing the tests before refactoring.

Seeing how this PR went, it apparently makes more sense to refactor right away and go test by test instead ✌️

started to use fixtures in TestIndexOps tests

913441b

jbrockmendel reviewed Feb 17, 2020

pandas/conftest.py

SaturnFromTitan added 3 commits February 17, 2020 21:40

updated docstring of indices fixture

1deb305

Merge branch 'master' into fixturize-test-base

d609b20

Merge branch 'master' into fixturize-test-base

20b1c2a

SaturnFromTitan requested a review from jreback February 18, 2020 22:50

WillAyd reviewed Feb 19, 2020

pandas/tests/base/test_ops.py

WillAyd added the Testing pandas testing functions or related to the test suite label Feb 19, 2020

WillAyd reviewed Feb 20, 2020
