CI/TST: Flaky test test_union_different_types #46144

Open
mroeschke opened this issue Feb 25, 2022 · 2 comments
Labels
CI Continuous Integration Testing pandas testing functions or related to the test suite Unreliable Test Unit tests that occasionally fail

Comments

@mroeschke
Member

Any idea, @jbrockmendel, why this may be flaky with Index[Int64] and Index[bool]?

e.g. https://github.com/pandas-dev/pandas/runs/5328797564?check_suite_focus=true

=================================== FAILURES ===================================
2022-02-25T04:50:41.4443922Z ________________ test_union_different_types[bool-dtype-repeats] ________________
2022-02-25T04:50:41.4448476Z [gw1] linux -- Python 3.8.12 /usr/share/miniconda/envs/pandas-dev/bin/python
2022-02-25T04:50:41.4448693Z 
2022-02-25T04:50:41.4449020Z index_flat = Index([True, True, True, True, True, True, True, True, False, False], dtype='bool')
2022-02-25T04:50:41.4449456Z index_flat2 = Int64Index([0, 0, 1, 1, 2, 2], dtype='int64')
2022-02-25T04:50:41.4449923Z request = <FixtureRequest for <Function test_union_different_types[bool-dtype-repeats]>>
2022-02-25T04:50:41.4450345Z 
2022-02-25T04:50:41.4450509Z     def test_union_different_types(index_flat, index_flat2, request):
2022-02-25T04:50:41.4450830Z         # This test only considers combinations of indices
2022-02-25T04:50:41.4451076Z         # GH 23525
2022-02-25T04:50:41.4451270Z         idx1 = index_flat
2022-02-25T04:50:41.4451485Z         idx2 = index_flat2
2022-02-25T04:50:41.4451684Z     
2022-02-25T04:50:41.4451866Z         if (
2022-02-25T04:50:41.4452059Z             not idx1.is_unique
2022-02-25T04:50:41.4452283Z             and not idx2.is_unique
2022-02-25T04:50:41.4452542Z             and not idx2.is_monotonic_decreasing
2022-02-25T04:50:41.4452784Z             and idx1.dtype.kind == "i"
2022-02-25T04:50:41.4453032Z             and idx2.dtype.kind == "b"
2022-02-25T04:50:41.4453250Z         ) or (
2022-02-25T04:50:41.4453436Z             not idx2.is_unique
2022-02-25T04:50:41.4453654Z             and not idx1.is_unique
2022-02-25T04:50:41.4453903Z             and not idx1.is_monotonic_decreasing
2022-02-25T04:50:41.4454143Z             and idx2.dtype.kind == "i"
2022-02-25T04:50:41.4454381Z             and idx1.dtype.kind == "b"
2022-02-25T04:50:41.4454590Z         ):
2022-02-25T04:50:41.4454812Z             mark = pytest.mark.xfail(
2022-02-25T04:50:41.4455086Z                 reason="GH#44000 True==1", raises=ValueError, strict=False
2022-02-25T04:50:41.4455335Z             )
2022-02-25T04:50:41.4455557Z             request.node.add_marker(mark)
2022-02-25T04:50:41.4455760Z     
2022-02-25T04:50:41.4456010Z         common_dtype = find_common_type([idx1.dtype, idx2.dtype])
2022-02-25T04:50:41.4456252Z     
2022-02-25T04:50:41.4456419Z         warn = None
2022-02-25T04:50:41.4456646Z         if not len(idx1) or not len(idx2):
2022-02-25T04:50:41.4457239Z             pass
2022-02-25T04:50:41.4457431Z         elif (
2022-02-25T04:50:41.4457646Z             idx1.dtype.kind == "c"
2022-02-25T04:50:41.4457861Z             and (
2022-02-25T04:50:41.4458106Z                 idx2.dtype.kind not in ["i", "u", "f", "c"]
2022-02-25T04:50:41.4458377Z                 or not isinstance(idx2.dtype, np.dtype)
2022-02-25T04:50:41.4458612Z             )
2022-02-25T04:50:41.4458798Z         ) or (
2022-02-25T04:50:41.4458993Z             idx2.dtype.kind == "c"
2022-02-25T04:50:41.4459206Z             and (
2022-02-25T04:50:41.4459448Z                 idx1.dtype.kind not in ["i", "u", "f", "c"]
2022-02-25T04:50:41.4459716Z                 or not isinstance(idx1.dtype, np.dtype)
2022-02-25T04:50:41.4459947Z             )
2022-02-25T04:50:41.4460129Z         ):
2022-02-25T04:50:41.4460451Z             # complex objects non-sortable
2022-02-25T04:50:41.4460697Z             warn = RuntimeWarning
2022-02-25T04:50:41.4460902Z     
2022-02-25T04:50:41.4461155Z         any_uint64 = idx1.dtype == np.uint64 or idx2.dtype == np.uint64
2022-02-25T04:50:41.4461571Z         idx1_signed = is_signed_integer_dtype(idx1.dtype)
2022-02-25T04:50:41.4461875Z         idx2_signed = is_signed_integer_dtype(idx2.dtype)
2022-02-25T04:50:41.4462107Z     
2022-02-25T04:50:41.4462434Z         # Union with a non-unique, non-monotonic index raises error
2022-02-25T04:50:41.4462730Z         # This applies to the boolean index
2022-02-25T04:50:41.4462973Z         idx1 = idx1.sort_values()
2022-02-25T04:50:41.4463185Z         idx2 = idx2.sort_values()
2022-02-25T04:50:41.4463385Z     
2022-02-25T04:50:41.4463741Z         with tm.assert_produces_warning(warn, match="'<' not supported between"):
2022-02-25T04:50:41.4464034Z >           res1 = idx1.union(idx2)
2022-02-25T04:50:41.4464160Z 
2022-02-25T04:50:41.4464282Z pandas/tests/indexes/test_setops.py:99: 
2022-02-25T04:50:41.4464547Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2022-02-25T04:50:41.4464809Z pandas/core/indexes/base.py:3282: in union
2022-02-25T04:50:41.4465060Z     return left.union(right, sort=sort)
2022-02-25T04:50:41.4465332Z pandas/core/indexes/base.py:3292: in union
2022-02-25T04:50:41.4465591Z     result = self._union(other, sort=sort)
2022-02-25T04:50:41.4465853Z pandas/core/indexes/base.py:3342: in _union
2022-02-25T04:50:41.4466124Z     result = algos.union_with_duplicates(lvals, rvals)
2022-02-25T04:50:41.4466434Z pandas/core/algorithms.py:1850: in union_with_duplicates
2022-02-25T04:50:41.4466751Z     indexer += [i] * int(max(l_count.at[value], r_count.at[value]))
2022-02-25T04:50:41.4467026Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2022-02-25T04:50:41.4467172Z 
2022-02-25T04:50:41.4467264Z self = False    False
2022-02-25T04:50:41.4467464Z 0        False
2022-02-25T04:50:41.4467652Z dtype: bool
2022-02-25T04:50:41.4467758Z 
2022-02-25T04:50:41.4467832Z     @final
2022-02-25T04:50:41.4468031Z     def __nonzero__(self):
2022-02-25T04:50:41.4468251Z >       raise ValueError(
2022-02-25T04:50:41.4468513Z             f"The truth value of a {type(self).__name__} is ambiguous. "
2022-02-25T04:50:41.4468820Z             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
2022-02-25T04:50:41.4469054Z         )
2022-02-25T04:50:41.4469349Z E       ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
2022-02-25T04:50:41.4469577Z 
2022-02-25T04:50:41.4469689Z pandas/core/generic.py:1520: ValueError
2022-02-25T04:50:41.4470096Z ________________ test_union_different_types[repeats-bool-dtype] ________________
2022-02-25T04:50:41.4470542Z [gw1] linux -- Python 3.8.12 /usr/share/miniconda/envs/pandas-dev/bin/python
2022-02-25T04:50:41.4470741Z 
2022-02-25T04:50:41.4470932Z index_flat = Int64Index([0, 0, 1, 1, 2, 2], dtype='int64')
2022-02-25T04:50:41.4471375Z index_flat2 = Index([True, True, True, True, True, True, True, True, False, False], dtype='bool')
2022-02-25T04:50:41.4471953Z request = <FixtureRequest for <Function test_union_different_types[repeats-bool-dtype]>>
2022-02-25T04:50:41.4472187Z 
2022-02-25T04:50:41.4472342Z     def test_union_different_types(index_flat, index_flat2, request):
2022-02-25T04:50:41.4472646Z         # This test only considers combinations of indices
2022-02-25T04:50:41.4472886Z         # GH 23525
2022-02-25T04:50:41.4473090Z         idx1 = index_flat
2022-02-25T04:50:41.4473287Z         idx2 = index_flat2
2022-02-25T04:50:41.4473481Z     
2022-02-25T04:50:41.4473661Z         if (
2022-02-25T04:50:41.4473863Z             not idx1.is_unique
2022-02-25T04:50:41.4474073Z             and not idx2.is_unique
2022-02-25T04:50:41.4474328Z             and not idx2.is_monotonic_decreasing
2022-02-25T04:50:41.4474583Z             and idx1.dtype.kind == "i"
2022-02-25T04:50:41.4474810Z             and idx2.dtype.kind == "b"
2022-02-25T04:50:41.4475024Z         ) or (
2022-02-25T04:50:41.4475229Z             not idx2.is_unique
2022-02-25T04:50:41.4475435Z             and not idx1.is_unique
2022-02-25T04:50:41.4475795Z             and not idx1.is_monotonic_decreasing
2022-02-25T04:50:41.4476050Z             and idx2.dtype.kind == "i"
2022-02-25T04:50:41.4476293Z             and idx1.dtype.kind == "b"
2022-02-25T04:50:41.4476490Z         ):
2022-02-25T04:50:41.4476706Z             mark = pytest.mark.xfail(
2022-02-25T04:50:41.4477003Z                 reason="GH#44000 True==1", raises=ValueError, strict=False
2022-02-25T04:50:41.4477234Z             )
2022-02-25T04:50:41.4477461Z             request.node.add_marker(mark)
2022-02-25T04:50:41.4477679Z     
2022-02-25T04:50:41.4477919Z         common_dtype = find_common_type([idx1.dtype, idx2.dtype])
2022-02-25T04:50:41.4478167Z     
2022-02-25T04:50:41.4478355Z         warn = None
2022-02-25T04:50:41.4478569Z         if not len(idx1) or not len(idx2):
2022-02-25T04:50:41.4478787Z             pass
2022-02-25T04:50:41.4478977Z         elif (
2022-02-25T04:50:41.4479185Z             idx1.dtype.kind == "c"
2022-02-25T04:50:41.4479382Z             and (
2022-02-25T04:50:41.4479628Z                 idx2.dtype.kind not in ["i", "u", "f", "c"]
2022-02-25T04:50:41.4479913Z                 or not isinstance(idx2.dtype, np.dtype)
2022-02-25T04:50:41.4480131Z             )
2022-02-25T04:50:41.4480318Z         ) or (
2022-02-25T04:50:41.4480526Z             idx2.dtype.kind == "c"
2022-02-25T04:50:41.4480723Z             and (
2022-02-25T04:50:41.4480965Z                 idx1.dtype.kind not in ["i", "u", "f", "c"]
2022-02-25T04:50:41.4481248Z                 or not isinstance(idx1.dtype, np.dtype)
2022-02-25T04:50:41.4481463Z             )
2022-02-25T04:50:41.4481646Z         ):
2022-02-25T04:50:41.4481942Z             # complex objects non-sortable
2022-02-25T04:50:41.4482183Z             warn = RuntimeWarning
2022-02-25T04:50:41.4482370Z     
2022-02-25T04:50:41.4482623Z         any_uint64 = idx1.dtype == np.uint64 or idx2.dtype == np.uint64
2022-02-25T04:50:41.4482932Z         idx1_signed = is_signed_integer_dtype(idx1.dtype)
2022-02-25T04:50:41.4483220Z         idx2_signed = is_signed_integer_dtype(idx2.dtype)
2022-02-25T04:50:41.4483459Z     
2022-02-25T04:50:41.4483792Z         # Union with a non-unique, non-monotonic index raises error
2022-02-25T04:50:41.4484074Z         # This applies to the boolean index
2022-02-25T04:50:41.4484316Z         idx1 = idx1.sort_values()
2022-02-25T04:50:41.4484544Z         idx2 = idx2.sort_values()
2022-02-25T04:50:41.4484741Z     
2022-02-25T04:50:41.4485086Z         with tm.assert_produces_warning(warn, match="'<' not supported between"):
2022-02-25T04:50:41.4485377Z >           res1 = idx1.union(idx2)
2022-02-25T04:50:41.4485515Z 
2022-02-25T04:50:41.4485634Z pandas/tests/indexes/test_setops.py:99: 
2022-02-25T04:50:41.4485884Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2022-02-25T04:50:41.4486144Z pandas/core/indexes/base.py:3282: in union
2022-02-25T04:50:41.4486403Z     return left.union(right, sort=sort)
2022-02-25T04:50:41.4486663Z pandas/core/indexes/base.py:3292: in union
2022-02-25T04:50:41.4486987Z     result = self._union(other, sort=sort)
2022-02-25T04:50:41.4487257Z pandas/core/indexes/base.py:3342: in _union
2022-02-25T04:50:41.4487541Z     result = algos.union_with_duplicates(lvals, rvals)
2022-02-25T04:50:41.4487840Z pandas/core/algorithms.py:1850: in union_with_duplicates
2022-02-25T04:50:41.4488152Z     indexer += [i] * int(max(l_count.at[value], r_count.at[value]))
2022-02-25T04:50:41.4488435Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2022-02-25T04:50:41.4488580Z 
2022-02-25T04:50:41.4488672Z self = 0        False
2022-02-25T04:50:41.4488867Z False    False
2022-02-25T04:50:41.4489082Z dtype: bool
2022-02-25T04:50:41.4489210Z 
2022-02-25T04:50:41.4489290Z     @final
2022-02-25T04:50:41.4489482Z     def __nonzero__(self):
2022-02-25T04:50:41.4489715Z >       raise ValueError(
2022-02-25T04:50:41.4490003Z             f"The truth value of a {type(self).__name__} is ambiguous. "
2022-02-25T04:50:41.4490429Z             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
2022-02-25T04:50:41.4490774Z         )
2022-02-25T04:50:41.4491100Z E       ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
@mroeschke mroeschke added Testing pandas testing functions or related to the test suite CI Continuous Integration Unreliable Test Unit tests that occasionally fail labels Feb 25, 2022
@jbrockmendel
Member

At one point I had an xfail(strict=False) for this, but I must have removed it prematurely.

@mroeschke
Member Author

Looks like the xfail with strict=False is still there, but the if condition guarding it is too strict, so the marker is not always applied.
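To illustrate why the guard can miss the failing case, here is a minimal plain-Python sketch that evaluates the bool/int branch of the condition on the values shown in the log. The helpers `is_unique`, `is_monotonic_decreasing` and `xfail_applies` are hypothetical stand-ins for the corresponding pandas `Index` properties and for the test's `if`, not pandas code. The sketch assumes the monotonic check is non-strict, meaning equal neighbours count as decreasing.

```python
# Hypothetical plain-Python stand-ins for Index.is_unique and
# Index.is_monotonic_decreasing; these are not pandas functions.
def is_unique(values):
    return len(set(values)) == len(values)


def is_monotonic_decreasing(values):
    # Non-strict check: equal neighbours are allowed.
    return all(a >= b for a, b in zip(values, values[1:]))


def xfail_applies(bool_values, int_values):
    # Mirrors the value-dependent part of the condition in
    # test_union_different_types (the dtype.kind checks are omitted).
    return (
        not is_unique(bool_values)
        and not is_unique(int_values)
        and not is_monotonic_decreasing(bool_values)
    )


# Values copied from the failing run in the log above.
logged_bool = [True] * 8 + [False] * 2
logged_int = [0, 0, 1, 1, 2, 2]

# Trues followed by Falses is monotonic decreasing, so the marker is skipped.
marked = xfail_applies(logged_bool, logged_int)

# An interleaved bool index is not monotonic, so the marker would be added.
marked_interleaved = xfail_applies([True, False, True, False], logged_int)

print(marked, marked_interleaved)
```

Since the test calls `sort_values()` on both indexes before `union`, the original ordering of the bool index probably should not decide whether the xfail is applied. Dropping the `is_monotonic_decreasing` part of the guard would be one way to loosen it.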

When #44000 is solved this issue can be closed too.
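For context on the "True==1" reason attached to the xfail, a small plain-Python sketch of the underlying equality follows. It only shows standard Python behaviour. How pandas resolves such labels internally is not shown here, and the link to the two-row Series printed as `self` in the traceback is an assumption.

```python
# In Python, bool is a subclass of int, so False and 0 compare equal
# and hash to the same value (likewise True and 1).
equal_labels = (False == 0) and (True == 1)
same_hash = hash(False) == hash(0)

# Any hash-based container therefore treats them as one key.
counts = {}
for label in [False, 0, 0, True, 1]:
    counts[label] = counts.get(label, 0) + 1

# Two keys remain: one for False/0 and one for True/1.
print(equal_labels, same_hash, len(counts))
```

If a label lookup for `False` or `0` matches both rows of a counts Series, the lookup would return a Series rather than a scalar, and `max()` on such values would hit the ambiguous truth value error seen in the log.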
