Skip to content

Added RuntimeWarning to lib.fast_unique_multiple #33015

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Apr 2, 2020

Conversation

ShaharNaveh
Copy link
Member

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@@ -39,6 +40,11 @@ def test_fast_unique_multiple_list_gen_sort(self):
out = lib.fast_unique_multiple_list_gen(gen, sort=False)
tm.assert_numpy_array_equal(np.array(out), expected)

def test_fast_unique_multiple_unsortable_runtimewarning(self):
idx = pd.MultiIndex.from_product([[1, pd.Timestamp("2000")], ["foo", "bar"]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you target this test more specifically at the affected function? maybe pass np.array([Timestamp(...), "foo"])?

It's not a rule/policy, but ideally the intra-pandas dependencies of test_lib would be limited to the dependencies of lib.

@ShaharNaveh
Copy link
Member Author

ShaharNaveh commented Mar 25, 2020

Looks like at L340, it triggers the warning, but only at the AP Linux py37_np_dev

def test_union_sort_other_incomparable():
# https://github.com/pandas-dev/pandas/issues/24959
idx = pd.MultiIndex.from_product([[1, pd.Timestamp("2000")], ["a", "b"]])
# default, sort=None
result = idx.union(idx[:1])
tm.assert_index_equal(result, idx)
# sort=False
result = idx.union(idx[:1], sort=False)
tm.assert_index_equal(result, idx)


@jbrockmendel Does having something like this:

from pandas.compat.numpy import _is_numpy_dev

...

if _is_numpy_dev:
    with tm.assert_produces_warning(RuntimeWarning):
        result = idx.union(idx[:1]) 
else:
    result = idx.union(idx[:1]) 

is the correct approach?

@jbrockmendel
Copy link
Member

Usually we would write

warn = RuntimeWarning if _is_numpy_dev else None
with tm.assert_produces_warning(warn):
    ...

def test_fast_unique_multiple_unsortable_runtimewarning(self):
arr = [np.array(["foo", pd.Timestamp("2000")])]
with tm.assert_produces_warning(RuntimeWarning):
lib.fast_unique_multiple(arr, sort=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test this from an end user perspective?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

@WillAyd WillAyd added the Performance Memory or execution speed performance label Mar 26, 2020
@@ -286,7 +287,7 @@ def fast_unique_multiple(list arrays, sort: bool = True):
try:
uniques.sort()
except TypeError:
# TODO: RuntimeWarning?
warnings.warn("The values in the array are unorderable", RuntimeWarning)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you indicate somethig about what to do about this? (eg passing the sort=False keyword? I don't know if that is possible for all cases that may end up here)

@@ -332,12 +334,15 @@ def test_union_sort_other_empty_sort(slice_):
tm.assert_index_equal(result, expected)


@pytest.mark.xfail(reason="`warn` if statement is not working properly", strict=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this necesssary? non-strict xfails are among the patterns we want to avoid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary because I couldn't get the CI to pass otherwise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, lets troubleshoot that rather than xfail.

Why is there no expected warning on numpy-dev?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, I have tried in d2cd2d9 to have:

warn = None if _is_numpy_dev else RuntimeWarning

We can trouble shoot from there.

@ShaharNaveh
Copy link
Member Author

So apparently I never needed the _is_numpy_dev. *facepalm*

@jreback jreback added API - Consistency Internal Consistency of API/Behavior Error Reporting Incorrect or improved errors from pandas and removed Performance Memory or execution speed performance labels Mar 29, 2020
@jreback jreback added this to the 1.1 milestone Mar 29, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. can you add a release note, can put in Other Enhancements for 1.1

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping on green.

@@ -87,6 +87,7 @@ Other enhancements
- Positional slicing on a :class:`IntervalIndex` now supports slices with ``step > 1`` (:issue:`31658`)
- :class:`Series.str` now has a `fullmatch` method that matches a regular expression against the entire string in each row of the series, similar to `re.fullmatch` (:issue:`32806`).
- :meth:`DataFrame.sample` will now also allow array-like and BitGenerator objects to be passed to ``random_state`` as seeds (:issue:`32503`)
- :meth:`MultiIndex.union` will now raise `RuntimeWarning` if the object inside are unsortable, pass `sort=False` to suppress this warning.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number at the end (this PR number is fine if we don't have an issue)

@ShaharNaveh
Copy link
Member Author

ping @jreback

@jreback jreback merged commit 92478d5 into pandas-dev:master Apr 2, 2020
@jreback
Copy link
Contributor

jreback commented Apr 2, 2020

thanks @MomIsBestFriend

@ShaharNaveh ShaharNaveh deleted the TODO-runtimewarning-_libs-lib branch April 2, 2020 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API - Consistency Internal Consistency of API/Behavior Error Reporting Incorrect or improved errors from pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants