BUG: Series.is_unique has extra output if contains objects with __ne__ defined #20691

Merged
merged 3 commits into from
Apr 15, 2018
Changes from 2 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -1175,3 +1175,4 @@ Other
 - Improved error message when attempting to use a Python keyword as an identifier in a ``numexpr`` backed query (:issue:`18221`)
 - Bug in accessing a :func:`pandas.get_option`, which raised ``KeyError`` rather than ``OptionError`` when looking up a non-existant option key in some cases (:issue:`19789`)
 - Bug in :func:`assert_series_equal` and :func:`assert_frame_equal` for Series or DataFrames with differing unicode data (:issue:`20503`)
+- Bug in ``Series.is_unique`` where extraneous output in stderr is shown if Series contains objects with ``__ne__`` defined (:issue:`20661`)
Contributor

Move this entry to the Indexing section.

2 changes: 1 addition & 1 deletion pandas/_libs/hashtable_class_helper.pxi.in
@@ -870,7 +870,7 @@ cdef class PyObjectHashTable(HashTable):
         for i in range(n):
             val = values[i]
             hash(val)
-            if not _checknan(val):
Contributor

I think this might be the only usage of `_checknan`. Can you remove it (from util.pxd) and see?

Contributor Author

It seemed to be. There was an import to remove in hashtable and the definition in util. With both removed, the tests pass on my laptop. I have pushed a new version.

+            if not checknull(val):
                 k = kh_get_pymap(self.table, <PyObject*>val)
                 if k == self.table.n_buckets:
                     kh_put_pymap(self.table, <PyObject*>val, &ret)
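
Why swapping the check matters can be sketched in plain Python. The two helper functions below are hypothetical stand-ins, not the Cython helpers from `util.pxd` or pandas' `checknull`. The sketch assumes the removed check relied on a self-inequality test (`val != val`), which this page does not show. Under that assumption, the old check calls the object's own `__ne__`, while a check based on identity and type never runs user-defined comparison code.

```python
import math


class Foo:
    """Mirrors the test class in this PR: any != comparison raises."""

    def __init__(self, val):
        self._value = val

    def __ne__(self, other):
        raise Exception("NEQ not supported")


def nan_check_by_self_inequality(val):
    # Hypothetical stand-in for a NaN test of the form `val != val`.
    # It dispatches to the object's own __ne__, so user code runs.
    return val != val


def null_check_by_type(val):
    # Hypothetical stand-in for a null test that inspects only identity
    # and type, so no user-defined comparison is ever called.
    return val is None or (isinstance(val, float) and math.isnan(val))


obj = Foo(1)

try:
    nan_check_by_self_inequality(obj)
    self_inequality_raised = False
except Exception:
    self_inequality_raised = True

type_check_result = null_check_by_type(obj)
```

In this sketch the exception propagates to the caller. In compiled code that cannot propagate it, the same exception would presumably be reported on stderr instead, which would match the symptom described in the issue title.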
16 changes: 16 additions & 0 deletions pandas/tests/series/test_analytics.py
@@ -1594,6 +1594,22 @@ def test_is_unique(self):
         s = Series(np.arange(1000))
         assert s.is_unique
 
+    def test_is_unique_class_ne(self, capsys):
+        # GH 20661
+        class Foo(object):
+            def __init__(self, val):
+                self._value = val
+
+            def __ne__(self, other):
+                raise Exception("NEQ not supported")
+
+        li = [Foo(i) for i in range(5)]
+        s = pd.Series(li, index=[i for i in range(5)])
+        _, err = capsys.readouterr()
+        s.is_unique
+        _, err = capsys.readouterr()
+        assert len(err) == 0
+
     def test_is_monotonic(self):
 
         s = Series(np.random.randint(0, 10, size=1000))
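
The new test asserts on stderr rather than on a return value, because the reported symptom was stray output, not a wrong result. The same idea can be sketched outside pytest with the standard library alone. `captured_stderr`, `quiet_operation`, and `noisy_operation` are hypothetical names used only for illustration, and the message text is made up; none of them come from pandas or from this PR.

```python
import contextlib
import io
import sys


def captured_stderr(func):
    """Run func and return whatever it wrote to sys.stderr."""
    buffer = io.StringIO()
    with contextlib.redirect_stderr(buffer):
        func()
    return buffer.getvalue()


def quiet_operation():
    # Hypothetical stand-in for a call that writes nothing to stderr.
    return True


def noisy_operation():
    # Hypothetical stand-in for a call that succeeds but leaves a
    # message on stderr, the kind of symptom this test guards against.
    print("Exception ignored: NEQ not supported", file=sys.stderr)
    return True


quiet_err = captured_stderr(quiet_operation)
noisy_err = captured_stderr(noisy_operation)
```

The test in the diff calls `capsys.readouterr()` once before `s.is_unique`, which discards anything captured earlier, so the final assertion covers only output produced by the property access itself.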