Skip to content

BUG: numericlike set ops on unsupported Indexes #10042

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 4, 2015
Merged

Conversation

jreback
Copy link
Contributor

@jreback jreback commented May 1, 2015

closes #10038
closes #10039

  • CategoricalIndex now raise TypeError on ops like : idx - idx (we are moving all non-numeric Indexes to this, but just deprecated for now on existing index types)
  • Index ops with lists will now show the same warning (e.g. idx - ['a','b'])

@jreback jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves Categorical Categorical Data Type Deprecate Functionality to remove in pandas labels May 1, 2015
@jreback jreback added this to the 0.16.1 milestone May 1, 2015
return self.difference(other)
__rsub__ = __sub__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't right -- this makes other - index equal to index.difference(other), when it really should be other.difference(index). Maybe __rsub__ should just always raise TypeError? Or I guess it could do _ensure_index(other).difference(index).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I just added __rsub__ for compat, i'll take it out (yes they should prob just always raise though)

warnings.warn("using '+' to provide set union with Indexes is deprecated, "
"use '|' or .union()",FutureWarning)
if isinstance(other, Index):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not correct I think, as now a list will show the warning, but not perform a set (union) operation (as a list is not an index)?

Should it not be like:

if com.is_list_like(other):
    warnings.warn(...)
    return self.union(other)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I don't really see how this is fixed? It is still

if com.is_list_like(other):
    warnings.warn(...)
if isinstance(other, Index):
    return self.union(other)

So a list will trigger the warning, but not do the union set operation.

jreback added a commit that referenced this pull request May 4, 2015
BUG: numericlike set ops on unsupported Indexes
@jreback jreback merged commit 3d769c4 into pandas-dev:master May 4, 2015
@jorisvandenbossche
Copy link
Member

There is also an inconsistency between the behaviour of __add__ and __sub__ (I see now, this was already before this PR).

def __add__(self, other):
    if is_list_like(other):
        warn ...
        return self.union(other)
    return Index(np.array(self) + other)

while

def __sub__(self, other):
    warn ...
    return self.union(other)

Shouldn't this last one also not do the list_like check (so you don't do a set operation when subtracting a scalar)?

I can do a PR if this seems OK

@jreback
Copy link
Contributor Author

jreback commented May 4, 2015

I guess technically the warning is showing up when using __sub__, but then it immediately raises anyhow as you cannot use a non-iterable (while for __add__ you can).

This is concatenation, and IS convenient

In [1]: i = Index(['foo','bar','baz'])

In [2]: i + ' 2'
Out[2]: Index([u'foo 2', u'bar 2', u'baz 2'], dtype='object')

In [3]: i - ' 2'
pandas/core/index.py:1193: FutureWarning: using '-' to provide set differences with Indexes is deprecated, use .difference()
  "use .difference()",FutureWarning)
TypeError: Input must be iterable!

If you want to change __sub__ to not show the warning (even though it raises), ok by me

@jorisvandenbossche
Copy link
Member

yes, that one as well.
But the inline comment I made was about this:

In [1]: idx = pd.Index(['a', 'b', 'c', 'd'])

In [2]: idx + ['a', 'e']
pandas\core\index.py:1184: FutureWarning: using '+' to provide set union with In
dexes is deprecated, use '|' or .union()
  "use '|' or .union()",FutureWarning)
---------------------------------------------------------------------------
...
ValueError: operands could not be broadcast together with shapes (4,) (2,)

In [3]: idx + pd.Index(['a', 'e'])
Out[3]: Index([u'a', u'b', u'c', u'd', u'e'], dtype='object')

So for the list, it displays both a warning and an error, and those two contradict each other (and I think it is also more logical to have the same behaviour as with an Index)

@jreback
Copy link
Contributor Author

jreback commented May 4, 2015

@jorisvandenbossche ahh, ok, I see. I think the example I used broadcast correctly, so yes this needs cleaning up. be my guest!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Categorical Categorical Data Type Deprecate Functionality to remove in pandas Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
3 participants