Skip to content

BUG: Fix interval constructor to handle left=right case #26894

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions doc/source/whatsnew/v0.25.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -617,13 +617,12 @@ Strings
- Improved error message when passing :class:`Series` of wrong dtype to :meth:`Series.str.cat` (:issue:`22722`)
-


Interval
^^^^^^^^

- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`)
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`)
-
- :class:Interval will complain when creating point interval with an inconsistent `closed` value (:issue:`26893`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a very friendly comment, can you re-word, put double backticks around ``closed=`. This is also not clear what you are actually fixing here. What is an inconsistent 'closed' value?

Copy link
Author

@ghost ghost Jun 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm talking about (0,0] or [0,0) when the same point is both open and closed, which makes no sense. Can you suggest a better wording? closed is a kwarg.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded with explicit cases


Indexing
^^^^^^^^
Expand Down
3 changes: 3 additions & 0 deletions pandas/_libs/interval.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -255,6 +255,9 @@ cdef class Interval(IntervalMixin):
raise ValueError(msg)
if not left <= right:
raise ValueError('left side of interval must be <= right side')
if left == right and closed not in ('both', 'neither'):
msg='both/neither sides must be closed when left == right'
raise ValueError(msg)
if (isinstance(left, Timestamp) and
not tz_compare(left.tzinfo, right.tzinfo)):
# GH 18538
Expand Down
11 changes: 8 additions & 3 deletions pandas/core/reshape/tile.py
Original file line number Diff line number Diff line change
Expand Up @@ -476,9 +476,14 @@ def _format_labels(bins, precision, right=True,
if right and include_lowest:
# we will adjust the left hand side by precision to
# account that we are all right closed
v = adjust(labels[0].left)

i = IntervalIndex([Interval(v, labels[0].right, closed='right')])
#
# We cannot access labels[0].left because that
# calls Interval(left, right, closed='right')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not clear what the issue is here, can you show a full example

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment explains it. labels[0] calls Interval(left, right, closed='right') which throws if left==right. But the object its constructed only to than extract its left and right values, which are available as .left[0] and .right[0]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

# and before adjusting we may have left == right
# so it will raise.
v = adjust(labels.left[0])

i = IntervalIndex([Interval(v, labels.right[0], closed='right')])
labels = i.append(labels[1:])

return labels
Expand Down
20 changes: 20 additions & 0 deletions pandas/tests/arrays/interval/test_interval.py
Original file line number Diff line number Diff line change
Expand Up @@ -65,3 +65,23 @@ def test_repr_matches():
a = repr(idx)
b = repr(idx.values)
assert a.replace("Index", "Array") == b


def test_point_interval():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make the error conditions as a separate parmeterized test

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

match = "both/neither sides must be closed when left == right"
for closed in ('left', 'right'):
with pytest.raises(ValueError, match=match):
pd.Interval(0, 0, closed)

pd.Interval(0, 0, 'neither') # no exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert the result value

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert x==x???

pd.Interval(0, 0, 'both') # no exception

assert not pd.Interval(0, 1, "left").overlaps(pd.Interval(0, 0, "neither"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would prefer this is a paramterized test as this is pretty hard to read this way

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

assert pd.Interval(0, 1, "left").overlaps(pd.Interval(0, 0, "both"))
assert not pd.Interval(0, 1, "right").overlaps(
pd.Interval(1, 1, "neither"))
assert pd.Interval(0, 1, "right").overlaps(pd.Interval(1, 1, "both"))
assert not pd.Interval(0, 1, "both").overlaps(pd.Interval(0, 0, "neither"))
assert not pd.Interval(0, 1, "both").overlaps(pd.Interval(0, 0, "neither"))
assert not pd.Interval(0, 1, "neither").overlaps(pd.Interval(1, 1, "both"))
assert not pd.Interval(0, 1, "neither").overlaps(pd.Interval(1, 1, "both"))
18 changes: 17 additions & 1 deletion pandas/tests/indexes/interval/test_interval.py
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ def test_properties(self, closed):
tm.assert_numpy_array_equal(np.asarray(index), expected)

@pytest.mark.parametrize('breaks', [
[1, 1, 2, 5, 15, 53, 217, 1014, 5335, 31240, 201608],
[1, 2, 5, 15, 53, 217, 1014, 5335, 31240, 201608],
[-np.inf, -100, -10, 0.5, 1, 1.5, 3.8, 101, 202, np.inf],
pd.to_datetime(['20170101', '20170202', '20170303', '20170404']),
pd.to_timedelta(['1ns', '2ms', '3s', '4M', '5H', '6D'])])
Expand All @@ -90,6 +90,22 @@ def test_length(self, closed, breaks):
expected = Index(iv.length if notna(iv) else iv for iv in index)
tm.assert_index_equal(result, expected)

@pytest.mark.parametrize('breaks', [
[1, 1, 2, 2]])
@pytest.mark.parametrize('closed', ['neither', 'both'])
def test_length_with_point_intervals(self, closed, breaks):
# GH 18789
index = IntervalIndex.from_breaks(breaks, closed=closed)
result = index.length
expected = Index(iv.length for iv in index)
tm.assert_index_equal(result, expected)

# with NA
index = index.insert(1, np.nan)
result = index.length
expected = Index(iv.length if notna(iv) else iv for iv in index)
tm.assert_index_equal(result, expected)

def test_with_nans(self, closed):
index = self.create_index(closed=closed)
assert index.hasnans is False
Expand Down
4 changes: 2 additions & 2 deletions pandas/tests/scalar/interval/test_interval.py
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,7 @@ def test_hash(self, interval):
(Timedelta('5S'), Timedelta('1H'), Timedelta('59M55S'))])
def test_length(self, left, right, expected):
# GH 18789
iv = Interval(left, right)
iv = Interval(left, right, 'both')
result = iv.length
assert result == expected

Expand All @@ -89,7 +89,7 @@ def test_length(self, left, right, expected):
@pytest.mark.parametrize('tz', (None, 'UTC', 'CET', 'US/Eastern'))
def test_length_timestamp(self, tz, left, right, expected):
# GH 18789
iv = Interval(Timestamp(left, tz=tz), Timestamp(right, tz=tz))
iv = Interval(Timestamp(left, tz=tz), Timestamp(right, tz=tz), 'both')
result = iv.length
expected = Timedelta(expected)
assert result == expected
Expand Down