-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix interval constructor to handle left=right case #26894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #26894 +/- ##
==========================================
- Coverage 91.87% 91.86% -0.01%
==========================================
Files 180 180
Lines 50710 50710
==========================================
- Hits 46588 46584 -4
- Misses 4122 4126 +4
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #26894 +/- ##
==========================================
- Coverage 91.87% 91.86% -0.01%
==========================================
Files 180 180
Lines 50710 50712 +2
==========================================
- Hits 46588 46586 -2
- Misses 4122 4126 +4
Continue to review full report at Codecov.
|
Ready. |
doc/source/whatsnew/v0.25.0.rst
Outdated
Interval | ||
^^^^^^^^ | ||
|
||
- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`) | ||
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`) | ||
- | ||
- :class:Interval will complain when creating point interval with an inconsistent `closed` value (:issue:`26893`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a very friendly comment, can you re-word, put double backticks around ``closed=`. This is also not clear what you are actually fixing here. What is an inconsistent 'closed' value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking about (0,0]
or [0,0)
when the same point is both open and closed, which makes no sense. Can you suggest a better wording? closed
is a kwarg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded with explicit cases
@@ -65,3 +65,23 @@ def test_repr_matches(): | |||
a = repr(idx) | |||
b = repr(idx.values) | |||
assert a.replace("Index", "Array") == b | |||
|
|||
|
|||
def test_point_interval(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the error conditions as a separate parmeterized test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
with pytest.raises(ValueError, match=match): | ||
pd.Interval(0, 0, closed) | ||
|
||
pd.Interval(0, 0, 'neither') # no exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert the result value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert x==x
???
pd.Interval(0, 0, 'neither') # no exception | ||
pd.Interval(0, 0, 'both') # no exception | ||
|
||
assert not pd.Interval(0, 1, "left").overlaps(pd.Interval(0, 0, "neither")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would prefer this is a paramterized test as this is pretty hard to read this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/core/reshape/tile.py
Outdated
i = IntervalIndex([Interval(v, labels[0].right, closed='right')]) | ||
# | ||
# We cannot access labels[0].left because that | ||
# calls Interval(left, right, closed='right') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not clear what the issue is here, can you show a full example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment explains it. labels[0]
calls Interval(left, right, closed='right') which throws if left==right. But the object its constructed only to than extract its left and right values, which are available as .left[0]
and .right[0]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
Interval | ||
^^^^^^^^ | ||
|
||
- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`) | ||
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`) | ||
- | ||
- :class:Interval now refuses to create intervals like `[0,0)` or `(0,0]` (:issue:`26893`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:class:`Interval`
- This description is a bit too colloquial. Could you describe the conditions where the new
ValueError
is raised.
See #26893 (comment). We may not be progressing with this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 on this as detailed in #26893 (comment). If my -1 is overruled, then these changes would also need to be propagated to the construction of IntervalArray
and IntervalIndex
. Currently you can create and operate on arrays/indexes containing an empty interval, only to get an error message when the code actually tries to materialize a scalar interval:
In [1]: import pandas as pd; pd.__version__
Out[1]: '0.25.0.dev0+757.gb3d18ba887'
In [2]: ia = pd.arrays.IntervalArray.from_arrays([1,2], [1, 2])
In [3]: ia.mid
Out[3]: Float64Index([1.0, 2.0], dtype='float64')
In [4]: ia.length
Out[4]: Int64Index([0, 0], dtype='int64')
In [5]: ia[0]
---------------------------------------------------------------------------
ValueError: both/neither sides must be closed when left == right
@jschendel, I've seen your comment, lets keep the discussion all in one place in #26893. Absolutely, this should be fixed. I haven't looked at it yet, but it just might be possible that the exception is raised during specifically repr creation for formatting purposes, in which case this might be solved by silently converting to closed='neither' on creation. |
closed in favor of #26937 |
While writing the new PR, I came across |
git diff upstream/master -u -- "*.py" | flake8 --diff
Update:
[ ] - Following #26894 (review), needs tests about creating an
Intervalindex
with '(x,x]', also checking that the repr doesn't blowup.