Skip to content

BUG: Fix interval constructor to handle left=right case #26894

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed

BUG: Fix interval constructor to handle left=right case #26894

wants to merge 9 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jun 16, 2019

Update:

[ ] - Following #26894 (review), needs tests about creating an Intervalindex with '(x,x]', also checking that the repr doesn't blowup.

@pep8speaks
Copy link

pep8speaks commented Jun 16, 2019

Hello @pilkibun! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-06-17 14:17:31 UTC

@codecov
Copy link

codecov bot commented Jun 17, 2019

Codecov Report

Merging #26894 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26894      +/-   ##
==========================================
- Coverage   91.87%   91.86%   -0.01%     
==========================================
  Files         180      180              
  Lines       50710    50710              
==========================================
- Hits        46588    46584       -4     
- Misses       4122     4126       +4
Flag Coverage Δ
#multiple 90.45% <100%> (ø) ⬆️
#single 41.12% <0%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/core/reshape/tile.py 97.67% <100%> (ø) ⬆️
pandas/io/gbq.py 88.88% <0%> (-11.12%) ⬇️
pandas/core/frame.py 96.88% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update adc6564...e4e8632. Read the comment docs.

@codecov
Copy link

codecov bot commented Jun 17, 2019

Codecov Report

Merging #26894 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26894      +/-   ##
==========================================
- Coverage   91.87%   91.86%   -0.01%     
==========================================
  Files         180      180              
  Lines       50710    50712       +2     
==========================================
- Hits        46588    46586       -2     
- Misses       4122     4126       +4
Flag Coverage Δ
#multiple 90.45% <100%> (ø) ⬆️
#single 41.09% <0%> (-0.09%) ⬇️
Impacted Files Coverage Δ
pandas/core/reshape/tile.py 97.67% <100%> (ø) ⬆️
pandas/io/gbq.py 88.88% <0%> (-11.12%) ⬇️
pandas/core/frame.py 96.88% <0%> (-0.12%) ⬇️
pandas/core/dtypes/base.py 100% <0%> (ø) ⬆️
pandas/core/indexes/base.py 96.71% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update adc6564...b3d18ba. Read the comment docs.

@ghost
Copy link
Author

ghost commented Jun 17, 2019

Ready.

Interval
^^^^^^^^

- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`)
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`)
-
- :class:Interval will complain when creating point interval with an inconsistent `closed` value (:issue:`26893`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a very friendly comment, can you re-word, put double backticks around ``closed=`. This is also not clear what you are actually fixing here. What is an inconsistent 'closed' value?

Copy link
Author

@ghost ghost Jun 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm talking about (0,0] or [0,0) when the same point is both open and closed, which makes no sense. Can you suggest a better wording? closed is a kwarg.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded with explicit cases

@@ -65,3 +65,23 @@ def test_repr_matches():
a = repr(idx)
b = repr(idx.values)
assert a.replace("Index", "Array") == b


def test_point_interval():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make the error conditions as a separate parmeterized test

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

with pytest.raises(ValueError, match=match):
pd.Interval(0, 0, closed)

pd.Interval(0, 0, 'neither') # no exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert the result value

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert x==x???

pd.Interval(0, 0, 'neither') # no exception
pd.Interval(0, 0, 'both') # no exception

assert not pd.Interval(0, 1, "left").overlaps(pd.Interval(0, 0, "neither"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would prefer this is a paramterized test as this is pretty hard to read this way

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jreback jreback requested a review from jschendel June 17, 2019 11:30
@jreback jreback added the Interval Interval data type label Jun 17, 2019
i = IntervalIndex([Interval(v, labels[0].right, closed='right')])
#
# We cannot access labels[0].left because that
# calls Interval(left, right, closed='right')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not clear what the issue is here, can you show a full example

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment explains it. labels[0] calls Interval(left, right, closed='right') which throws if left==right. But the object its constructed only to than extract its left and right values, which are available as .left[0] and .right[0]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

Interval
^^^^^^^^

- Construction of :class:`Interval` is restricted to numeric, :class:`Timestamp` and :class:`Timedelta` endpoints (:issue:`23013`)
- Fixed bug in :class:`Series`/:class:`DataFrame` not displaying ``NaN`` in :class:`IntervalIndex` with missing values (:issue:`25984`)
-
- :class:Interval now refuses to create intervals like `[0,0)` or `(0,0]` (:issue:`26893`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • :class:`Interval`
  • This description is a bit too colloquial. Could you describe the conditions where the new ValueError is raised.

@TomAugspurger
Copy link
Contributor

See #26893 (comment). We may not be progressing with this.

Copy link
Member

@jschendel jschendel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm -1 on this as detailed in #26893 (comment). If my -1 is overruled, then these changes would also need to be propagated to the construction of IntervalArray and IntervalIndex. Currently you can create and operate on arrays/indexes containing an empty interval, only to get an error message when the code actually tries to materialize a scalar interval:

In [1]: import pandas as pd; pd.__version__
Out[1]: '0.25.0.dev0+757.gb3d18ba887'

In [2]: ia = pd.arrays.IntervalArray.from_arrays([1,2], [1, 2])

In [3]: ia.mid
Out[3]: Float64Index([1.0, 2.0], dtype='float64')

In [4]: ia.length
Out[4]: Int64Index([0, 0], dtype='int64')

In [5]: ia[0]
---------------------------------------------------------------------------
ValueError: both/neither sides must be closed when left == right

@ghost
Copy link
Author

ghost commented Jun 18, 2019

@jschendel, I've seen your comment, lets keep the discussion all in one place in #26893.

Absolutely, this should be fixed. I haven't looked at it yet, but it just might be possible that the exception is raised during specifically repr creation for formatting purposes, in which case this might be solved by silently converting to closed='neither' on creation.

@ghost
Copy link
Author

ghost commented Jun 19, 2019

closed in favor of #26937

@ghost ghost closed this Jun 19, 2019
@ghost ghost deleted the fix_point_interval branch June 19, 2019 03:40
@ghost
Copy link
Author

ghost commented Jun 20, 2019

While writing the new PR, I came across IntervalTree for the first time. It probably relies on intervals being orderable, right? something to keep in mind if ever reconsidering the issue of a singleton.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Interval Interval data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

interval.overlaps() mishandles empty intervals
5 participants