Commit aaa6b77

charlesdong1991 authored and saurav-chakravorty committed
BUG: pandas.cut should disallow overlapping IntervalIndex bins (#23999)
1 parent f161593 commit aaa6b77

3 files changed, +12 -2 lines changed

doc/source/whatsnew/v0.24.0.rst

+1

@@ -1530,6 +1530,7 @@ Reshaping
 - Bug in :func:`pandas.melt` when passing column names that are not present in ``DataFrame`` (:issue:`23575`)
 - Bug in :meth:`DataFrame.append` with a :class:`Series` with a dateutil timezone would raise a ``TypeError`` (:issue:`23682`)
 - Bug in ``Series`` construction when passing no data and ``dtype=str`` (:issue:`22477`)
+- Bug in :func:`cut` with ``bins`` as an overlapping ``IntervalIndex`` where multiple bins were returned per item instead of raising a ``ValueError`` (:issue:`23980`)
 
 .. _whatsnew_0240.bug_fixes.sparse:
 

pandas/core/reshape/tile.py

+5 -2

@@ -43,7 +43,8 @@ def cut(x, bins, right=True, labels=None, retbins=False, precision=3,
           and maximum values of `x`.
         * sequence of scalars : Defines the bin edges allowing for non-uniform
           width. No extension of the range of `x` is done.
-        * IntervalIndex : Defines the exact bins to be used.
+        * IntervalIndex : Defines the exact bins to be used. Note that
+          IntervalIndex for `bins` must be non-overlapping.
 
     right : bool, default True
         Indicates whether `bins` includes the rightmost edge or not. If
@@ -217,7 +218,9 @@ def cut(x, bins, right=True, labels=None, retbins=False, precision=3,
             bins[-1] += adj
 
     elif isinstance(bins, IntervalIndex):
-        pass
+        if bins.is_overlapping:
+            raise ValueError('Overlapping IntervalIndex is not accepted.')
+
     else:
         bins = np.asarray(bins)
         bins = _convert_bin_to_numeric_type(bins, dtype)
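
The new guard relies on `IntervalIndex.is_overlapping`. To illustrate what "overlapping" means for the default right-closed bins, here is a minimal pure-Python sketch. The helper `has_overlap` is hypothetical and is not how pandas implements the property; it assumes every interval is right-closed, `(left, right]`, with `left < right`.

```python
from typing import List, Tuple


def has_overlap(intervals: List[Tuple[float, float]]) -> bool:
    """Return True if any two right-closed intervals (left, right] share a point.

    Hypothetical helper for illustration only; assumes left < right.
    """
    furthest_right = None
    # Sorting by left edge means an overlap can only occur with an
    # interval that started earlier and has not ended yet.
    for left, right in sorted(intervals):
        if furthest_right is not None and left < furthest_right:
            return True
        if furthest_right is None or right > furthest_right:
            furthest_right = right
    return False


# The bins used in the new test overlap; adjacent bins that only
# touch at an edge do not, because the left edge is open.
overlapping_bins = [(0, 10), (2, 12), (4, 14)]
adjacent_bins = [(0, 5), (5, 10), (10, 15)]
```

Under this definition a value such as 5 falls into all three of the overlapping bins, which is the ambiguity the `ValueError` is meant to surface.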

pandas/tests/reshape/test_tile.py

+6

@@ -91,6 +91,12 @@ def test_bins_from_intervalindex(self):
         tm.assert_numpy_array_equal(result.codes,
                                     np.array([1, 1, 2], dtype='int8'))
 
+    def test_bins_not_overlapping_from_intervalindex(self):
+        # verify if issue 23980 is properly solved.
+        ii = IntervalIndex.from_tuples([(0, 10), (2, 12), (4, 14)])
+        with pytest.raises(ValueError):
+            cut([5, 6], bins=ii)
+
     def test_bins_not_monotonic(self):
         data = [.2, 1.4, 2.5, 6.2, 9.7, 2.1]
         pytest.raises(ValueError, cut, data, [0.1, 1.5, 1, 10])
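
From the caller's side, the change can be exercised roughly as follows. This is a sketch that assumes a pandas version that includes this commit (0.24 or later); the `disjoint` bins and the variable names are illustrative and do not come from the commit.

```python
import pandas as pd

# Same bins as in the new test: every pair of intervals overlaps.
overlapping = pd.IntervalIndex.from_tuples([(0, 10), (2, 12), (4, 14)])

# Illustrative alternative: adjacent right-closed bins that do not overlap.
disjoint = pd.IntervalIndex.from_tuples([(0, 5), (5, 10), (10, 15)])

error_message = None
try:
    pd.cut([5, 6], bins=overlapping)
except ValueError as exc:
    # With this commit, overlapping bins are rejected up front.
    error_message = str(exc)

# Non-overlapping bins still work: each value maps to exactly one bin.
result = pd.cut([5, 6], bins=disjoint)
codes = list(result.codes)
```

Here 5 lands in `(0, 5]` and 6 in `(5, 10]`, so each value gets a single, unambiguous bin code.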
