Commit 63f6b4b

Make *_range functions consistent
1 parent fdbc6b8 commit 63f6b4b

10 files changed

+215
-54
lines changed


doc/source/whatsnew/v0.21.0.txt

+53
@@ -310,6 +310,59 @@ Previously, :func:`to_datetime` did not localize datetime ``Series`` data when `
 
 Additionally, DataFrames with datetime columns that were parsed by :func:`read_sql_table` and :func:`read_sql_query` will also be localized to UTC only if the original SQL columns were timezone aware datetime columns.
 
+.. _whatsnew_0210.api.consistency_of_range_functions:
+
+Consistency of Range Functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In previous versions, there were some inconsistencies between the various range functions: ``date_range``, ``bdate_range``, ``cdate_range``, ``interval_range``, ``period_range``, and ``timedelta_range`` (:issue:`17471`).
+
+One of the inconsistent behaviors occurred when the ``start``, ``end`` and ``periods`` parameters were all specified, potentially leading to ambiguous ranges. When all three parameters were passed, ``interval_range`` ignored the ``periods`` parameter, ``period_range`` ignored the ``end`` parameter, and the other range functions raised. To promote consistency among the range functions, and avoid potentially ambiguous ranges, ``interval_range`` and ``period_range`` will now raise when all three parameters are passed.
+
+Previous Behavior:
+
+.. code-block:: ipython
+
+   In [2]: pd.interval_range(start=0, end=4, periods=6)
+   Out[2]:
+   IntervalIndex([(0, 1], (1, 2], (2, 3]]
+                 closed='right',
+                 dtype='interval[int64]')
+
+   In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
+   Out[3]: PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1', '2018Q2'], dtype='period[Q-DEC]', freq='Q-DEC')
+
+New Behavior:
+
+.. code-block:: ipython
+
+   In [2]: pd.interval_range(start=0, end=4, periods=6)
+   ---------------------------------------------------------------------------
+   ValueError: Must specify exactly two of start, end, or periods
+
+   In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
+   ---------------------------------------------------------------------------
+   ValueError: Must specify exactly two of start, end, or periods
+
+Additionally, the endpoint parameter ``end`` was not included in the intervals produced by ``interval_range``. However, all other range functions include ``end`` in their output. To promote consistency among the range functions, ``interval_range`` will now include ``end`` as the right endpoint of the final interval, except if ``freq`` is specified in a way which skips ``end``.
+
+Previous Behavior:
+
+.. code-block:: ipython
+
+   In [4]: pd.interval_range(start=0, end=4)
+   Out[4]:
+   IntervalIndex([(0, 1], (1, 2], (2, 3]]
+                 closed='right',
+                 dtype='interval[int64]')
+
+
+New Behavior:
+
+.. ipython:: python
+
+   pd.interval_range(start=0, end=4)
+
 .. _whatsnew_0210.api:
 
 Other API Changes
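
The "exactly two" rule described in the whatsnew entry above can be sketched in plain Python. This is a minimal illustration and not the pandas implementation: `count_not_none` and `check_range_args` are hypothetical names standing in for the private `com._count_not_none` helper and the guard that this commit adds to the range functions.

```python
def count_not_none(*args):
    """Count the arguments that are not None (hypothetical stand-in helper)."""
    return sum(arg is not None for arg in args)


def check_range_args(start=None, end=None, periods=None):
    """Raise unless exactly two of start, end and periods are supplied."""
    if count_not_none(start, end, periods) != 2:
        raise ValueError('Must specify exactly two of start, end, or periods')


# Two of the three parameters: accepted.
check_range_args(start=0, end=4)
check_range_args(start=0, periods=6)

# All three parameters: rejected as ambiguous.
try:
    check_range_args(start=0, end=4, periods=6)
except ValueError as err:
    print(err)
```

Note that the count tests `is not None` rather than truthiness, so a legitimate value such as `start=0` still counts as specified.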

pandas/core/indexes/datetimes.py

+5-4
@@ -412,7 +412,8 @@ def __new__(cls, data=None,
     def _generate(cls, start, end, periods, name, offset,
                   tz=None, normalize=False, ambiguous='raise', closed=None):
         if com._count_not_none(start, end, periods) != 2:
-            raise ValueError('Must specify two of start, end, or periods')
+            msg = 'Must specify exactly two of start, end, or periods'
+            raise ValueError(msg)
 
         _normalized = True
 
@@ -2030,7 +2031,7 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
 
     Notes
     -----
-    2 of start, end, or periods must be specified
+    Exactly two of start, end, or periods must be specified
 
     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
@@ -2073,7 +2074,7 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
 
     Notes
     -----
-    2 of start, end, or periods must be specified
+    Exactly two of start, end, or periods must be specified
 
     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
@@ -2127,7 +2128,7 @@ def cdate_range(start=None, end=None, periods=None, freq='C', tz=None,
 
     Notes
     -----
-    2 of start, end, or periods must be specified
+    Exactly two of start, end, or periods must be specified
 
     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.

pandas/core/indexes/interval.py

+7-15
@@ -1039,43 +1039,35 @@ def interval_range(start=None, end=None, freq=None, periods=None,
         Left bound for generating data
     end : string or datetime-like, default None
         Right bound for generating data
-    freq : interger, string or DateOffset, default 1
-    periods : interger, default None
+    freq : integer, string or DateOffset, default 1
+    periods : integer, default None
     name : str, default None
         Name of the resulting index
     closed : string, default 'right'
         options are: 'left', 'right', 'both', 'neither'
 
     Notes
     -----
-    2 of start, end, or periods must be specified
+    Exactly two of start, end, or periods must be specified
 
     Returns
     -------
     rng : IntervalIndex
     """
+    if com._count_not_none(start, end, periods) != 2:
+        raise ValueError('Must specify exactly two of start, end, or periods')
 
     if freq is None:
         freq = 1
-
     if start is None:
-        if periods is None or end is None:
-            raise ValueError("must specify 2 of start, end, periods")
         start = end - periods * freq
     if end is None:
-        if periods is None or start is None:
-            raise ValueError("must specify 2 of start, end, periods")
         end = start + periods * freq
-    if periods is None:
-        if start is None or end is None:
-            raise ValueError("must specify 2 of start, end, periods")
-        pass
 
     # must all be same units or None
     arr = np.array([start, end, freq])
     if is_object_dtype(arr):
         raise ValueError("start, end, freq need to be the same type")
 
-    return IntervalIndex.from_breaks(np.arange(start, end, freq),
-                                     name=name,
-                                     closed=closed)
+    return IntervalIndex.from_breaks(np.arange(start, end + 1, freq),
+                                     name=name, closed=closed, **kwargs)
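
The break-point logic in the hunk above can be followed with a rough, integer-only sketch. It uses the built-in `range` in place of `np.arange` and plain tuples in place of `IntervalIndex`; `integer_breaks` and `to_intervals` are hypothetical helper names, and the real function additionally handles `name`, `closed` and the type check.

```python
def integer_breaks(start=None, end=None, periods=None, freq=1):
    """Integer-only sketch of the break points computed by interval_range."""
    if sum(x is not None for x in (start, end, periods)) != 2:
        raise ValueError('Must specify exactly two of start, end, or periods')
    # Infer whichever endpoint was left out from the other two values.
    if start is None:
        start = end - periods * freq
    if end is None:
        end = start + periods * freq
    # range() excludes its stop value, so end + 1 keeps `end` as a break.
    return list(range(start, end + 1, freq))


def to_intervals(breaks):
    """Pair consecutive breaks into (left, right) tuples."""
    return list(zip(breaks[:-1], breaks[1:]))


print(to_intervals(integer_breaks(start=0, end=4)))
# [(0, 1), (1, 2), (2, 3), (3, 4)]

# freq=2 steps over end=7, so the last interval stops at 6.
print(to_intervals(integer_breaks(start=0, end=7, freq=2)))
# [(0, 2), (2, 4), (4, 6)]
```

Under these assumptions, `start=0, periods=3, freq=2` and `end=6, periods=3, freq=2` both produce the same three intervals as `start=0, end=6, freq=2`, which matches the expectations in the updated tests.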

pandas/core/indexes/period.py

+9-3
@@ -1051,8 +1051,8 @@ def tz_localize(self, tz, infer_dst=False):
 
 
 def _get_ordinal_range(start, end, periods, freq, mult=1):
-    if com._count_not_none(start, end, periods) < 2:
-        raise ValueError('Must specify 2 of start, end, periods')
+    if com._count_not_none(start, end, periods) != 2:
+        raise ValueError('Must specify exactly two of start, end, or periods')
 
     if freq is not None:
         _, mult = _gfc(freq)
@@ -1160,7 +1160,6 @@ def period_range(start=None, end=None, periods=None, freq='D', name=None):
     Return a fixed frequency datetime index, with day (calendar) as the default
     frequency
 
-
     Parameters
     ----------
     start : starting value, period-like, optional
@@ -1172,6 +1171,13 @@ def period_range(start=None, end=None, periods=None, freq='D', name=None):
     name : str, default None
         Name for the resulting PeriodIndex
 
+    Notes
+    -----
+    Exactly two of start, end, or periods must be specified
+
+    To learn more about the frequency strings, please see `this link
+    <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
+
     Returns
     -------
     prng : PeriodIndex
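
The switch from `< 2` to `!= 2` in `_get_ordinal_range` changes the outcome for only one combination of arguments. The sketch below assumes nothing beyond the two comparisons shown in the hunk and enumerates every present/absent combination; `NAMES` and `accepted` are illustrative names, not part of pandas.

```python
from itertools import product

NAMES = ('start', 'end', 'periods')


def accepted(passes):
    """List the argument combinations whose not-None count satisfies `passes`."""
    combos = []
    for present in product([False, True], repeat=len(NAMES)):
        count = sum(present)
        if passes(count):
            combos.append(tuple(n for n, p in zip(NAMES, present) if p))
    return combos


old_ok = accepted(lambda count: count >= 2)  # old guard raised when count < 2
new_ok = accepted(lambda count: count == 2)  # new guard raises when count != 2

# The only combination the old guard let through that the new one rejects.
print(sorted(set(old_ok) - set(new_ok)))
# [('start', 'end', 'periods')]
```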

pandas/core/indexes/timedeltas.py

+3-2
@@ -234,7 +234,8 @@ def __new__(cls, data=None, unit=None,
     @classmethod
     def _generate(cls, start, end, periods, name, offset, closed=None):
         if com._count_not_none(start, end, periods) != 2:
-            raise ValueError('Must specify two of start, end, or periods')
+            msg = 'Must specify exactly two of start, end, or periods'
+            raise ValueError(msg)
 
         if start is not None:
             start = Timedelta(start)
@@ -985,7 +986,7 @@ def timedelta_range(start=None, end=None, periods=None, freq='D',
 
     Notes
     -----
-    2 of start, end, or periods must be specified.
+    Exactly two of start, end, or periods must be specified.
 
     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.

pandas/tests/indexes/datetimes/test_date_range.py

+20-8
@@ -107,8 +107,8 @@ def test_date_range_ambiguous_arguments(self):
         start = datetime(2011, 1, 1, 5, 3, 40)
         end = datetime(2011, 1, 1, 8, 9, 40)
 
-        pytest.raises(ValueError, date_range, start, end, freq='s',
-                      periods=10)
+        with pytest.raises(ValueError):
+            date_range(start, end, periods=10, freq='s')
 
     def test_date_range_businesshour(self):
         idx = DatetimeIndex(['2014-07-04 09:00', '2014-07-04 10:00',
@@ -146,14 +146,26 @@ def test_date_range_businesshour(self):
 
     def test_range_misspecified(self):
         # GH #1095
+        with pytest.raises(ValueError):
+            date_range(start='1/1/2000')
+
+        with pytest.raises(ValueError):
+            date_range(end='1/1/2000')
+
+        with pytest.raises(ValueError):
+            date_range(periods=10)
+
+        with pytest.raises(ValueError):
+            date_range(start='1/1/2000', freq='H')
+
+        with pytest.raises(ValueError):
+            date_range(end='1/1/2000', freq='H')
 
-        pytest.raises(ValueError, date_range, '1/1/2000')
-        pytest.raises(ValueError, date_range, end='1/1/2000')
-        pytest.raises(ValueError, date_range, periods=10)
+        with pytest.raises(ValueError):
+            date_range(periods=10, freq='H')
 
-        pytest.raises(ValueError, date_range, '1/1/2000', freq='H')
-        pytest.raises(ValueError, date_range, end='1/1/2000', freq='H')
-        pytest.raises(ValueError, date_range, periods=10, freq='H')
+        with pytest.raises(ValueError):
+            date_range()
 
     def test_compat_replace(self):
         # https://github.com/statsmodels/statsmodels/issues/3349
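
The test changes above replace the callable form of `pytest.raises` with the context-manager form. To keep the illustration free of third-party dependencies, the sketch below shows the same two calling styles with the standard library's `unittest.TestCase.assertRaises` instead of pytest; it is a stand-in for the pattern and not the code used in the pandas test suite.

```python
import unittest

tc = unittest.TestCase()

# Callable form: the function and its arguments are passed separately,
# which becomes hard to read once keyword arguments are involved.
tc.assertRaises(ValueError, int, 'not a number')

# Context-manager form: the call under test is written as ordinary code.
with tc.assertRaises(ValueError) as ctx:
    int('not a number')

# The context manager also exposes the exception that was raised.
print(type(ctx.exception).__name__)
# ValueError
```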

pandas/tests/indexes/period/test_construction.py

+1-1
@@ -440,7 +440,7 @@ def test_constructor_error(self):
         with tm.assert_raises_regex(ValueError, msg):
             PeriodIndex(start=start, end=end_intv)
 
-        msg = 'Must specify 2 of start, end, periods'
+        msg = 'Must specify exactly two of start, end, or periods'
         with tm.assert_raises_regex(ValueError, msg):
             PeriodIndex(start=start)

@@ -0,0 +1,52 @@
+import pytest
+import pandas.util.testing as tm
+from pandas import date_range, period_range, PeriodIndex
+
+
+class TestPeriodRange(object):
+
+    @pytest.mark.parametrize('freq', ['D', 'W', 'M', 'Q', 'A'])
+    def test_construction(self, freq):
+        # non-empty
+        expected = date_range(start='2017-01-01', periods=5,
+                              freq=freq, name='foo').to_period()
+        start, end = str(expected[0]), str(expected[-1])
+
+        result = period_range(start=start, end=end, freq=freq, name='foo')
+        tm.assert_index_equal(result, expected)
+
+        result = period_range(start=start, periods=5, freq=freq, name='foo')
+        tm.assert_index_equal(result, expected)
+
+        result = period_range(end=end, periods=5, freq=freq, name='foo')
+        tm.assert_index_equal(result, expected)
+
+        # empty
+        expected = PeriodIndex([], freq=freq, name='foo')
+
+        result = period_range(start=start, periods=0, freq=freq, name='foo')
+        tm.assert_index_equal(result, expected)
+
+        result = period_range(end=end, periods=0, freq=freq, name='foo')
+        tm.assert_index_equal(result, expected)
+
+        result = period_range(start=end, end=start, freq=freq, name='foo')
+        tm.assert_index_equal(result, expected)
+
+    def test_errors(self):
+        # not enough params
+        with pytest.raises(ValueError):
+            period_range(start='2017Q1')
+
+        with pytest.raises(ValueError):
+            period_range(end='2017Q1')
+
+        with pytest.raises(ValueError):
+            period_range(periods=5)
+
+        with pytest.raises(ValueError):
+            period_range()
+
+        # too many params
+        with pytest.raises(ValueError):
+            period_range(start='2017Q1', end='2018Q1', periods=8, freq='Q')

pandas/tests/indexes/test_interval.py

+46-20
@@ -721,40 +721,66 @@ def test_is_non_overlapping_monotonic(self):
 
 class TestIntervalRange(object):
 
-    def test_construction(self):
-        result = interval_range(0, 5, name='foo', closed='both')
+    @pytest.mark.parametrize('closed', ['left', 'right', 'neither', 'both'])
+    def test_construction(self, closed):
+        # combinations of start/end/periods without freq
         expected = IntervalIndex.from_breaks(
-            np.arange(0, 5), name='foo', closed='both')
+            np.arange(0, 6), name='foo', closed=closed)
+
+        result = interval_range(start=0, end=5, name='foo', closed=closed)
         tm.assert_index_equal(result, expected)
 
-    def test_errors(self):
+        result = interval_range(start=0, periods=5, name='foo', closed=closed)
+        tm.assert_index_equal(result, expected)
+
+        result = interval_range(end=5, periods=5, name='foo', closed=closed)
+        tm.assert_index_equal(result, expected)
+
+        # combinations of start/end/periods with freq
+        expected = IntervalIndex.from_tuples([(0, 2), (2, 4), (4, 6)],
+                                             name='foo', closed=closed)
+
+        result = interval_range(start=0, end=6, freq=2, name='foo',
+                                closed=closed)
+        tm.assert_index_equal(result, expected)
+
+        result = interval_range(start=0, periods=3, freq=2, name='foo',
+                                closed=closed)
+        tm.assert_index_equal(result, expected)
+
+        result = interval_range(end=6, periods=3, freq=2, name='foo',
+                                closed=closed)
+        tm.assert_index_equal(result, expected)
+
+        # output truncates early if freq causes end to be skipped.
+        result = interval_range(start=0, end=7, freq=2, name='foo',
+                                closed=closed)
+        tm.assert_index_equal(result, expected)
 
+    def test_errors(self):
         # not enough params
-        def f():
-            interval_range(0)
+        with pytest.raises(ValueError):
+            interval_range(start=0)
 
-        pytest.raises(ValueError, f)
+        with pytest.raises(ValueError):
+            interval_range(end=5)
 
-        def f():
+        with pytest.raises(ValueError):
             interval_range(periods=2)
 
-        pytest.raises(ValueError, f)
-
-        def f():
+        with pytest.raises(ValueError):
             interval_range()
 
-        pytest.raises(ValueError, f)
+        # too many params
+        with pytest.raises(ValueError):
+            interval_range(start=0, end=5, periods=6)
 
         # mixed units
-        def f():
-            interval_range(0, Timestamp('20130101'), freq=2)
-
-        pytest.raises(ValueError, f)
-
-        def f():
-            interval_range(0, 10, freq=Timedelta('1day'))
+        with pytest.raises(ValueError):
+            interval_range(start=0, end=Timestamp('20130101'), freq=2)
 
-        pytest.raises(ValueError, f)
+        with pytest.raises(ValueError):
+            interval_range(start=0, end=10, freq=Timedelta('1day'))
 
 
 class TestIntervalTree(object):
