Make *_range functions consistent #17482

Merged
merged 7 commits into from
Sep 14, 2017
9 changes: 9 additions & 0 deletions doc/source/api.rst
@@ -218,10 +218,19 @@ Top-level dealing with datetimelike
to_timedelta
date_range
bdate_range
cdate_range
Review comment (Member):
this one is not in the public API, so adding it here does not work

Review comment (Member Author):
Thanks, see #17554

period_range
timedelta_range
infer_freq

Top-level dealing with intervals
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autosummary::
:toctree: generated/

interval_range

Top-level evaluation
~~~~~~~~~~~~~~~~~~~~

9 changes: 9 additions & 0 deletions doc/source/timeseries.rst
@@ -1705,6 +1705,15 @@ has multiplied span.
   pd.PeriodIndex(start='2014-01', freq='3M', periods=4)

If ``start`` or ``end`` are ``Period`` objects, they will be used as anchor
endpoints for a ``PeriodIndex`` with frequency matching that of the
``PeriodIndex`` constructor.

.. ipython:: python

   pd.PeriodIndex(start=pd.Period('2017Q1', freq='Q'),
                  end=pd.Period('2017Q2', freq='Q'), freq='M')

Just like ``DatetimeIndex``, a ``PeriodIndex`` can also be used to index pandas
objects:

55 changes: 54 additions & 1 deletion doc/source/whatsnew/v0.21.0.txt
@@ -218,7 +218,7 @@ Furthermore this will now correctly box the results of iteration for :func:`Data
.. ipython:: ipython

    d = {'a':[1], 'b':['b']}
-   df = pd,DataFrame(d)
+   df = pd.DataFrame(d)

Previously:

@@ -358,6 +358,59 @@ Previously, :func:`to_datetime` did not localize datetime ``Series`` data when `

Additionally, DataFrames with datetime columns that were parsed by :func:`read_sql_table` and :func:`read_sql_query` will also be localized to UTC only if the original SQL columns were timezone aware datetime columns.

.. _whatsnew_0210.api.consistency_of_range_functions:

Consistency of Range Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In previous versions, there were some inconsistencies between the various range functions: func:`date_range`, func:`bdate_range`, func:`cdate_range`, func:`period_range`, func:`timedelta_range`, and func:`interval_range`. (:issue:`17471`).
Review comment (Member):
You are missing :'s for all references: :func:.. instead of func:..

Review comment (Member):
(and also cdate_range here does not work)

Review comment (Member Author):
Thanks, see #17554


One of the inconsistent behaviors occurred when the ``start``, ``end`` and ``periods`` parameters were all specified, potentially leading to ambiguous ranges. When all three parameters were passed, ``interval_range`` ignored the ``periods`` parameter, ``period_range`` ignored the ``end`` parameter, and the other range functions raised. To promote consistency among the range functions, and avoid potentially ambiguous ranges, ``interval_range`` and ``period_range`` will now raise when all three parameters are passed.

Previous Behavior:

.. code-block:: ipython

   In [2]: pd.interval_range(start=0, end=4, periods=6)
   Out[2]:
   IntervalIndex([(0, 1], (1, 2], (2, 3]]
                 closed='right',
                 dtype='interval[int64]')

   In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
   Out[3]: PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1', '2018Q2'], dtype='period[Q-DEC]', freq='Q-DEC')

New Behavior:

.. code-block:: ipython

   In [2]: pd.interval_range(start=0, end=4, periods=6)
   ---------------------------------------------------------------------------
   ValueError: Of the three parameters: start, end, and periods, exactly two must be specified

   In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
   ---------------------------------------------------------------------------
   ValueError: Of the three parameters: start, end, and periods, exactly two must be specified

Additionally, the endpoint parameter ``end`` was not included in the intervals produced by ``interval_range``. However, all other range functions include ``end`` in their output. To promote consistency among the range functions, ``interval_range`` will now include ``end`` as the right endpoint of the final interval, except if ``freq`` is specified in a way which skips ``end``.

Previous Behavior:

.. code-block:: ipython

   In [4]: pd.interval_range(start=0, end=4)
   Out[4]:
   IntervalIndex([(0, 1], (1, 2], (2, 3]]
                 closed='right',
                 dtype='interval[int64]')


New Behavior:

.. ipython:: python

   pd.interval_range(start=0, end=4)
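
The endpoint rule described above can be illustrated with a minimal pure-Python sketch. The helper ``sketch_interval_range`` is hypothetical and is not how pandas implements ``interval_range``; it assumes integer ``start`` and ``end`` and a positive integer ``freq``, and only shows when ``end`` ends up as the right endpoint of the final interval.

```python
def sketch_interval_range(start, end, freq=1):
    """Return right-closed intervals as (left, right) tuples.

    Hypothetical illustration only; not the pandas implementation.
    Assumes integer start/end and a positive integer freq.
    """
    if freq <= 0:
        raise ValueError("freq must be a positive integer")
    # Breakpoints run from start up to and including end, in steps of freq.
    breaks = list(range(start, end + 1, freq))
    # Pair consecutive breakpoints into (left, right] intervals.
    return list(zip(breaks[:-1], breaks[1:]))


# end=4 is the right endpoint of the final interval.
print(sketch_interval_range(0, 4))          # [(0, 1), (1, 2), (2, 3), (3, 4)]
# freq=3 steps over end=4, so the only interval stops at 3.
print(sketch_interval_range(0, 4, freq=3))  # [(0, 3)]
```

The second call shows the exception mentioned in the text: a ``freq`` that does not land on ``end`` leaves ``end`` out of the result.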

.. _whatsnew_0210.api:

Other API Changes
58 changes: 31 additions & 27 deletions pandas/core/indexes/datetimes.py
@@ -292,8 +292,8 @@ def __new__(cls, data=None,
             if is_float(periods):
                 periods = int(periods)
             elif not is_integer(periods):
-                raise ValueError('Periods must be a number, got %s' %
-                                 str(periods))
+                msg = 'periods must be a number, got {periods}'
+                raise TypeError(msg.format(periods=periods))

         if data is None and freq is None:
             raise ValueError("Must provide freq argument if no data is "
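
The ``periods`` check changed in this hunk can be sketched on its own as follows. This is an approximation under stated assumptions: ``sketch_validate_periods`` is a hypothetical name, pandas' ``is_float`` and ``is_integer`` helpers are stood in for by ``isinstance`` checks, and the handling of ``None`` and ``bool`` is assumed rather than taken from the diff.

```python
def sketch_validate_periods(periods):
    """Coerce floats to int and reject other non-integer values.

    Hypothetical stand-in for the check in the hunk; pandas uses its own
    is_float / is_integer helpers, approximated here with isinstance.
    """
    if periods is None:
        # Assumption: None means "not specified" and is passed through.
        return None
    if isinstance(periods, float):
        return int(periods)
    # bool is a subclass of int; this sketch assumes it should be rejected.
    if isinstance(periods, int) and not isinstance(periods, bool):
        return periods
    msg = 'periods must be a number, got {periods}'
    raise TypeError(msg.format(periods=periods))


print(sketch_validate_periods(3.0))  # 3
try:
    sketch_validate_periods('foo')
except TypeError as err:
    print(err)  # periods must be a number, got foo
```

The switch from ``ValueError`` to ``TypeError`` reflects that the problem is the argument's type rather than its value.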
@@ -412,7 +412,8 @@ def __new__(cls, data=None,
     def _generate(cls, start, end, periods, name, offset,
                   tz=None, normalize=False, ambiguous='raise', closed=None):
         if com._count_not_none(start, end, periods) != 2:
-            raise ValueError('Must specify two of start, end, or periods')
+            raise ValueError('Of the three parameters: start, end, and '
+                             'periods, exactly two must be specified')

         _normalized = True
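
The "exactly two of three" rule enforced in this hunk can be sketched without pandas. The names ``count_not_none`` and ``sketch_check_range_args`` are hypothetical; the first is only assumed to behave like the private ``com._count_not_none`` helper, whose body is not shown in the diff.

```python
def count_not_none(*args):
    """Count the arguments that are not None (assumed behavior)."""
    return sum(arg is not None for arg in args)


def sketch_check_range_args(start=None, end=None, periods=None):
    """Raise unless exactly two of start, end, and periods are given."""
    if count_not_none(start, end, periods) != 2:
        raise ValueError('Of the three parameters: start, end, and '
                         'periods, exactly two must be specified')


sketch_check_range_args(start=0, end=4)  # accepted: two of three given
try:
    sketch_check_range_args(start=0, end=4, periods=6)
except ValueError as err:
    print(err)
```

Rejecting the three-argument call avoids having to decide which parameter wins when they disagree, which is the ambiguity the whatsnew entry describes.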

@@ -2004,7 +2005,7 @@ def _generate_regular_range(start, end, periods, offset):
 def date_range(start=None, end=None, periods=None, freq='D', tz=None,
                normalize=False, name=None, closed=None, **kwargs):
     """
-    Return a fixed frequency datetime index, with day (calendar) as the default
+    Return a fixed frequency DatetimeIndex, with day (calendar) as the default
     frequency

     Parameters
@@ -2013,24 +2014,25 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
         Left bound for generating dates
     end : string or datetime-like, default None
         Right bound for generating dates
-    periods : integer or None, default None
-        If None, must specify start and end
+    periods : integer, default None
+        Number of periods to generate
     freq : string or DateOffset, default 'D' (calendar daily)
         Frequency strings can have multiples, e.g. '5H'
-    tz : string or None
+    tz : string, default None
         Time zone name for returning localized DatetimeIndex, for example
         Asia/Hong_Kong
     normalize : bool, default False
         Normalize start/end dates to midnight before generating date range
-    name : str, default None
-        Name of the resulting index
-    closed : string or None, default None
+    name : string, default None
+        Name of the resulting DatetimeIndex
+    closed : string, default None
         Make the interval closed with respect to the given frequency to
         the 'left', 'right', or both sides (None)

     Notes
     -----
-    2 of start, end, or periods must be specified
+    Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
+    must be specified.

     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
@@ -2047,7 +2049,7 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
 def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
                 normalize=True, name=None, closed=None, **kwargs):
     """
-    Return a fixed frequency datetime index, with business day as the default
+    Return a fixed frequency DatetimeIndex, with business day as the default
     frequency

     Parameters
@@ -2056,24 +2058,25 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
         Left bound for generating dates
     end : string or datetime-like, default None
         Right bound for generating dates
-    periods : integer or None, default None
-        If None, must specify start and end
+    periods : integer, default None
+        Number of periods to generate
     freq : string or DateOffset, default 'B' (business daily)
         Frequency strings can have multiples, e.g. '5H'
     tz : string or None
         Time zone name for returning localized DatetimeIndex, for example
         Asia/Beijing
     normalize : bool, default False
         Normalize start/end dates to midnight before generating date range
-    name : str, default None
-        Name for the resulting index
-    closed : string or None, default None
+    name : string, default None
+        Name of the resulting DatetimeIndex
+    closed : string, default None
         Make the interval closed with respect to the given frequency to
         the 'left', 'right', or both sides (None)

     Notes
     -----
-    2 of start, end, or periods must be specified
+    Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
+    must be specified.

     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
@@ -2091,7 +2094,7 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
 def cdate_range(start=None, end=None, periods=None, freq='C', tz=None,
                 normalize=True, name=None, closed=None, **kwargs):
     """
-    **EXPERIMENTAL** Return a fixed frequency datetime index, with
+    **EXPERIMENTAL** Return a fixed frequency DatetimeIndex, with
Review comment (Contributor):
side issue we can prob remove the experimental (but new PR for that)

Review comment (Member Author):
Did this in #17554, as there were other doc related fixes that needed to be made.

     CustomBusinessDay as the default frequency

     .. warning:: EXPERIMENTAL
@@ -2105,29 +2108,30 @@ def cdate_range(start=None, end=None, periods=None, freq='C', tz=None,
         Left bound for generating dates
     end : string or datetime-like, default None
         Right bound for generating dates
-    periods : integer or None, default None
-        If None, must specify start and end
+    periods : integer, default None
+        Number of periods to generate
     freq : string or DateOffset, default 'C' (CustomBusinessDay)
         Frequency strings can have multiples, e.g. '5H'
-    tz : string or None
+    tz : string, default None
         Time zone name for returning localized DatetimeIndex, for example
         Asia/Beijing
     normalize : bool, default False
         Normalize start/end dates to midnight before generating date range
-    name : str, default None
-        Name for the resulting index
-    weekmask : str, Default 'Mon Tue Wed Thu Fri'
+    name : string, default None
+        Name of the resulting DatetimeIndex
+    weekmask : string, Default 'Mon Tue Wed Thu Fri'
         weekmask of valid business days, passed to ``numpy.busdaycalendar``
     holidays : list
         list/array of dates to exclude from the set of valid business days,
         passed to ``numpy.busdaycalendar``
-    closed : string or None, default None
+    closed : string, default None
         Make the interval closed with respect to the given frequency to
         the 'left', 'right', or both sides (None)

     Notes
     -----
-    2 of start, end, or periods must be specified
+    Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
+    must be specified.

     To learn more about the frequency strings, please see `this link
     <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
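
The ``closed`` parameter documented in the docstrings above ('left', 'right', or both sides for None) can be pictured with a small sketch. The helper ``sketch_apply_closed`` is hypothetical and works on plain Python values rather than timestamps; it only illustrates the endpoint filtering the docstring describes, not how pandas implements it.

```python
def sketch_apply_closed(points, start, end, closed=None):
    """Drop the start and/or end point according to ``closed``.

    Hypothetical illustration: None keeps both endpoints, 'left' keeps
    only start, and 'right' keeps only end.
    """
    if closed not in (None, 'left', 'right'):
        raise ValueError("closed must be None, 'left', or 'right'")
    keep_start = closed in (None, 'left')
    keep_end = closed in (None, 'right')
    kept = []
    for point in points:
        if point == start and not keep_start:
            continue
        if point == end and not keep_end:
            continue
        kept.append(point)
    return kept


days = [1, 2, 3, 4]
print(sketch_apply_closed(days, 1, 4))           # [1, 2, 3, 4]
print(sketch_apply_closed(days, 1, 4, 'left'))   # [1, 2, 3]
print(sketch_apply_closed(days, 1, 4, 'right'))  # [2, 3, 4]
```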