Make *_range functions consistent #17482
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -218,7 +218,7 @@ Furthermore this will now correctly box the results of iteration for :func:`Data | |
.. ipython:: ipython | ||
|
||
d = {'a':[1], 'b':['b']} | ||
df = pd,DataFrame(d) | ||
df = pd.DataFrame(d) | ||
|
||
Previously: | ||
|
||
|
@@ -358,6 +358,59 @@ Previously, :func:`to_datetime` did not localize datetime ``Series`` data when ` | |
|
||
Additionally, DataFrames with datetime columns that were parsed by :func:`read_sql_table` and :func:`read_sql_query` will also be localized to UTC only if the original SQL columns were timezone aware datetime columns. | ||
|
||
.. _whatsnew_0210.api.consistency_of_range_functions: | ||
|
||
Consistency of Range Functions | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
In previous versions, there were some inconsistencies between the various range functions: func:`date_range`, func:`bdate_range`, func:`cdate_range`, func:`period_range`, func:`timedelta_range`, and func:`interval_range`. (:issue:`17471`). | ||
You are missing
(and also cdate_range here does not work)
Thanks, see #17554
||
|
||
One of the inconsistent behaviors occurred when the ``start``, ``end`` and ``periods`` parameters were all specified, potentially leading to ambiguous ranges. When all three parameters were passed, ``interval_range`` ignored the ``periods`` parameter, ``period_range`` ignored the ``end`` parameter, and the other range functions raised. To promote consistency among the range functions, and avoid potentially ambiguous ranges, ``interval_range`` and ``period_range`` will now raise when all three parameters are passed. | |
|
||
Previous Behavior: | ||
|
||
.. code-block:: ipython | ||
|
||
In [2]: pd.interval_range(start=0, end=4, periods=6) | ||
Out[2]: | ||
IntervalIndex([(0, 1], (1, 2], (2, 3]] | ||
closed='right', | ||
dtype='interval[int64]') | ||
|
||
In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q') | ||
Out[3]: PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1', '2018Q2'], dtype='period[Q-DEC]', freq='Q-DEC') | ||
|
||
New Behavior: | ||
|
||
.. code-block:: ipython | ||
|
||
In [2]: pd.interval_range(start=0, end=4, periods=6) | ||
--------------------------------------------------------------------------- | ||
ValueError: Of the three parameters: start, end, and periods, exactly two must be specified | ||
|
||
In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q') | ||
--------------------------------------------------------------------------- | ||
ValueError: Of the three parameters: start, end, and periods, exactly two must be specified | ||
|
||
Additionally, the endpoint parameter ``end`` was not included in the intervals produced by ``interval_range``. However, all other range functions include ``end`` in their output. To promote consistency among the range functions, ``interval_range`` will now include ``end`` as the right endpoint of the final interval, except if ``freq`` is specified in a way which skips ``end``. | ||
|
||
Previous Behavior: | ||
|
||
.. code-block:: ipython | ||
|
||
In [4]: pd.interval_range(start=0, end=4) | ||
Out[4]: | ||
IntervalIndex([(0, 1], (1, 2], (2, 3]] | ||
closed='right', | ||
dtype='interval[int64]') | ||
|
||
|
||
New Behavior: | ||
|
||
.. ipython:: python | ||
|
||
pd.interval_range(start=0, end=4) | ||
|
||
.. _whatsnew_0210.api: | ||
|
||
Other API Changes | ||
|
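The endpoint rule described in the whatsnew entry above (``end`` becomes the right endpoint of the final interval unless ``freq`` steps past it) can be illustrated with a small pure-Python sketch. The helper name `interval_breaks` and its integer-only `freq` argument are assumptions made for illustration; this is not pandas code and does not reproduce how `interval_range` is implemented.

```python
def interval_breaks(start, end, freq=1):
    """Return (left, right) pairs covering start..end in steps of freq.

    Hypothetical helper, not part of pandas: it only mirrors the
    endpoint rule described in the whatsnew text.
    """
    if freq <= 0:
        raise ValueError("freq must be positive")
    intervals = []
    left = start
    # Keep adding intervals while the right endpoint does not pass `end`,
    # so `end` itself is included whenever a step lands on it exactly.
    while left + freq <= end:
        intervals.append((left, left + freq))
        left += freq
    return intervals


print(interval_breaks(0, 4))          # [(0, 1), (1, 2), (2, 3), (3, 4)]
print(interval_breaks(0, 4, freq=3))  # [(0, 3)]
```

With the default step, the last pair ends at 4, matching the new behavior in which ``end`` is included. With a step of 3 the next right endpoint would be 6, so ``end`` is skipped, which corresponds to the exception the text mentions for ``freq``.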
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -292,8 +292,8 @@ def __new__(cls, data=None, | |
if is_float(periods): | ||
periods = int(periods) | ||
elif not is_integer(periods): | ||
raise ValueError('Periods must be a number, got %s' % | ||
str(periods)) | ||
msg = 'periods must be a number, got {periods}' | ||
raise TypeError(msg.format(periods=periods)) | ||
|
||
if data is None and freq is None: | ||
raise ValueError("Must provide freq argument if no data is " | ||
|
@@ -412,7 +412,8 @@ def __new__(cls, data=None, | |
def _generate(cls, start, end, periods, name, offset, | ||
tz=None, normalize=False, ambiguous='raise', closed=None): | ||
if com._count_not_none(start, end, periods) != 2: | ||
raise ValueError('Must specify two of start, end, or periods') | ||
raise ValueError('Of the three parameters: start, end, and ' | ||
'periods, exactly two must be specified') | ||
|
||
_normalized = True | ||
|
||
|
@@ -2004,7 +2005,7 @@ def _generate_regular_range(start, end, periods, offset): | |
def date_range(start=None, end=None, periods=None, freq='D', tz=None, | ||
normalize=False, name=None, closed=None, **kwargs): | ||
""" | ||
Return a fixed frequency datetime index, with day (calendar) as the default | ||
Return a fixed frequency DatetimeIndex, with day (calendar) as the default | ||
frequency | ||
Parameters | ||
|
@@ -2013,24 +2014,25 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None, | |
Left bound for generating dates | ||
end : string or datetime-like, default None | ||
Right bound for generating dates | ||
periods : integer or None, default None | ||
If None, must specify start and end | ||
periods : integer, default None | ||
Number of periods to generate | ||
freq : string or DateOffset, default 'D' (calendar daily) | ||
Frequency strings can have multiples, e.g. '5H' | ||
tz : string or None | ||
tz : string, default None | ||
Time zone name for returning localized DatetimeIndex, for example | ||
Asia/Hong_Kong | ||
normalize : bool, default False | ||
Normalize start/end dates to midnight before generating date range | ||
name : str, default None | ||
Name of the resulting index | ||
closed : string or None, default None | ||
name : string, default None | ||
Name of the resulting DatetimeIndex | ||
closed : string, default None | ||
Make the interval closed with respect to the given frequency to | ||
the 'left', 'right', or both sides (None) | ||
Notes | ||
----- | ||
2 of start, end, or periods must be specified | ||
Of the three parameters: ``start``, ``end``, and ``periods``, exactly two | ||
must be specified. | ||
To learn more about the frequency strings, please see `this link | ||
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__. | ||
|
@@ -2047,7 +2049,7 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None, | |
def bdate_range(start=None, end=None, periods=None, freq='B', tz=None, | ||
normalize=True, name=None, closed=None, **kwargs): | ||
""" | ||
Return a fixed frequency datetime index, with business day as the default | ||
Return a fixed frequency DatetimeIndex, with business day as the default | ||
frequency | ||
Parameters | ||
|
@@ -2056,24 +2058,25 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None, | |
Left bound for generating dates | ||
end : string or datetime-like, default None | ||
Right bound for generating dates | ||
periods : integer or None, default None | ||
If None, must specify start and end | ||
periods : integer, default None | ||
Number of periods to generate | ||
freq : string or DateOffset, default 'B' (business daily) | ||
Frequency strings can have multiples, e.g. '5H' | ||
tz : string or None | ||
Time zone name for returning localized DatetimeIndex, for example | ||
Asia/Beijing | ||
normalize : bool, default False | ||
Normalize start/end dates to midnight before generating date range | ||
name : str, default None | ||
Name for the resulting index | ||
closed : string or None, default None | ||
name : string, default None | ||
Name of the resulting DatetimeIndex | ||
closed : string, default None | ||
Make the interval closed with respect to the given frequency to | ||
the 'left', 'right', or both sides (None) | ||
Notes | ||
----- | ||
2 of start, end, or periods must be specified | ||
Of the three parameters: ``start``, ``end``, and ``periods``, exactly two | ||
must be specified. | ||
To learn more about the frequency strings, please see `this link | ||
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__. | ||
|
@@ -2091,7 +2094,7 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None, | |
def cdate_range(start=None, end=None, periods=None, freq='C', tz=None, | ||
normalize=True, name=None, closed=None, **kwargs): | ||
""" | ||
**EXPERIMENTAL** Return a fixed frequency datetime index, with | ||
**EXPERIMENTAL** Return a fixed frequency DatetimeIndex, with | ||
side issue we can prob remove the experimental (but new PR for that)
Did this in #17554, as there were other doc related fixes that needed to be made.
||
CustomBusinessDay as the default frequency | ||
.. warning:: EXPERIMENTAL | ||
|
@@ -2105,29 +2108,30 @@ def cdate_range(start=None, end=None, periods=None, freq='C', tz=None, | |
Left bound for generating dates | ||
end : string or datetime-like, default None | ||
Right bound for generating dates | ||
periods : integer or None, default None | ||
If None, must specify start and end | ||
periods : integer, default None | ||
Number of periods to generate | ||
freq : string or DateOffset, default 'C' (CustomBusinessDay) | ||
Frequency strings can have multiples, e.g. '5H' | ||
tz : string or None | ||
tz : string, default None | ||
Time zone name for returning localized DatetimeIndex, for example | ||
Asia/Beijing | ||
normalize : bool, default False | ||
Normalize start/end dates to midnight before generating date range | ||
name : str, default None | ||
Name for the resulting index | ||
weekmask : str, Default 'Mon Tue Wed Thu Fri' | ||
name : string, default None | ||
Name of the resulting DatetimeIndex | ||
weekmask : string, Default 'Mon Tue Wed Thu Fri' | ||
weekmask of valid business days, passed to ``numpy.busdaycalendar`` | ||
holidays : list | ||
list/array of dates to exclude from the set of valid business days, | ||
passed to ``numpy.busdaycalendar`` | ||
closed : string or None, default None | ||
closed : string, default None | ||
Make the interval closed with respect to the given frequency to | ||
the 'left', 'right', or both sides (None) | ||
Notes | ||
----- | ||
2 of start, end, or periods must be specified | ||
Of the three parameters: ``start``, ``end``, and ``periods``, exactly two | ||
must be specified. | ||
To learn more about the frequency strings, please see `this link | ||
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__. | ||
|
this one is not in the public API, so adding it here does not work
Thanks, see #17554
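For readers skimming the diff, the two validation changes in the second file (a `TypeError` for a non-numeric ``periods``, and the reworded `ValueError` when the count of supplied arguments is not two) can be sketched as a standalone function. The name `check_range_args` is hypothetical, the use of `numbers.Real` in place of the `is_float` / `is_integer` helpers is an assumption, and rejecting booleans is a choice made for this sketch only. The error messages are copied from the diff.

```python
import numbers


def check_range_args(start=None, end=None, periods=None):
    """Hypothetical stand-in for the checks shown in the diff."""
    if periods is not None:
        # Assumption: any real number is accepted and truncated to int;
        # booleans are rejected in this sketch.
        if isinstance(periods, bool) or not isinstance(periods, numbers.Real):
            msg = 'periods must be a number, got {periods}'
            raise TypeError(msg.format(periods=periods))
        periods = int(periods)

    # Count how many of the three arguments were actually supplied.
    given = sum(arg is not None for arg in (start, end, periods))
    if given != 2:
        raise ValueError('Of the three parameters: start, end, and '
                         'periods, exactly two must be specified')
    return start, end, periods


print(check_range_args(start=0, periods=3.0))  # (0, None, 3)
```

Passing all three arguments, or only one, raises the `ValueError`; passing something like ``periods='foo'`` raises the `TypeError` before the count is checked, which follows the order in which the two checks appear in the diff.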