Commit bb7cfef

Merge branch 'master' of github.com:pandas-dev/pandas into feature/pivot_table_groupby_observed
2 parents a3bcf1a + 947bd76 commit bb7cfef

23 files changed: +134 -113 lines changed
asv_bench/benchmarks/index_object.py

Lines changed: 15 additions & 1 deletion
@@ -1,7 +1,7 @@
 import numpy as np
 import pandas.util.testing as tm
 from pandas import (Series, date_range, DatetimeIndex, Index, RangeIndex,
-                    Float64Index)
+                    Float64Index, IntervalIndex)
 
 
 class SetOperations:
@@ -181,4 +181,18 @@ def time_get_loc(self):
         self.ind.get_loc(0)
 
 
+class IntervalIndexMethod(object):
+    # GH 24813
+    params = [10**3, 10**5]
+
+    def setup(self, N):
+        left = np.append(np.arange(N), np.array(0))
+        right = np.append(np.arange(1, N + 1), np.array(1))
+        self.intv = IntervalIndex.from_arrays(left, right)
+        self.intv._engine
+
+    def time_monotonic_inc(self, N):
+        self.intv.is_monotonic_increasing
+
+
 from .pandas_vb_common import setup # noqa: F401

doc/source/reference/groupby.rst

Lines changed: 4 additions & 0 deletions
@@ -50,6 +50,10 @@ Computations / Descriptive Stats
    GroupBy.bfill
    GroupBy.count
    GroupBy.cumcount
+   GroupBy.cummax
+   GroupBy.cummin
+   GroupBy.cumprod
+   GroupBy.cumsum
    GroupBy.ffill
    GroupBy.first
    GroupBy.head

doc/source/whatsnew/v0.25.0.rst

Lines changed: 30 additions & 35 deletions
@@ -24,7 +24,7 @@ including other versions of pandas.
 Other Enhancements
 ^^^^^^^^^^^^^^^^^^
 - :func:`DataFrame.plot` keywords ``logy``, ``logx`` and ``loglog`` can now accept the value ``'sym'`` for symlog scaling. (:issue:`24867`)
-- Added support for ISO week year format ('%G-%V-%u') when parsing datetimes using :meth: `to_datetime` (:issue:`16607`)
+- Added support for ISO week year format ('%G-%V-%u') when parsing datetimes using :meth:`to_datetime` (:issue:`16607`)
 - Indexing of ``DataFrame`` and ``Series`` now accepts zerodim ``np.ndarray`` (:issue:`24919`)
 - :meth:`Timestamp.replace` now supports the ``fold`` argument to disambiguate DST transition times (:issue:`25017`)
 - :meth:`DataFrame.at_time` and :meth:`Series.at_time` now support :meth:`datetime.time` objects with timezones (:issue:`24043`)
@@ -53,16 +53,14 @@ Indexing a :class:`DataFrame` or :class:`Series` with a :class:`DatetimeIndex` w
 date string with a UTC offset would previously ignore the UTC offset. Now, the UTC offset
 is respected in indexing. (:issue:`24076`, :issue:`16785`)
 
-*Previous Behavior*:
+.. ipython:: python
 
-.. code-block:: ipython
+    df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))
+    df
 
-    In [1]: df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))
+*Previous Behavior*:
 
-    In [2]: df
-    Out[2]:
-                               0
-    2019-01-01 00:00:00-08:00  0
+.. code-block:: ipython
 
     In [3]: df['2019-01-01 00:00:00+04:00':'2019-01-01 01:00:00+04:00']
     Out[3]:
@@ -71,9 +69,8 @@ is respected in indexing. (:issue:`24076`, :issue:`16785`)
 
 *New Behavior*:
 
-.. ipython:: ipython
+.. ipython:: python
 
-    df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))
     df['2019-01-01 12:00:00+04:00':'2019-01-01 13:00:00+04:00']
 
 .. _whatsnew_0250.api_breaking.groupby_apply_first_group_once:
@@ -84,10 +81,7 @@ GroupBy.apply on ``DataFrame`` evaluates first group only once
 The implementation of :meth:`DataFrameGroupBy.apply() <pandas.core.groupby.DataFrameGroupBy.apply>`
 previously evaluated the supplied function consistently twice on the first group
 to infer if it is safe to use a fast code path. Particularly for functions with
-side effects, this was an undesired behavior and may have led to surprises.
-
-(:issue:`2936`, :issue:`2656`, :issue:`7739`, :issue:`10519`, :issue:`12155`,
-:issue:`20084`, :issue:`21417`)
+side effects, this was an undesired behavior and may have led to surprises. (:issue:`2936`, :issue:`2656`, :issue:`7739`, :issue:`10519`, :issue:`12155`, :issue:`20084`, :issue:`21417`)
 
 Now every group is evaluated only a single time.
 
@@ -124,7 +118,7 @@ Concatenating Sparse Values
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 When passed DataFrames whose values are sparse, :func:`concat` will now return a
-Series or DataFrame with sparse values, rather than a ``SparseDataFrame`` (:issue:`25702`).
+:class:`Series` or :class:`DataFrame` with sparse values, rather than a :class:`SparseDataFrame` (:issue:`25702`).
 
 .. ipython:: python
 
@@ -161,7 +155,7 @@ cause a ``SparseSeries`` or ``SparseDataFrame`` to be returned, as before.
 Increased minimum versions for dependencies
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Due to dropping support for Python 2.7, a number of optional dependencies have updated minimum versions (issue:`25725`, :issue:`24942`, :issue:`25752`).
+Due to dropping support for Python 2.7, a number of optional dependencies have updated minimum versions (:issue:`25725`, :issue:`24942`, :issue:`25752`).
 Independently, some minimum supported versions of dependencies were updated (:issue:`23519`, :issue:`25554`).
 If installed, we now require:
 
@@ -215,7 +209,7 @@ Other API Changes
 ^^^^^^^^^^^^^^^^^
 
 - :class:`DatetimeTZDtype` will now standardize pytz timezones to a common timezone instance (:issue:`24713`)
-- ``Timestamp`` and ``Timedelta`` scalars now implement the :meth:`to_numpy` method as aliases to :meth:`Timestamp.to_datetime64` and :meth:`Timedelta.to_timedelta64`, respectively. (:issue:`24653`)
+- :class:`Timestamp` and :class:`Timedelta` scalars now implement the :meth:`to_numpy` method as aliases to :meth:`Timestamp.to_datetime64` and :meth:`Timedelta.to_timedelta64`, respectively. (:issue:`24653`)
 - :meth:`Timestamp.strptime` will now rise a ``NotImplementedError`` (:issue:`25016`)
 - Comparing :class:`Timestamp` with unsupported objects now returns :py:obj:`NotImplemented` instead of raising ``TypeError``. This implies that unsupported rich comparisons are delegated to the other object, and are now consistent with Python 3 behavior for ``datetime`` objects (:issue:`24011`)
 - Bug in :meth:`DatetimeIndex.snap` which didn't preserving the ``name`` of the input :class:`Index` (:issue:`25575`)
@@ -226,14 +220,14 @@ Other API Changes
 Deprecations
 ~~~~~~~~~~~~
 
-- Deprecated the `M (months)` and `Y (year)` `units` parameter of :func: `pandas.to_timedelta`, :func: `pandas.Timedelta` and :func: `pandas.TimedeltaIndex` (:issue:`16344`)
-- The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64`/:meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
+- Deprecated the ``units=M`` (months) and ``units=Y`` (year) parameters for ``units`` of :func:`pandas.to_timedelta`, :func:`pandas.Timedelta` and :func:`pandas.TimedeltaIndex` (:issue:`16344`)
+- The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64` or :meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
 
 .. _whatsnew_0250.prior_deprecations:
 
 Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Removed (parts of) :class:`Panel` (:issue:`25047`,:issue:`25191`,:issue:`25231`)
+- Removed ``Panel`` (:issue:`25047`, :issue:`25191`, :issue:`25231`)
 -
 -
 -
@@ -243,15 +237,16 @@ Removal of prior version deprecations/changes
 Performance Improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
-- Significant speedup in `SparseArray` initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
-- `DataFrame.to_stata()` is now faster when outputting data with any string or non-native endian columns (:issue:`25045`)
+- Significant speedup in :class:`SparseArray` initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
+- :meth:`DataFrame.to_stata()` is now faster when outputting data with any string or non-native endian columns (:issue:`25045`)
 - Improved performance of :meth:`Series.searchsorted`. The speedup is especially large when the dtype is
   int8/int16/int32 and the searched key is within the integer bounds for the dtype (:issue:`22034`)
 - Improved performance of :meth:`pandas.core.groupby.GroupBy.quantile` (:issue:`20405`)
 - Improved performance of :meth:`read_csv` by faster tokenizing and faster parsing of small float numbers (:issue:`25784`)
 - Improved performance of :meth:`read_csv` by faster parsing of N/A and boolean values (:issue:`25804`)
-- Improved performance of :meth:DataFrame.`to_csv` when write datetime dtype data (:issue:`25708`)
-- Improved performance of :meth:`read_csv` by much faster parsing of MM/YYYY and DD/MM/YYYY datetime formats (:issue:`25922`)
+- Imporved performance of :meth:`IntervalIndex.is_monotonic`, :meth:`IntervalIndex.is_monotonic_increasing` and :meth:`IntervalIndex.is_monotonic_decreasing` by removing conversion to :class:`MultiIndex` (:issue:`24813`)
+- Improved performance of :meth:`DataFrame.to_csv` when writing datetime dtypes (:issue:`25708`)
+- Improved performance of :meth:`read_csv` by much faster parsing of ``MM/YYYY`` and ``DD/MM/YYYY`` datetime formats (:issue:`25922`)
 
 .. _whatsnew_0250.bug_fixes:
 
@@ -271,8 +266,10 @@ Datetimelike
 ^^^^^^^^^^^^
 
 - Bug in :func:`to_datetime` which would raise an (incorrect) ``ValueError`` when called with a date far into the future and the ``format`` argument specified instead of raising ``OutOfBoundsDatetime`` (:issue:`23830`)
-- Bug in :func:`to_datetime` which would raise ``InvalidIndexError: Reindexing only valid with uniquely valued Index objects`` when called with ``cache=True``, with ``arg`` including at least two different elements from the set {None, numpy.nan, pandas.NaT} (:issue:`22305`)
--
+- Bug in :func:`to_datetime` which would raise ``InvalidIndexError: Reindexing only valid with uniquely valued Index objects`` when called with ``cache=True``, with ``arg`` including at least two different elements from the set ``{None, numpy.nan, pandas.NaT}`` (:issue:`22305`)
+- Bug in :class:`DataFrame` and :class:`Series` where timezone aware data with ``dtype='datetime64[ns]`` was not cast to naive (:issue:`25843`)
+- Improved :class:`Timestamp` type checking in various datetime functions to prevent exceptions when using a subclassed ``datetime`` (:issue:`25851`)
+- Bug in :class:`Series` and :class:`DataFrame` repr where ``np.datetime64('NaT')`` and ``np.timedelta64('NaT')`` with ``dtype=object`` would be represented as ``NaN`` (:issue:`25445`)
 -
 
 Timedelta
@@ -331,21 +328,21 @@ Indexing
 ^^^^^^^^
 
 - Improved exception message when calling :meth:`DataFrame.iloc` with a list of non-numeric objects (:issue:`25753`).
--
+- Bug in which :meth:`DataFrame.append` produced an erroneous warning indicating that a ``KeyError`` will be thrown in the future when the data to be appended contains new columns (:issue:`22252`).
 -
 
 
 Missing
 ^^^^^^^
 
-- Fixed misleading exception message in :meth:`Series.missing` if argument ``order`` is required, but omitted (:issue:`10633`, :issue:`24014`).
+- Fixed misleading exception message in :meth:`Series.interpolate` if argument ``order`` is required, but omitted (:issue:`10633`, :issue:`24014`).
 - Fixed class type displayed in exception message in :meth:`DataFrame.dropna` if invalid ``axis`` parameter passed (:issue:`25555`)
 -
 
 MultiIndex
 ^^^^^^^^^^
 
-- Bug in which incorrect exception raised by :meth:`pd.Timedelta` when testing the membership of :class:`MultiIndex` (:issue:`24570`)
+- Bug in which incorrect exception raised by :class:`Timedelta` when testing the membership of :class:`MultiIndex` (:issue:`24570`)
 -
 -
 
@@ -397,9 +394,9 @@ Groupby/Resample/Rolling
 Reshaping
 ^^^^^^^^^
 
-- Bug in :func:`pandas.merge` adds a string of ``None`` if ``None`` is assigned in suffixes instead of remain the column name as-is (:issue:`24782`).
+- Bug in :func:`pandas.merge` adds a string of ``None``, if ``None`` is assigned in suffixes instead of remain the column name as-is (:issue:`24782`).
 - Bug in :func:`merge` when merging by index name would sometimes result in an incorrectly numbered index (:issue:`24212`)
-- :func:`to_records` now accepts dtypes to its `column_dtypes` parameter (:issue:`24895`)
+- :func:`to_records` now accepts dtypes to its ``column_dtypes`` parameter (:issue:`24895`)
 - Bug in :func:`concat` where order of ``OrderedDict`` (and ``dict`` in Python 3.6+) is not respected, when passed in as ``objs`` argument (:issue:`21510`)
 - Bug in :func:`pivot_table` where columns with ``NaN`` values are dropped even if ``dropna`` argument is ``False``, when the ``aggfunc`` argument contains a ``list`` (:issue:`22159`)
 - Bug in :func:`concat` where the resulting ``freq`` of two :class:`DatetimeIndex` with the same ``freq`` would be dropped (:issue:`3232`).
@@ -410,16 +407,14 @@ Reshaping
 Sparse
 ^^^^^^
 
-- Significant speedup in `SparseArray` initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
+- Significant speedup in :class:`SparseArray` initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
 - Bug in :class:`SparseFrame` constructor where passing ``None`` as the data would cause ``default_fill_value`` to be ignored (:issue:`16807`)
-- Bug in `SparseDataFrame` when adding a column in which the length of values does not match length of index, ``AssertionError`` is raised instead of raising ``ValueError`` (:issue:`25484`)
+- Bug in :class:`SparseDataFrame` when adding a column in which the length of values does not match length of index, ``AssertionError`` is raised instead of raising ``ValueError`` (:issue:`25484`)
 
 
 Other
 ^^^^^
 
-- Improved :class:`Timestamp` type checking in various datetime functions to prevent exceptions when using a subclassed `datetime` (:issue:`25851`)
-- Bug in :class:`Series` and :class:`DataFrame` repr where ``np.datetime64('NaT')`` and ``np.timedelta64('NaT')`` with ``dtype=object`` would be represented as ``NaN`` (:issue:`25445`)
 -
 -
 
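
Reviewer note: the ``'%G-%V-%u'`` format mentioned under Other Enhancements combines the ISO 8601 year, ISO week number, and ISO weekday. The sketch below only illustrates what those directives mean, using the standard library's `datetime.strptime` (which accepts them on Python 3.6+) rather than pandas. Whether `to_datetime` resolves every edge case the same way is an assumption, not something this diff shows.

```python
from datetime import datetime

# ISO year 2019, ISO week 01, ISO weekday 1 (Monday).
# ISO week 1 of 2019 starts on Monday 2018-12-31, so the calendar
# year of the parsed date differs from the ISO year in the string.
parsed = datetime.strptime("2019-01-1", "%G-%V-%u")

print(parsed.date())                 # 2018-12-31
print(tuple(parsed.isocalendar()))   # (2019, 1, 1)
```

The point of the example is that ``%G`` is not interchangeable with ``%Y``: near year boundaries the ISO year and the calendar year can disagree.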

mypy.ini

Lines changed: 0 additions & 15 deletions
@@ -11,15 +11,9 @@ ignore_errors=True
 [mypy-pandas.compat.numpy.function]
 ignore_errors=True
 
-[mypy-pandas.core.accessor]
-ignore_errors=True
-
 [mypy-pandas.core.api]
 ignore_errors=True
 
-[mypy-pandas.core.apply]
-ignore_errors=True
-
 [mypy-pandas.core.arrays.array_]
 ignore_errors=True
 
@@ -32,15 +26,9 @@ ignore_errors=True
 [mypy-pandas.core.arrays.interval]
 ignore_errors=True
 
-[mypy-pandas.core.arrays.numpy_]
-ignore_errors=True
-
 [mypy-pandas.core.arrays.period]
 ignore_errors=True
 
-[mypy-pandas.core.arrays.sparse]
-ignore_errors=True
-
 [mypy-pandas.core.arrays.timedeltas]
 ignore_errors=True
 
@@ -98,9 +86,6 @@ ignore_errors=True
 [mypy-pandas.core.series]
 ignore_errors=True
 
-[mypy-pandas.core.sparse.frame]
-ignore_errors=True
-
 [mypy-pandas.core.util.hashing]
 ignore_errors=True
 

pandas/_libs/intervaltree.pxi.in

Lines changed: 13 additions & 0 deletions
@@ -4,6 +4,8 @@ Template for intervaltree
 WARNING: DO NOT edit .pxi FILE directly, .pxi is generated from .pxi.in
 """
 
+from pandas._libs.algos import is_monotonic
+
 ctypedef fused scalar_t:
     float64_t
     float32_t
@@ -101,6 +103,17 @@ cdef class IntervalTree(IntervalMixin):
 
         return self._is_overlapping
 
+    @property
+    def is_monotonic_increasing(self):
+        """
+        Return True if the IntervalTree is monotonic increasing (only equal or
+        increasing values), else False
+        """
+        values = [self.right, self.left]
+
+        sort_order = np.lexsort(values)
+        return is_monotonic(sort_order, False)[0]
+
     def get_loc(self, scalar_t key):
         """Return all positions corresponding to intervals that overlap with
         the given scalar key
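
Reviewer note: the new `is_monotonic_increasing` property sorts the intervals with `np.lexsort([right, left])` (last key is primary, so `left` first, then `right`) and then checks whether the resulting index order is itself non-decreasing. A minimal pure-Python sketch of the same idea follows; the function name and list-based inputs are hypothetical stand-ins, and a stable `sorted` is used in place of `np.lexsort`.

```python
def is_monotonic_increasing(left, right):
    """Hypothetical pure-Python analogue of the property added above."""
    # Stable argsort by (left, right), analogous to np.lexsort([right, left]),
    # where the last key (left) is the primary sort key.
    order = sorted(range(len(left)), key=lambda i: (left[i], right[i]))
    # If the stable sort left every position in place, the intervals
    # were already in non-decreasing order.
    return all(a <= b for a, b in zip(order, order[1:]))


# Already sorted intervals: (0, 1], (1, 2], (2, 3]
print(is_monotonic_increasing([0, 1, 2], [1, 2, 3]))        # True
# Same shape as the benchmark data: a trailing (0, 1] breaks the order.
print(is_monotonic_increasing([0, 1, 2, 0], [1, 2, 3, 1]))  # False
```

Because the sort is stable, duplicate intervals keep their original relative positions, so repeated values still count as monotonic increasing, matching the "only equal or increasing values" wording of the docstring.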

pandas/_libs/tslibs/c_timestamp.pyx

Lines changed: 4 additions & 5 deletions
@@ -23,8 +23,6 @@ cimport numpy as cnp
 from numpy cimport int64_t, int8_t
 cnp.import_array()
 
-from dateutil.tz import tzutc
-
 from cpython.datetime cimport (datetime,
                                PyDateTime_Check, PyDelta_Check,
                                PyDateTime_IMPORT)
@@ -38,9 +36,9 @@ from pandas._libs.tslibs.fields import get_start_end_field, get_date_name_field
 from pandas._libs.tslibs.nattype cimport c_NaT as NaT
 from pandas._libs.tslibs.np_datetime import OutOfBoundsDatetime
 from pandas._libs.tslibs.np_datetime cimport (
-    reverse_ops, cmp_scalar, npy_datetimestruct, dt64_to_dtstruct)
+    reverse_ops, cmp_scalar)
 from pandas._libs.tslibs.timezones cimport (
-    get_timezone, get_utcoffset, is_utc, tz_compare)
+    get_timezone, is_utc, tz_compare)
 from pandas._libs.tslibs.timezones import UTC
 from pandas._libs.tslibs.tzconversion cimport tz_convert_single
 
@@ -381,5 +379,6 @@ cdef class _Timestamp(datetime):
 
     def timestamp(self):
         """Return POSIX timestamp as float."""
-        # py27 compat, see GH#17329
+        # GH 17329
+        # Note: Naive timestamps will not match datetime.stdlib
         return round(self.value / 1e9, 6)
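
Reviewer note: `timestamp()` converts the stored nanosecond count to float seconds and rounds to microsecond precision. The sketch below reproduces only that arithmetic in plain Python; `ns_to_posix` is a hypothetical helper, not part of pandas. It also hints at why the new comment warns about naive values: the standard library interprets a naive `datetime` in the machine's local timezone, so the comparison here uses an explicitly UTC-aware one.

```python
from datetime import datetime, timezone


def ns_to_posix(value_ns):
    """Hypothetical helper mirroring `round(self.value / 1e9, 6)`."""
    return round(value_ns / 1e9, 6)


# 2019-01-01 00:00:00 expressed as nanoseconds since the epoch.
value_ns = 1_546_300_800 * 10**9
posix = ns_to_posix(value_ns)

# The stdlib agrees when the datetime is explicitly UTC; a naive stdlib
# datetime would instead be interpreted in the local timezone.
aware = datetime(2019, 1, 1, tzinfo=timezone.utc)
print(posix == aware.timestamp())  # True
```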

pandas/_libs/tslibs/conversion.pyx

Lines changed: 3 additions & 6 deletions
@@ -3,23 +3,20 @@ import cython
 
 import numpy as np
 cimport numpy as cnp
-from numpy cimport uint8_t, int64_t, int32_t, intp_t, ndarray
+from numpy cimport int64_t, int32_t, intp_t, ndarray
 cnp.import_array()
 
 import pytz
-from dateutil.tz import tzutc
 
 # stdlib datetime imports
 from datetime import time as datetime_time
 from cpython.datetime cimport (datetime, tzinfo,
                                PyDateTime_Check, PyDate_Check,
-                               PyDateTime_IMPORT, PyDelta_Check)
+                               PyDateTime_IMPORT)
 PyDateTime_IMPORT
 
 from pandas._libs.tslibs.c_timestamp cimport _Timestamp
 
-from pandas._libs.tslibs.ccalendar import DAY_SECONDS, HOUR_SECONDS
-
 from pandas._libs.tslibs.np_datetime cimport (
     check_dts_bounds, npy_datetimestruct, pandas_datetime_to_datetimestruct,
     _string_to_dts, npy_datetime, dt64_to_dtstruct, dtstruct_to_dt64,
@@ -42,7 +39,7 @@ from pandas._libs.tslibs.nattype cimport (
     NPY_NAT, checknull_with_nat, c_NaT as NaT)
 
 from pandas._libs.tslibs.tzconversion import (
-    tz_localize_to_utc, tz_convert, tz_convert_single)
+    tz_localize_to_utc, tz_convert_single)
 from pandas._libs.tslibs.tzconversion cimport _tz_convert_tzlocal_utc
 
 # ----------------------------------------------------------------------

pandas/_libs/tslibs/timedeltas.pyx

Lines changed: 1 addition & 3 deletions
@@ -3,8 +3,6 @@ import collections
 import textwrap
 import warnings
 
-import sys
-
 import cython
 
 from cpython cimport Py_NE, Py_EQ, PyObject_RichCompare
@@ -14,7 +12,7 @@ cimport numpy as cnp
 from numpy cimport int64_t
 cnp.import_array()
 
-from cpython.datetime cimport (datetime, timedelta,
+from cpython.datetime cimport (timedelta,
                                PyDateTime_Check, PyDelta_Check,
                                PyDateTime_IMPORT)
 PyDateTime_IMPORT

pandas/_libs/tslibs/timestamps.pyx

Lines changed: 0 additions & 1 deletion
@@ -22,7 +22,6 @@ from pandas._libs.tslibs.conversion import normalize_i8_timestamps
 from pandas._libs.tslibs.conversion cimport (
     _TSObject, convert_to_tsobject,
     convert_datetime_to_tsobject)
-from pandas._libs.tslibs.fields import get_start_end_field, get_date_name_field
 from pandas._libs.tslibs.nattype cimport NPY_NAT, c_NaT as NaT
 from pandas._libs.tslibs.np_datetime cimport (
     check_dts_bounds, npy_datetimestruct, dt64_to_dtstruct)
