Commit e344d44

committed
Merge remote-tracking branch 'upstream/master' into as_index_false
2 parents 220efdc + 83530bd commit e344d44


81 files changed

+3871
-3428
lines changed

doc/source/reference/general_utility_functions.rst

+1
@@ -40,6 +40,7 @@ Exceptions and warnings
4040
errors.EmptyDataError
4141
errors.OutOfBoundsDatetime
4242
errors.MergeError
43+
errors.NullFrequencyError
4344
errors.NumbaUtilError
4445
errors.ParserError
4546
errors.ParserWarning

doc/source/reference/offset_frequency.rst

+8
@@ -1044,6 +1044,7 @@ Properties
10441044
Tick.nanos
10451045
Tick.normalize
10461046
Tick.rule_code
1047+
Tick.n
10471048

10481049
Methods
10491050
~~~~~~~
@@ -1077,6 +1078,7 @@ Properties
10771078
Day.nanos
10781079
Day.normalize
10791080
Day.rule_code
1081+
Day.n
10801082

10811083
Methods
10821084
~~~~~~~
@@ -1110,6 +1112,7 @@ Properties
11101112
Hour.nanos
11111113
Hour.normalize
11121114
Hour.rule_code
1115+
Hour.n
11131116

11141117
Methods
11151118
~~~~~~~
@@ -1143,6 +1146,7 @@ Properties
11431146
Minute.nanos
11441147
Minute.normalize
11451148
Minute.rule_code
1149+
Minute.n
11461150

11471151
Methods
11481152
~~~~~~~
@@ -1176,6 +1180,7 @@ Properties
11761180
Second.nanos
11771181
Second.normalize
11781182
Second.rule_code
1183+
Second.n
11791184

11801185
Methods
11811186
~~~~~~~
@@ -1209,6 +1214,7 @@ Properties
12091214
Milli.nanos
12101215
Milli.normalize
12111216
Milli.rule_code
1217+
Milli.n
12121218

12131219
Methods
12141220
~~~~~~~
@@ -1242,6 +1248,7 @@ Properties
12421248
Micro.nanos
12431249
Micro.normalize
12441250
Micro.rule_code
1251+
Micro.n
12451252

12461253
Methods
12471254
~~~~~~~
@@ -1275,6 +1282,7 @@ Properties
12751282
Nano.nanos
12761283
Nano.normalize
12771284
Nano.rule_code
1285+
Nano.n
12781286

12791287
Methods
12801288
~~~~~~~
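
The ``n`` property documented above for the ``Tick`` subclasses holds the integer multiple the offset was constructed with. A minimal sketch, assuming a pandas version where ``pd.offsets.Hour`` and ``pd.offsets.Minute`` are available (the variable names are illustrative only):

```python
import pandas as pd

# Each Tick-style offset is built from an integer multiple of its base unit.
hour = pd.offsets.Hour(3)
minute = pd.offsets.Minute(90)

# ``n`` returns that multiple; ``nanos`` returns the total span in nanoseconds.
hour_multiple = hour.n
minute_multiple = minute.n
hour_nanos = hour.nanos

print(hour_multiple, minute_multiple, hour_nanos)
```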

doc/source/user_guide/computation.rst

+18
@@ -648,6 +648,24 @@ from present information back to past information. This allows the rolling windo
648648
Currently, this feature is only implemented for time-based windows.
649649
For fixed windows, the closed parameter cannot be set and the rolling window will always have both endpoints closed.
650650

651+
.. _stats.iter_rolling_window:
652+
653+
Iteration over window:
654+
~~~~~~~~~~~~~~~~~~~~~~
655+
656+
.. versionadded:: 1.1.0
657+
658+
``Rolling`` and ``Expanding`` objects now support iteration. Note that ``min_periods`` is ignored during iteration.
659+
660+
.. ipython::
661+
662+
In [1]: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
663+
664+
In [2]: for i in df.rolling(2):
665+
...: print(i)
666+
...:
667+
668+
651669
.. _stats.moments.ts-versus-resampling:
652670

653671
Time-aware rolling vs. resampling
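
The iteration behaviour added to the user guide above can be checked with a short script. This is a sketch assuming pandas 1.1.0 or later. The names ``windows`` and ``lengths`` are illustrative and do not appear in the commit.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Iterating a Rolling object yields one DataFrame per window position.
# Because ``min_periods`` is ignored here, the first window is partial.
windows = list(df.rolling(2))
lengths = [len(window) for window in windows]

# Expanding objects are iterable in the same way; each window grows by one row.
expanding_lengths = [len(window) for window in df.expanding()]

for window in windows:
    print(window)
```

Collecting the windows into a list is convenient for inspection, but for large frames it may be preferable to consume the iterator lazily.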

doc/source/whatsnew/v1.1.0.rst

+7
@@ -234,6 +234,8 @@ Other enhancements
234234
compression library. Compression was also added to the low-level Stata-file writers
235235
:class:`~pandas.io.stata.StataWriter`, :class:`~pandas.io.stata.StataWriter117`,
236236
and :class:`~pandas.io.stata.StataWriterUTF8` (:issue:`26599`).
237+
- :meth:`HDFStore.put` now accepts a ``track_times`` parameter, which is passed to the ``create_table`` method of ``PyTables`` (:issue:`32682`).
238+
- Make :class:`pandas.core.window.Rolling` and :class:`pandas.core.window.Expanding` iterable (:issue:`11704`)
237239

238240
.. ---------------------------------------------------------------------------
239241
@@ -641,6 +643,7 @@ Deprecations
641643

642644
- :func:`pandas.api.types.is_categorical` is deprecated and will be removed in a future version; use :func:`pandas.api.types.is_categorical_dtype` instead (:issue:`33385`)
643645
- :meth:`Index.get_value` is deprecated and will be removed in a future version (:issue:`19728`)
646+
- :meth:`DateOffset.__call__` is deprecated and will be removed in a future version; use ``offset + other`` instead (:issue:`34171`)
644647

645648
.. ---------------------------------------------------------------------------
646649
@@ -661,6 +664,8 @@ Performance improvements
661664
sparse values from ``scipy.sparse`` matrices using the
662665
:meth:`DataFrame.sparse.from_spmatrix` constructor (:issue:`32821`,
663666
:issue:`32825`, :issue:`32826`, :issue:`32856`, :issue:`32858`).
667+
- Performance improvement for groupby methods :meth:`~pandas.core.groupby.groupby.GroupBy.first`
668+
and :meth:`~pandas.core.groupby.groupby.GroupBy.last` (:issue:`34178`)
664669
- Performance improvement in :func:`factorize` for nullable (integer and boolean) dtypes (:issue:`33064`).
665670
- Performance improvement in reductions (sum, prod, min, max) for nullable (integer and boolean) dtypes (:issue:`30982`, :issue:`33261`, :issue:`33442`).
666671

@@ -869,6 +874,8 @@ Groupby/resample/rolling
869874
- Bug in :meth:`GroupBy.first` and :meth:`GroupBy.last` where None is not preserved in object dtype (:issue:`32800`)
870875
- Bug in :meth:`Rolling.min` and :meth:`Rolling.max`: Growing memory usage after multiple calls when using a fixed window (:issue:`30726`)
871876
- Bug in :meth:`GroupBy.agg`, :meth:`GroupBy.transform`, and :meth:`GroupBy.resample` where subclasses are not preserved (:issue:`28330`)
877+
- Bug in :meth:`GroupBy.rolling.apply` where the ``args`` and ``kwargs`` parameters were ignored (:issue:`33433`)
878+
872879

873880
Reshaping
874881
^^^^^^^^^
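
The deprecation entry for ``DateOffset.__call__`` above recommends ``offset + other``. A minimal sketch of that replacement, assuming ``other`` is a ``Timestamp`` (the specific date and offset are illustrative):

```python
import pandas as pd

ts = pd.Timestamp("2020-01-01")
offset = pd.offsets.Day(2)

# Recommended spelling: add the offset to the timestamp
# instead of calling the offset as a function.
shifted = offset + ts

print(shifted)
```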

pandas/_libs/groupby.pyx

+1-3
@@ -9,11 +9,9 @@ cimport numpy as cnp
99
from numpy cimport (ndarray,
1010
int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t,
1111
uint32_t, uint64_t, float32_t, float64_t, complex64_t, complex128_t)
12+
from numpy.math cimport NAN
1213
cnp.import_array()
1314

14-
cdef extern from "numpy/npy_math.h":
15-
float64_t NAN "NPY_NAN"
16-
1715
from pandas._libs.util cimport numeric, get_nat
1816

1917
from pandas._libs.algos cimport (swap, TiebreakEnumType, TIEBREAK_AVERAGE,

pandas/_libs/hashtable.pyx

+2-3
@@ -8,10 +8,9 @@ from libc.stdlib cimport malloc, free
88
import numpy as np
99
cimport numpy as cnp
1010
from numpy cimport ndarray, uint8_t, uint32_t, float64_t
11+
from numpy.math cimport NAN
1112
cnp.import_array()
1213

13-
cdef extern from "numpy/npy_math.h":
14-
float64_t NAN "NPY_NAN"
1514

1615
from pandas._libs.khash cimport (
1716
khiter_t,
@@ -54,7 +53,7 @@ from pandas._libs.khash cimport (
5453
)
5554

5655

57-
cimport pandas._libs.util as util
56+
from pandas._libs cimport util
5857

5958
from pandas._libs.missing cimport checknull
6059

pandas/_libs/index.pyx

+4-5
@@ -19,11 +19,10 @@ from numpy cimport (
1919
cnp.import_array()
2020

2121

22-
cimport pandas._libs.util as util
22+
from pandas._libs cimport util
2323

24-
from pandas._libs.tslibs import Period, Timedelta
2524
from pandas._libs.tslibs.nattype cimport c_NaT as NaT
26-
from pandas._libs.tslibs.base cimport ABCTimestamp
25+
from pandas._libs.tslibs.base cimport ABCTimestamp, ABCTimedelta, ABCPeriod
2726

2827
from pandas._libs.hashtable cimport HashTable
2928

@@ -470,7 +469,7 @@ cdef class TimedeltaEngine(DatetimeEngine):
470469
return 'm8[ns]'
471470

472471
cdef int64_t _unbox_scalar(self, scalar) except? -1:
473-
if not (isinstance(scalar, Timedelta) or scalar is NaT):
472+
if not (isinstance(scalar, ABCTimedelta) or scalar is NaT):
474473
raise TypeError(scalar)
475474
return scalar.value
476475

@@ -480,7 +479,7 @@ cdef class PeriodEngine(Int64Engine):
480479
cdef int64_t _unbox_scalar(self, scalar) except? -1:
481480
if scalar is NaT:
482481
return scalar.value
483-
if isinstance(scalar, Period):
482+
if isinstance(scalar, ABCPeriod):
484483
# NB: we assume that we have the correct freq here.
485484
return scalar.ordinal
486485
raise TypeError(scalar)
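
The change above swaps ``isinstance`` checks against the concrete ``Timedelta`` and ``Period`` classes for checks against lightweight base classes that can be ``cimport``-ed. The general pattern can be sketched in plain Python. Every class and function name below is a hypothetical stand-in, not the pandas implementation.

```python
class ABCPeriodLike:
    """Hypothetical lightweight base class, cheap to import anywhere."""


class PeriodLike(ABCPeriodLike):
    """Hypothetical concrete class that lives in a heavier module."""

    def __init__(self, ordinal):
        self.ordinal = ordinal


def unbox_scalar(scalar):
    # Checking against the base class avoids importing the concrete class,
    # while still accepting every instance of it.
    if isinstance(scalar, ABCPeriodLike):
        return scalar.ordinal
    raise TypeError(scalar)


print(unbox_scalar(PeriodLike(5)))
```

One likely motivation for this design is avoiding Python-level imports between extension modules, but the commit itself does not state the reason.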

pandas/_libs/internals.pyx

+2-1
@@ -1,5 +1,6 @@
1-
import cython
21
from collections import defaultdict
2+
3+
import cython
34
from cython import Py_ssize_t
45

56
from cpython.slice cimport PySlice_GetIndicesEx

pandas/_libs/interval.pyx

+5-6
@@ -33,7 +33,7 @@ from numpy cimport (
3333
cnp.import_array()
3434

3535

36-
cimport pandas._libs.util as util
36+
from pandas._libs cimport util
3737

3838
from pandas._libs.hashtable cimport Int64Vector
3939
from pandas._libs.tslibs.util cimport (
@@ -42,8 +42,7 @@ from pandas._libs.tslibs.util cimport (
4242
is_timedelta64_object,
4343
)
4444

45-
from pandas._libs.tslibs import Timestamp
46-
from pandas._libs.tslibs.timedeltas import Timedelta
45+
from pandas._libs.tslibs.base cimport ABCTimestamp, ABCTimedelta
4746
from pandas._libs.tslibs.timezones cimport tz_compare
4847

4948

@@ -329,7 +328,7 @@ cdef class Interval(IntervalMixin):
329328
raise ValueError(f"invalid option for 'closed': {closed}")
330329
if not left <= right:
331330
raise ValueError("left side of interval must be <= right side")
332-
if (isinstance(left, Timestamp) and
331+
if (isinstance(left, ABCTimestamp) and
333332
not tz_compare(left.tzinfo, right.tzinfo)):
334333
# GH 18538
335334
raise ValueError("left and right must have the same time zone, got "
@@ -341,7 +340,7 @@ cdef class Interval(IntervalMixin):
341340
def _validate_endpoint(self, endpoint):
342341
# GH 23013
343342
if not (is_integer_object(endpoint) or is_float_object(endpoint) or
344-
isinstance(endpoint, (Timestamp, Timedelta))):
343+
isinstance(endpoint, (ABCTimestamp, ABCTimedelta))):
345344
raise ValueError("Only numeric, Timestamp and Timedelta endpoints "
346345
"are allowed when constructing an Interval.")
347346

@@ -371,7 +370,7 @@ cdef class Interval(IntervalMixin):
371370
right = self.right
372371

373372
# TODO: need more general formatting methodology here
374-
if isinstance(left, Timestamp) and isinstance(right, Timestamp):
373+
if isinstance(left, ABCTimestamp) and isinstance(right, ABCTimestamp):
375374
left = left._short_repr
376375
right = right._short_repr
377376

pandas/_libs/lib.pyx

+1-2
@@ -1,6 +1,5 @@
11
from collections import abc
22
from decimal import Decimal
3-
43
import warnings
54

65
import cython
@@ -63,7 +62,7 @@ cdef extern from "numpy/arrayobject.h":
6362
cdef extern from "src/parse_helper.h":
6463
int floatify(object, float64_t *result, int *maybe_int) except -1
6564

66-
cimport pandas._libs.util as util
65+
from pandas._libs cimport util
6766
from pandas._libs.util cimport is_nan, UINT64_MAX, INT64_MAX, INT64_MIN
6867

6968
from pandas._libs.tslib import array_to_datetime

pandas/_libs/missing.pyx

+1-1
@@ -8,7 +8,7 @@ cimport numpy as cnp
88
from numpy cimport ndarray, int64_t, uint8_t, float64_t
99
cnp.import_array()
1010

11-
cimport pandas._libs.util as util
11+
from pandas._libs cimport util
1212

1313

1414
from pandas._libs.tslibs.np_datetime cimport get_datetime64_value, get_timedelta64_value

pandas/_libs/parsers.pyx

+1-1
@@ -34,7 +34,7 @@ cimport numpy as cnp
3434
from numpy cimport ndarray, uint8_t, uint64_t, int64_t, float64_t
3535
cnp.import_array()
3636

37-
cimport pandas._libs.util as util
37+
from pandas._libs cimport util
3838
from pandas._libs.util cimport UINT64_MAX, INT64_MAX, INT64_MIN
3939
import pandas._libs.lib as lib
4040

pandas/_libs/reduction.pyx

+2-2
@@ -14,7 +14,7 @@ from numpy cimport (ndarray,
1414
flatiter)
1515
cnp.import_array()
1616

17-
cimport pandas._libs.util as util
17+
from pandas._libs cimport util
1818
from pandas._libs.lib import maybe_convert_objects, is_scalar
1919

2020

@@ -603,7 +603,7 @@ cdef class BlockSlider:
603603
arr.shape[1] = 0
604604

605605

606-
def compute_reduction(arr: np.ndarray, f, axis: int = 0, dummy=None, labels=None):
606+
def compute_reduction(arr: ndarray, f, axis: int = 0, dummy=None, labels=None):
607607
"""
608608
609609
Parameters

pandas/_libs/reshape.pyx

+3-2
@@ -15,11 +15,12 @@ from numpy cimport (
1515
uint64_t,
1616
)
1717

18-
cimport numpy as cnp
1918
import numpy as np
20-
from pandas._libs.lib cimport c_is_list_like
19+
cimport numpy as cnp
2120
cnp.import_array()
2221

22+
from pandas._libs.lib cimport c_is_list_like
23+
2324
ctypedef fused reshape_t:
2425
uint8_t
2526
uint16_t

pandas/_libs/testing.pyx

+6-2
@@ -1,4 +1,8 @@
11
import numpy as np
2+
from numpy cimport import_array
3+
import_array()
4+
5+
from pandas._libs.util cimport is_array
26

37
from pandas.core.dtypes.missing import isna, array_equivalent
48
from pandas.core.dtypes.common import is_dtype_equal
@@ -116,8 +120,8 @@ cpdef assert_almost_equal(a, b,
116120
assert a == b, f"{a} != {b}"
117121
return True
118122

119-
a_is_ndarray = isinstance(a, np.ndarray)
120-
b_is_ndarray = isinstance(b, np.ndarray)
123+
a_is_ndarray = is_array(a)
124+
b_is_ndarray = is_array(b)
121125

122126
if obj is None:
123127
if a_is_ndarray or b_is_ndarray:

pandas/_libs/tslibs/__init__.py

+1-2
@@ -13,7 +13,6 @@
1313
"ints_to_pytimedelta",
1414
"Timestamp",
1515
"tz_convert_single",
16-
"NullFrequencyError",
1716
]
1817

1918

@@ -22,5 +21,5 @@
2221
from .np_datetime import OutOfBoundsDatetime
2322
from .period import IncompatibleFrequency, Period
2423
from .timedeltas import Timedelta, delta_to_nanoseconds, ints_to_pytimedelta
25-
from .timestamps import NullFrequencyError, Timestamp
24+
from .timestamps import Timestamp
2625
from .tzconversion import tz_convert_single

pandas/_libs/tslibs/frequencies.pxd

-1
@@ -1,7 +1,6 @@
11
cpdef str get_rule_month(object source, str default=*)
22

33
cpdef get_freq_code(freqstr)
4-
cpdef object get_freq(object freq)
54
cpdef str get_base_alias(freqstr)
65
cpdef int get_to_timestamp_base(int base)
76
cpdef str get_freq_str(base, mult=*)

pandas/_libs/tslibs/frequencies.pyx

-19
@@ -306,25 +306,6 @@ cpdef int get_to_timestamp_base(int base):
306306
return base
307307

308308

309-
cpdef object get_freq(object freq):
310-
"""
311-
Return frequency code of given frequency str.
312-
If input is not string, return input as it is.
313-
314-
Examples
315-
--------
316-
>>> get_freq('A')
317-
1000
318-
319-
>>> get_freq('3A')
320-
1000
321-
"""
322-
if isinstance(freq, str):
323-
base, mult = get_freq_code(freq)
324-
freq = base
325-
return freq
326-
327-
328309
# ----------------------------------------------------------------------
329310
# Frequency comparison
330311

pandas/_libs/tslibs/np_datetime.pxd

-2
@@ -53,8 +53,6 @@ cdef extern from "src/datetime/np_datetime.h":
5353
npy_datetimestruct *result) nogil
5454

5555

56-
cdef int reverse_ops[6]
57-
5856
cdef bint cmp_scalar(int64_t lhs, int64_t rhs, int op) except -1
5957

6058
cdef check_dts_bounds(npy_datetimestruct *dts)
