Commit d14ff45

Merge branch 'master' of https://github.com/pandas-dev/pandas into add-nrows-to-read-json
"solve merge conflicts while merging to master"
2 parents cb3de4d + ea402a3 commit d14ff45

42 files changed, +283 -156 lines changed

doc/source/reference/frame.rst

+9 -3
@@ -47,8 +47,6 @@ Conversion
    DataFrame.convert_dtypes
    DataFrame.infer_objects
    DataFrame.copy
-   DataFrame.isna
-   DataFrame.notna
    DataFrame.bool

 Indexing, iteration
@@ -211,10 +209,18 @@ Missing data handling
 .. autosummary::
    :toctree: api/

+   DataFrame.backfill
+   DataFrame.bfill
    DataFrame.dropna
+   DataFrame.ffill
    DataFrame.fillna
-   DataFrame.replace
    DataFrame.interpolate
+   DataFrame.isna
+   DataFrame.isnull
+   DataFrame.notna
+   DataFrame.notnull
+   DataFrame.pad
+   DataFrame.replace

 Reshaping, sorting, transposing
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

doc/source/reference/groupby.rst

+5
@@ -50,6 +50,7 @@ Computations / descriptive stats
    GroupBy.all
    GroupBy.any
    GroupBy.bfill
+   GroupBy.backfill
    GroupBy.count
    GroupBy.cumcount
    GroupBy.cummax
@@ -67,6 +68,7 @@ Computations / descriptive stats
    GroupBy.ngroup
    GroupBy.nth
    GroupBy.ohlc
+   GroupBy.pad
    GroupBy.prod
    GroupBy.rank
    GroupBy.pct_change
@@ -88,10 +90,12 @@ application to columns of a specific data type.

    DataFrameGroupBy.all
    DataFrameGroupBy.any
+   DataFrameGroupBy.backfill
    DataFrameGroupBy.bfill
    DataFrameGroupBy.corr
    DataFrameGroupBy.count
    DataFrameGroupBy.cov
+   DataFrameGroupBy.cumcount
    DataFrameGroupBy.cummax
    DataFrameGroupBy.cummin
    DataFrameGroupBy.cumprod
@@ -106,6 +110,7 @@ application to columns of a specific data type.
    DataFrameGroupBy.idxmin
    DataFrameGroupBy.mad
    DataFrameGroupBy.nunique
+   DataFrameGroupBy.pad
    DataFrameGroupBy.pct_change
    DataFrameGroupBy.plot
    DataFrameGroupBy.quantile

doc/source/reference/series.rst

+9 -2
@@ -214,11 +214,18 @@ Missing data handling
 .. autosummary::
    :toctree: api/

-   Series.isna
-   Series.notna
+   Series.backfill
+   Series.bfill
    Series.dropna
+   Series.ffill
    Series.fillna
    Series.interpolate
+   Series.isna
+   Series.isnull
+   Series.notna
+   Series.notnull
+   Series.pad
+   Series.replace

 Reshaping, sorting
 ------------------

doc/source/whatsnew/v1.1.0.rst

+4
@@ -288,6 +288,7 @@ Other enhancements
 - :meth:`HDFStore.put` now accepts `track_times` parameter. Parameter is passed to ``create_table`` method of ``PyTables`` (:issue:`32682`).
 - Make :class:`pandas.core.window.Rolling` and :class:`pandas.core.window.Expanding` iterable(:issue:`11704`)
 - Make ``option_context`` a :class:`contextlib.ContextDecorator`, which allows it to be used as a decorator over an entire function (:issue:`34253`).
+- :meth:`groupby.transform` now allows ``func`` to be ``pad``, ``backfill`` and ``cumcount`` (:issue:`31269`).

 .. ---------------------------------------------------------------------------

@@ -887,6 +888,7 @@ Indexing
 - Bug in :meth:`DataFrame.truncate` and :meth:`Series.truncate` where index was assumed to be monotone increasing (:issue:`33756`)
 - Indexing with a list of strings representing datetimes failed on :class:`DatetimeIndex` or :class:`PeriodIndex`(:issue:`11278`)
 - Bug in :meth:`Series.at` when used with a :class:`MultiIndex` would raise an exception on valid inputs (:issue:`26989`)
+- Bug in :meth:`Series.loc` when used with a :class:`MultiIndex` would raise an IndexingError when accessing a None value (:issue:`34318`)

 Missing
 ^^^^^^^
@@ -974,6 +976,7 @@ Groupby/resample/rolling
   to the input DataFrame is inconsistent. An internal heuristic to detect index mutation would behave differently for equal but not identical
   indices. In particular, the result index shape might change if a copy of the input would be returned.
   The behaviour now is consistent, independent of internal heuristics. (:issue:`31612`, :issue:`14927`, :issue:`13056`)
+- Bug in :meth:`SeriesGroupBy.agg` where any column name was accepted in the named aggregation of ``SeriesGroupBy`` previously. The behaviour now allows only ``str`` and callables else would raise ``TypeError``. (:issue:`34422`)

 Reshaping
 ^^^^^^^^^
@@ -1002,6 +1005,7 @@ Reshaping
 - Bug in :func:`concat` was not allowing for concatenation of ``DataFrame`` and ``Series`` with duplicate keys (:issue:`33654`)
 - Bug in :func:`cut` raised an error when non-unique labels (:issue:`33141`)
 - Ensure only named functions can be used in :func:`eval()` (:issue:`32460`)
+- Bug in :func:`Dataframe.aggregate` and :func:`Series.aggregate` was causing recursive loop in some cases (:issue:`34224`)
 - Fixed bug in :func:`melt` where melting MultiIndex columns with ``col_level`` > 0 would raise a ``KeyError`` on ``id_vars`` (:issue:`34129`)

 Sparse
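
The "Other enhancements" entry above says that ``groupby.transform`` now accepts ``pad``, ``backfill`` and ``cumcount`` as ``func``. As a rough illustration of what a transform with those names is expected to return (one result per input row, computed within each group), here is a minimal pure-Python sketch. It is not the pandas implementation; the function names `transform_pad` and `transform_cumcount` are hypothetical and exist only for this example.

```python
from collections import defaultdict


def transform_pad(keys, values):
    """Forward-fill None within each group, keeping input length and order."""
    last_seen = {}  # most recent non-missing value per group key
    out = []
    for key, value in zip(keys, values):
        if value is not None:
            last_seen[key] = value
        # Rows before the first non-missing value of a group stay None.
        out.append(last_seen.get(key))
    return out


def transform_cumcount(keys):
    """Number each row within its group, starting at 0."""
    counts = defaultdict(int)
    out = []
    for key in keys:
        out.append(counts[key])
        counts[key] += 1
    return out


keys = ["a", "b", "a", "b", "a"]
values = [1, None, None, 5, None]
padded = transform_pad(keys, values)
numbered = transform_cumcount(keys)
```

Both helpers return a list with the same length as the input, which is the property that distinguishes a transform from an aggregation.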

pandas/_libs/index.pyx

+2 -2
@@ -22,9 +22,9 @@ cnp.import_array()
 from pandas._libs cimport util

 from pandas._libs.tslibs.nattype cimport c_NaT as NaT
-from pandas._libs.tslibs.base cimport ABCTimedelta
 from pandas._libs.tslibs.period cimport is_period_object
 from pandas._libs.tslibs.timestamps cimport _Timestamp
+from pandas._libs.tslibs.timedeltas cimport _Timedelta

 from pandas._libs.hashtable cimport HashTable

@@ -471,7 +471,7 @@ cdef class TimedeltaEngine(DatetimeEngine):
         return 'm8[ns]'

     cdef int64_t _unbox_scalar(self, scalar) except? -1:
-        if not (isinstance(scalar, ABCTimedelta) or scalar is NaT):
+        if not (isinstance(scalar, _Timedelta) or scalar is NaT):
             raise TypeError(scalar)
         return scalar.value
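
The `_unbox_scalar` change above swaps the placeholder `ABCTimedelta` for the concrete `_Timedelta` base class in the `isinstance` check. The guard itself can be sketched in plain Python as follows. This is only an analogy under stated assumptions: `TimedeltaLike`, `NAT`, and `unbox_scalar` are hypothetical stand-ins, and the sentinel value is assumed to mirror the int64 minimum that pandas uses for `NaT`.

```python
from datetime import timedelta

# Assumption: mirrors the int64 sentinel pandas uses for NaT.
I_NAT = -(2 ** 63)


class TimedeltaLike(timedelta):
    """Hypothetical stand-in for the Cython `_Timedelta` base class."""

    @property
    def value(self):
        # Total duration expressed in nanoseconds.
        whole_seconds = self.days * 86_400 + self.seconds
        return whole_seconds * 1_000_000_000 + self.microseconds * 1_000


class _NaTType:
    """Hypothetical missing-value singleton exposing the same attribute."""

    value = I_NAT


NAT = _NaTType()


def unbox_scalar(scalar):
    # Accept only the concrete subclass or the missing-value singleton;
    # a plain datetime.timedelta is rejected, as in the diff above.
    if not (isinstance(scalar, TimedeltaLike) or scalar is NAT):
        raise TypeError(scalar)
    return scalar.value
```

Checking against the concrete class means the check and the `.value` attribute it relies on live in the same type, which is presumably why the separate `ABCTimedelta` placeholder could be dropped.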

pandas/_libs/interval.pyx

+2 -2
@@ -42,9 +42,9 @@ from pandas._libs.tslibs.util cimport (
     is_timedelta64_object,
 )

-from pandas._libs.tslibs.base cimport ABCTimedelta
 from pandas._libs.tslibs.timezones cimport tz_compare
 from pandas._libs.tslibs.timestamps cimport _Timestamp
+from pandas._libs.tslibs.timedeltas cimport _Timedelta

 _VALID_CLOSED = frozenset(['left', 'right', 'both', 'neither'])

@@ -340,7 +340,7 @@ cdef class Interval(IntervalMixin):
     def _validate_endpoint(self, endpoint):
         # GH 23013
         if not (is_integer_object(endpoint) or is_float_object(endpoint) or
-                isinstance(endpoint, (_Timestamp, ABCTimedelta))):
+                isinstance(endpoint, (_Timestamp, _Timedelta))):
             raise ValueError("Only numeric, Timestamp and Timedelta endpoints "
                              "are allowed when constructing an Interval.")

pandas/_libs/tslibs/__init__.py

+2
@@ -14,12 +14,14 @@
     "ints_to_pytimedelta",
     "Timestamp",
     "tz_convert_single",
+    "to_offset",
 ]


 from .conversion import localize_pydatetime
 from .nattype import NaT, NaTType, iNaT, is_null_datetimelike, nat_strings
 from .np_datetime import OutOfBoundsDatetime
+from .offsets import to_offset
 from .period import IncompatibleFrequency, Period
 from .resolution import Resolution
 from .timedeltas import Timedelta, delta_to_nanoseconds, ints_to_pytimedelta

pandas/_libs/tslibs/base.pxd

+1 -4
@@ -1,7 +1,4 @@
-from cpython.datetime cimport datetime, timedelta
-
-cdef class ABCTimedelta(timedelta):
-    pass
+from cpython.datetime cimport datetime


 cdef class ABCTimestamp(datetime):

pandas/_libs/tslibs/base.pyx

+1 -5
@@ -5,11 +5,7 @@ in order to allow for fast isinstance checks without circular dependency issues.
 This is analogous to core.dtypes.generic.
 """

-from cpython.datetime cimport datetime, timedelta
-
-
-cdef class ABCTimedelta(timedelta):
-    pass
+from cpython.datetime cimport datetime


 cdef class ABCTimestamp(datetime):

pandas/_libs/tslibs/offsets.pyx

+21 -30
@@ -36,10 +36,11 @@ from pandas._libs.tslibs.base cimport ABCTimestamp
 from pandas._libs.tslibs.ccalendar import (
     MONTH_ALIASES, MONTH_TO_CAL_NUM, weekday_to_int, int_to_weekday,
 )
-from pandas._libs.tslibs.ccalendar cimport get_days_in_month, dayofweek
+from pandas._libs.tslibs.ccalendar cimport DAY_NANOS, get_days_in_month, dayofweek
 from pandas._libs.tslibs.conversion cimport (
     convert_datetime_to_tsobject,
     localize_pydatetime,
+    normalize_i8_timestamps,
 )
 from pandas._libs.tslibs.nattype cimport NPY_NAT, c_NaT as NaT
 from pandas._libs.tslibs.np_datetime cimport (
@@ -79,21 +80,14 @@ cdef bint _is_normalized(datetime dt):
 def apply_index_wraps(func):
     # Note: normally we would use `@functools.wraps(func)`, but this does
     # not play nicely with cython class methods
-    def wrapper(self, other):
-
-        is_index = not util.is_array(other._data)
-
-        # operate on DatetimeArray
-        arr = other._data if is_index else other
+    def wrapper(self, other) -> np.ndarray:
+        # other is a DatetimeArray

-        result = func(self, arr)
-
-        if is_index:
-            # Wrap DatetimeArray result back to DatetimeIndex
-            result = type(other)._simple_new(result, name=other.name)
+        result = func(self, other)
+        result = np.asarray(result)

         if self.normalize:
-            result = result.to_period('D').to_timestamp()
+            result = normalize_i8_timestamps(result.view("i8"), None)
         return result

     # do @functools.wraps(func) manually since it doesn't work on cdef funcs
@@ -1067,11 +1061,7 @@ cdef class RelativeDeltaOffset(BaseOffset):

             weeks = kwds.get("weeks", 0) * self.n
             if weeks:
-                # integer addition on PeriodIndex is deprecated,
-                #  so we directly use _time_shift instead
-                asper = index.to_period("W")
-                shifted = asper._time_shift(weeks)
-                index = shifted.to_timestamp() + index.to_perioddelta("W")
+                index = index + timedelta(days=7 * weeks)

             timedelta_kwds = {
                 k: v
@@ -1383,7 +1373,9 @@ cdef class BusinessDay(BusinessMixin):

     @apply_index_wraps
     def apply_index(self, dtindex):
-        time = dtindex.to_perioddelta("D")
+        i8other = dtindex.asi8
+        time = (i8other % DAY_NANOS).view("timedelta64[ns]")
+
         # to_period rolls forward to next BDay; track and
         # reduce n where it does when rolling forward
         asper = dtindex.to_period("B")
@@ -1891,7 +1883,7 @@ cdef class YearOffset(SingleConstructorOffset):
         shifted = shift_quarters(
             dtindex.asi8, self.n, self.month, self._day_opt, modby=12
         )
-        return type(dtindex)._simple_new(shifted, dtype=dtindex.dtype)
+        return shifted


 cdef class BYearEnd(YearOffset):
@@ -2035,7 +2027,7 @@ cdef class QuarterOffset(SingleConstructorOffset):
         shifted = shift_quarters(
             dtindex.asi8, self.n, self.startingMonth, self._day_opt
         )
-        return type(dtindex)._simple_new(shifted, dtype=dtindex.dtype)
+        return shifted


 cdef class BQuarterEnd(QuarterOffset):
@@ -2141,7 +2133,7 @@ cdef class MonthOffset(SingleConstructorOffset):
     @apply_index_wraps
     def apply_index(self, dtindex):
         shifted = shift_months(dtindex.asi8, self.n, self._day_opt)
-        return type(dtindex)._simple_new(shifted, dtype=dtindex.dtype)
+        return shifted

     cpdef __setstate__(self, state):
         state.pop("_use_relativedelta", False)
@@ -2276,6 +2268,7 @@ cdef class SemiMonthOffset(SingleConstructorOffset):
         from pandas import Timedelta

         dti = dtindex
+        i8other = dtindex.asi8
         days_from_start = dtindex.to_perioddelta("M").asi8
         delta = Timedelta(days=self.day_of_month - 1).value

@@ -2289,7 +2282,7 @@ cdef class SemiMonthOffset(SingleConstructorOffset):
         roll = self._get_roll(dtindex, before_day_of_month, after_day_of_month)

         # isolate the time since it will be striped away one the next line
-        time = dtindex.to_perioddelta("D")
+        time = (i8other % DAY_NANOS).view("timedelta64[ns]")

         # apply the correct number of months

@@ -2504,12 +2497,9 @@ cdef class Week(SingleConstructorOffset):
     @apply_index_wraps
     def apply_index(self, dtindex):
         if self.weekday is None:
-            # integer addition on PeriodIndex is deprecated,
-            # so we use _time_shift directly
-            asper = dtindex.to_period("W")
-
-            shifted = asper._time_shift(self.n)
-            return shifted.to_timestamp() + dtindex.to_perioddelta("W")
+            td = timedelta(days=7 * self.n)
+            td64 = np.timedelta64(td, "ns")
+            return dtindex + td64
         else:
             return self._end_apply_index(dtindex)

@@ -2529,7 +2519,8 @@ cdef class Week(SingleConstructorOffset):
         from pandas import Timedelta
         from .frequencies import get_freq_code  # TODO: avoid circular import

-        off = dtindex.to_perioddelta("D")
+        i8other = dtindex.asi8
+        off = (i8other % DAY_NANOS).view("timedelta64")

         base, mult = get_freq_code(self.freqstr)
         base_period = dtindex.to_period(base)
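
Several hunks in this file replace `dtindex.to_perioddelta("D")` with `i8other % DAY_NANOS`, and the wrapper now normalizes via `normalize_i8_timestamps`. The arithmetic behind both can be sketched on plain integers, assuming timestamps are stored as nanoseconds since the Unix epoch and that `DAY_NANOS` is the number of nanoseconds in a day. The helpers `to_i8`, `time_of_day_nanos`, and `normalize_nanos` are hypothetical names for this illustration and cover only the timezone-naive case; they are not the pandas functions.

```python
from datetime import datetime, timezone

# Assumption: matches the value of ccalendar.DAY_NANOS.
DAY_NANOS = 24 * 3600 * 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_i8(dt):
    """Convert an aware datetime to integer nanoseconds since the epoch."""
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def time_of_day_nanos(i8):
    """Nanoseconds elapsed since the preceding midnight."""
    # Floor-modulo keeps the result non-negative for pre-1970 values too.
    return i8 % DAY_NANOS


def normalize_nanos(i8):
    """Round down to midnight by removing the time-of-day part."""
    return i8 - i8 % DAY_NANOS


stamp = to_i8(datetime(2020, 5, 31, 13, 45, 30, tzinfo=timezone.utc))
time_part = time_of_day_nanos(stamp)
midnight = normalize_nanos(stamp)
```

Working on the integer representation avoids the round trip through `PeriodIndex` that the removed lines performed, at the cost of handling only fixed-length days.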

pandas/_libs/tslibs/timedeltas.pxd

+12
@@ -1,6 +1,18 @@
+from cpython.datetime cimport timedelta
 from numpy cimport int64_t

 # Exposed for tslib, not intended for outside use.
 cpdef int64_t delta_to_nanoseconds(delta) except? -1
 cdef convert_to_timedelta64(object ts, object unit)
 cdef bint is_any_td_scalar(object obj)
+
+
+cdef class _Timedelta(timedelta):
+    cdef readonly:
+        int64_t value  # nanoseconds
+        object freq  # frequency reference
+        bint is_populated  # are my components populated
+        int64_t _d, _h, _m, _s, _ms, _us, _ns
+
+    cpdef timedelta to_pytimedelta(_Timedelta self)
+    cpdef bint _has_ns(self)