Commit 8491a76

Merge branch 'master' into single_level
2 parents 23d5aa6 + 5920ee6 commit 8491a76

63 files changed, +2853 -2416 lines changed

asv_bench/benchmarks/arithmetic.py

+25
@@ -469,4 +469,29 @@ def time_apply_index(self, offset):
         offset.apply_index(self.rng)


+class BinaryOpsMultiIndex:
+    params = ["sub", "add", "mul", "div"]
+    param_names = ["func"]
+
+    def setup(self, func):
+        date_range = pd.date_range("20200101 00:00", "20200102 0:00", freq="S")
+        level_0_names = [str(i) for i in range(30)]
+
+        index = pd.MultiIndex.from_product([level_0_names, date_range])
+        column_names = ["col_1", "col_2"]
+
+        self.df = pd.DataFrame(
+            np.random.rand(len(index), 2), index=index, columns=column_names
+        )
+
+        self.arg_df = pd.DataFrame(
+            np.random.randint(1, 10, (len(level_0_names), 2)),
+            index=level_0_names,
+            columns=column_names,
+        )
+
+    def time_binary_op_multiindex(self, func):
+        getattr(self.df, func)(self.arg_df, level=0)
+
+
from .pandas_vb_common import setup # noqa: F401 isort:skip
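
For orientation, the operation this benchmark times can be reproduced on a tiny frame. The sketch below is not part of the commit: the sizes, values, and the names `dates` and `result` are made up for illustration. It only assumes what the benchmark itself relies on, namely that a binary method such as `DataFrame.add` called with `level=0` aligns the flat index of the argument against level 0 of the `MultiIndex`.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the benchmark setup (2 groups x 3 dates).
level_0_names = ["a", "b"]
dates = pd.date_range("2020-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_product([level_0_names, dates])
column_names = ["col_1", "col_2"]

df = pd.DataFrame(np.ones((len(index), 2)), index=index, columns=column_names)
arg_df = pd.DataFrame([[1, 2], [10, 20]], index=level_0_names, columns=column_names)

# Same call shape as time_binary_op_multiindex with func="add":
# each row of arg_df is applied to every row of df that shares its level-0 label.
result = getattr(df, "add")(arg_df, level=0)
print(result)
```

Every row under `"a"` receives `[1, 2]` and every row under `"b"` receives `[10, 20]`; the benchmark runs the same pattern on a much larger frame.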

doc/source/reference/offset_frequency.rst

+7 -34
@@ -134,6 +134,7 @@ Methods
 .. autosummary::
    :toctree: api/

+    CustomBusinessDay.apply_index
     CustomBusinessDay.apply
     CustomBusinessDay.copy
     CustomBusinessDay.isAnchored
@@ -381,40 +382,6 @@ Methods
     CustomBusinessMonthBegin.is_on_offset
     CustomBusinessMonthBegin.__call__

-SemiMonthOffset
----------------
-.. autosummary::
-   :toctree: api/
-
-    SemiMonthOffset
-
-Properties
-~~~~~~~~~~
-.. autosummary::
-   :toctree: api/
-
-    SemiMonthOffset.freqstr
-    SemiMonthOffset.kwds
-    SemiMonthOffset.name
-    SemiMonthOffset.nanos
-    SemiMonthOffset.normalize
-    SemiMonthOffset.rule_code
-    SemiMonthOffset.n
-
-Methods
-~~~~~~~
-.. autosummary::
-   :toctree: api/
-
-    SemiMonthOffset.apply
-    SemiMonthOffset.apply_index
-    SemiMonthOffset.copy
-    SemiMonthOffset.isAnchored
-    SemiMonthOffset.onOffset
-    SemiMonthOffset.is_anchored
-    SemiMonthOffset.is_on_offset
-    SemiMonthOffset.__call__
-
 SemiMonthEnd
 ------------
 .. autosummary::
@@ -434,6 +401,7 @@ Properties
     SemiMonthEnd.normalize
     SemiMonthEnd.rule_code
     SemiMonthEnd.n
+    SemiMonthEnd.day_of_month

 Methods
 ~~~~~~~
@@ -468,6 +436,7 @@ Properties
     SemiMonthBegin.normalize
     SemiMonthBegin.rule_code
     SemiMonthBegin.n
+    SemiMonthBegin.day_of_month

 Methods
 ~~~~~~~
@@ -502,6 +471,7 @@ Properties
     Week.normalize
     Week.rule_code
     Week.n
+    Week.weekday

 Methods
 ~~~~~~~
@@ -536,6 +506,7 @@ Properties
     WeekOfMonth.normalize
     WeekOfMonth.rule_code
     WeekOfMonth.n
+    WeekOfMonth.week

 Methods
 ~~~~~~~
@@ -571,6 +542,7 @@ Properties
     LastWeekOfMonth.rule_code
     LastWeekOfMonth.n
     LastWeekOfMonth.weekday
+    LastWeekOfMonth.week

 Methods
 ~~~~~~~
@@ -922,6 +894,7 @@ Properties
     FY5253Quarter.normalize
     FY5253Quarter.rule_code
     FY5253Quarter.n
+    FY5253Quarter.qtr_with_extra_week
     FY5253Quarter.startingMonth
     FY5253Quarter.variation
     FY5253Quarter.weekday

doc/source/user_guide/text.rst

+23
@@ -63,6 +63,29 @@ Or ``astype`` after the ``Series`` or ``DataFrame`` is created
    s
    s.astype("string")

+
+.. versionchanged:: 1.1.0
+
+You can also use :class:`StringDtype`/``"string"`` as the dtype on non-string data and
+it will be converted to ``string`` dtype:
+
+.. ipython:: python
+
+   s = pd.Series(['a', 2, np.nan], dtype="string")
+   s
+   type(s[1])
+
+or convert from existing pandas data:
+
+.. ipython:: python
+
+   s1 = pd.Series([1, 2, np.nan], dtype="Int64")
+   s1
+   s2 = s1.astype("string")
+   s2
+   type(s2[0])
+
+
 .. _text.differences:

Behavior differences
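
The documentation added in this hunk shows the calls but not their results. As a rough, standalone check (not part of the commit, and assuming a pandas version of 1.1.0 or later, where this behavior was introduced), the following sketch confirms that non-string values end up stored as `str`:

```python
import numpy as np
import pandas as pd

# Mixed, non-string input declared directly with the "string" dtype.
s = pd.Series(["a", 2, np.nan], dtype="string")
print(type(s[1]))  # the integer 2 is stored as a str

# Converting existing nullable-integer data to the "string" dtype.
s1 = pd.Series([1, 2, np.nan], dtype="Int64")
s2 = s1.astype("string")
print(s2.tolist())
```

Missing values stay missing in both cases; only the non-missing entries are converted to text.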

doc/source/user_guide/timeseries.rst

+1
@@ -793,6 +793,7 @@ You may obtain the year, week and day components of the ISO year from the ISO 86
 .. ipython:: python

    idx = pd.date_range(start='2019-12-29', freq='D', periods=4)
+   idx.isocalendar()
    idx.to_series().dt.isocalendar()

.. _timeseries.offsets:
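
To make the added `idx.isocalendar()` line concrete, here is a small sketch (not part of the commit) that runs the same call and cross-checks it against the standard library's `datetime.date.isocalendar`. The names `iso`, `expected` and `weeks` are illustrative. The dates straddle a year boundary on purpose: 2019-12-29 is a Sunday in ISO week 52 of 2019, while the following Monday starts ISO week 1 of 2020.

```python
import pandas as pd

idx = pd.date_range(start="2019-12-29", freq="D", periods=4)

# DataFrame with one row per timestamp and the columns year, week, day.
iso = idx.isocalendar()
print(iso)

# Cross-check with the standard library, one (year, week, day) tuple per date.
expected = [tuple(ts.date().isocalendar()) for ts in idx]

# The Series accessor gives the same values; the whatsnew entry in this commit
# recommends .isocalendar().week as the replacement for the deprecated .week.
weeks = idx.to_series().dt.isocalendar().week
```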

doc/source/whatsnew/v1.1.0.rst

+76 -2
@@ -13,6 +13,24 @@ including other versions of pandas.
 Enhancements
 ~~~~~~~~~~~~

+.. _whatsnew_110.astype_string:
+
+All dtypes can now be converted to ``StringDtype``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Previously, declaring or converting to :class:`StringDtype` was in general only possible if the data was already only ``str`` or nan-like (:issue:`31204`).
+:class:`StringDtype` now works in all situations where ``astype(str)`` or ``dtype=str`` work:
+
+For example, the below now works:
+
+.. ipython:: python
+
+   ser = pd.Series([1, "abc", np.nan], dtype="string")
+   ser
+   ser[0]
+   pd.Series([1, 2, np.nan], dtype="Int64").astype("string")
+
+
 .. _whatsnew_110.period_index_partial_string_slicing:

 Nonmonotonic PeriodIndex Partial String Slicing
@@ -209,7 +227,7 @@ Other enhancements
 - :class:`Series.str` now has a `fullmatch` method that matches a regular expression against the entire string in each row of the series, similar to `re.fullmatch` (:issue:`32806`).
 - :meth:`DataFrame.sample` will now also allow array-like and BitGenerator objects to be passed to ``random_state`` as seeds (:issue:`32503`)
 - :meth:`MultiIndex.union` will now raise `RuntimeWarning` if the object inside are unsortable, pass `sort=False` to suppress this warning (:issue:`33015`)
-- :class:`Series.dt` and :class:`DatatimeIndex` now have an `isocalendar` method that returns a :class:`DataFrame` with year, week, and day calculated according to the ISO 8601 calendar (:issue:`33206`).
+- :class:`Series.dt` and :class:`DatatimeIndex` now have an `isocalendar` method that returns a :class:`DataFrame` with year, week, and day calculated according to the ISO 8601 calendar (:issue:`33206`, :issue:`34392`).
 - The :meth:`DataFrame.to_feather` method now supports additional keyword
   arguments (e.g. to set the compression) that are added in pyarrow 0.17
   (:issue:`33422`).
@@ -565,6 +583,53 @@ Assignment to multiple columns of a :class:`DataFrame` when some of the columns
    df[['a', 'c']] = 1
    df

+.. _whatsnew_110.api_breaking.groupby_consistency:
+
+Consistency across groupby reductions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Using :meth:`DataFrame.groupby` with ``as_index=True`` and the aggregation ``nunique`` would include the grouping column(s) in the columns of the result. Now the grouping column(s) only appear in the index, consistent with other reductions. (:issue:`32579`)
+
+.. ipython:: python
+
+   df = pd.DataFrame({"a": ["x", "x", "y", "y"], "b": [1, 1, 2, 3]})
+   df
+
+*Previous behavior*:
+
+.. code-block:: ipython
+
+   In [3]: df.groupby("a", as_index=True).nunique()
+   Out[4]:
+      a  b
+   a
+   x  1  1
+   y  1  2
+
+*New behavior*:
+
+.. ipython:: python
+
+   df.groupby("a", as_index=True).nunique()
+
+Using :meth:`DataFrame.groupby` with ``as_index=False`` and the function ``idxmax``, ``idxmin``, ``mad``, ``nunique``, ``sem``, ``skew``, or ``std`` would modify the grouping column. Now the grouping column remains unchanged, consistent with other reductions. (:issue:`21090`, :issue:`10355`)
+
+*Previous behavior*:
+
+.. code-block:: ipython
+
+   In [3]: df.groupby("a", as_index=False).nunique()
+   Out[4]:
+      a  b
+   0  1  1
+   1  1  2
+
+*New behavior*:
+
+.. ipython:: python
+
+   df.groupby("a", as_index=False).nunique()
+
 .. _whatsnew_110.deprecations:

Deprecations
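
The "New behavior" blocks in the groupby consistency note of the hunk above are rendered only at documentation build time, so their output is not visible in the diff. The sketch below (not part of the commit, with the illustrative names `by_index` and `by_column`) reruns both calls on the same frame and assumes a pandas version that includes this change:

```python
import pandas as pd

df = pd.DataFrame({"a": ["x", "x", "y", "y"], "b": [1, 1, 2, 3]})

# as_index=True: the grouping column appears only in the index.
by_index = df.groupby("a", as_index=True).nunique()
print(by_index)

# as_index=False: the grouping column keeps its labels instead of being
# replaced by its own unique count.
by_column = df.groupby("a", as_index=False).nunique()
print(by_column)
```

In both results column `b` holds the unique counts 1 and 2. The only difference is whether the labels `x` and `y` sit in the index or in a regular column `a`.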
@@ -590,6 +655,9 @@ Deprecations

 - :func:`pandas.api.types.is_categorical` is deprecated and will be removed in a future version; use `:func:pandas.api.types.is_categorical_dtype` instead (:issue:`33385`)
 - :meth:`Index.get_value` is deprecated and will be removed in a future version (:issue:`19728`)
+- :meth:`Series.dt.week` and `Series.dt.weekofyear` are deprecated and will be removed in a future version, use :meth:`Series.dt.isocalendar().week` instead (:issue:`33595`)
+- :meth:`DatetimeIndex.week` and `DatetimeIndex.weekofyear` are deprecated and will be removed in a future version, use :meth:`DatetimeIndex.isocalendar().week` instead (:issue:`33595`)
+- :meth:`DatetimeArray.week` and `DatetimeArray.weekofyear` are deprecated and will be removed in a future version, use :meth:`DatetimeArray.isocalendar().week` instead (:issue:`33595`)
 - :meth:`DateOffset.__call__` is deprecated and will be removed in a future version, use ``offset + other`` instead (:issue:`34171`)
 - Indexing an :class:`Index` object with a float key is deprecated, and will
   raise an ``IndexError`` in the future. You can manually convert to an integer key
@@ -621,6 +689,7 @@ Performance improvements
 - Performance improvement in reductions (sum, prod, min, max) for nullable (integer and boolean) dtypes (:issue:`30982`, :issue:`33261`, :issue:`33442`).
 - Performance improvement in arithmetic operations between two :class:`DataFrame` objects (:issue:`32779`)
 - Performance improvement in :class:`pandas.core.groupby.RollingGroupby` (:issue:`34052`)
+- Performance improvement in arithmetic operations (sub, add, mul, div) for MultiIndex (:issue:`34297`)

 .. ---------------------------------------------------------------------------

@@ -802,6 +871,7 @@ I/O
 - Bug in :meth:`~DataFrame.read_feather` was raising an `ArrowIOError` when reading an s3 or http file path (:issue:`29055`)
 - Bug in :meth:`read_parquet` was raising a ``FileNotFoundError`` when passed an s3 directory path. (:issue:`26388`)
 - Bug in :meth:`~DataFrame.to_parquet` was throwing an ``AttributeError`` when writing a partitioned parquet file to s3 (:issue:`27596`)
+- Bug in :meth:`~DataFrame.to_excel` could not handle the column name `render` and was raising an ``KeyError`` (:issue:`34331`)
 - Bug in :meth:`~DataFrame.to_json` with 'table' orient was writting wrong index field name for MultiIndex Dataframe with a single level. (:issue:`29928`)

 Plotting
@@ -833,7 +903,10 @@ Groupby/resample/rolling
 - Bug in :meth:`Series.groupby` would raise ``ValueError`` when grouping by :class:`PeriodIndex` level (:issue:`34010`)
 - Bug in :meth:`GroupBy.agg`, :meth:`GroupBy.transform`, and :meth:`GroupBy.resample` where subclasses are not preserved (:issue:`28330`)
 - Bug in :meth:`GroupBy.rolling.apply` ignores args and kwargs parameters (:issue:`33433`)
-- Bug in :meth:`DataFrameGroupby.std` and :meth:`DataFrameGroupby.sem` would modify grouped-by columns when ``as_index=False`` (:issue:`10355`)
+- Bug in :meth:`core.groupby.DataFrameGroupBy.apply` where the output index shape for functions returning a DataFrame which is equally indexed
+  to the input DataFrame is inconsistent. An internal heuristic to detect index mutation would behave differently for equal but not identical
+  indices. In particular, the result index shape might change if a copy of the input would be returned.
+  The behaviour now is consistent, independent of internal heuristics. (:issue:`31612`, :issue:`14927`, :issue:`13056`)

 Reshaping
 ^^^^^^^^^
@@ -863,6 +936,7 @@ Reshaping
 - Bug in :func:`cut` raised an error when non-unique labels (:issue:`33141`)
 - Bug in :meth:`DataFrame.replace` casts columns to ``object`` dtype if items in ``to_replace`` not in values (:issue:`32988`)
 - Ensure only named functions can be used in :func:`eval()` (:issue:`32460`)
+- Fixed bug in :func:`melt` where melting MultiIndex columns with ``col_level`` > 0 would raise a ``KeyError`` on ``id_vars`` (:issue:`34129`)

 Sparse
 ^^^^^^

pandas/_libs/reduction.pyx

+1 -1
@@ -502,7 +502,7 @@ def apply_frame_axis0(object frame, object f, object names,
 # Need to infer if low level index slider will cause segfaults
 require_slow_apply = i == 0 and piece is chunk
 try:
-    if piece.index is not chunk.index:
+    if not piece.index.equals(chunk.index):
         mutated = True
 except AttributeError:
# `piece` might not have an index, could be e.g. an int
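
The one-line change above replaces an identity test with an equality test. The sketch below is a plain-Python illustration of that distinction, not the Cython code path itself; `chunk_index`, `piece_index` and `changed_index` are hypothetical stand-ins for `chunk.index` and `piece.index`. It shows why an applied function that returns a copy of its input was flagged as mutated under the old check but is not under the new one.

```python
import pandas as pd

chunk_index = pd.Index([10, 11, 12])
# A copy holds the same labels but is a distinct Python object.
piece_index = chunk_index.copy()

# Old check: identity. The copy counts as "mutated" although nothing changed.
old_mutated = piece_index is not chunk_index

# New check: element-wise equality. The copy is no longer flagged.
new_mutated = not piece_index.equals(chunk_index)

# An index with genuinely different labels is flagged by the new check.
changed_index = pd.Index([10, 11, 99])
changed_mutated = not changed_index.equals(chunk_index)

print(old_mutated, new_mutated, changed_mutated)
```

This matches the whatsnew entry in the same commit, which describes the old heuristic as behaving differently for equal but not identical indices.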
