Commit b081341

DOC: fixed merge conflicts
2 parents: b2f2bb6 + cf12e67

File tree

9 files changed: +203 -36 lines

.github/actions/setup-conda/action.yml (+2)

@@ -9,6 +9,8 @@ runs:
     - name: Install ${{ inputs.environment-file }}
       uses: mamba-org/setup-micromamba@v1
       with:
+        # Pinning to avoid 2.0 failures
+        micromamba-version: '1.5.10-0'
         environment-file: ${{ inputs.environment-file }}
         environment-name: test
         condarc-file: ci/.condarc

ci/code_checks.sh (-3)

@@ -97,7 +97,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.Series.dt.unit GL08" \
         -i "pandas.Series.pad PR01,SA01" \
         -i "pandas.Series.sparse.from_coo PR07,SA01" \
-        -i "pandas.Series.sparse.npoints SA01" \
         -i "pandas.Timedelta.max PR02" \
         -i "pandas.Timedelta.min PR02" \
         -i "pandas.Timedelta.resolution PR02" \

@@ -128,8 +127,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.arrays.SparseArray PR07,SA01" \
         -i "pandas.arrays.TimedeltaArray PR07,SA01" \
         -i "pandas.core.groupby.DataFrameGroupBy.__iter__ RT03,SA01" \
-        -i "pandas.core.groupby.DataFrameGroupBy.agg RT03" \
-        -i "pandas.core.groupby.DataFrameGroupBy.aggregate RT03" \
         -i "pandas.core.groupby.DataFrameGroupBy.boxplot PR07,RT03,SA01" \
         -i "pandas.core.groupby.DataFrameGroupBy.get_group RT03,SA01" \
         -i "pandas.core.groupby.DataFrameGroupBy.groups SA01" \

doc/source/development/contributing.rst (+3 -3)

@@ -305,15 +305,15 @@ It is important to periodically update your local ``main`` branch with updates from
 branch and update your development environment to reflect any changes to the various packages that
 are used during development.

-If using :ref:`mamba <contributing.mamba>`, run:
+If using :ref:`conda <contributing.conda>`, run:

 .. code-block:: shell

     git checkout main
     git fetch upstream
     git merge upstream/main
-    mamba activate pandas-dev
-    mamba env update -f environment.yml --prune
+    conda activate pandas-dev
+    conda env update -f environment.yml --prune

 If using :ref:`pip <contributing.pip>` , do:

doc/source/development/contributing_codebase.rst (+1 -1)

@@ -244,7 +244,7 @@ in your python environment.

 .. warning::

-    * Please be aware that the above commands will use the current python environment. If your python packages are older/newer than those installed by the pandas CI, the above commands might fail. This is often the case when the ``mypy`` or ``numpy`` versions do not match. Please see :ref:`how to setup the python environment <contributing.mamba>` or select a `recently succeeded workflow <https://github.com/pandas-dev/pandas/actions/workflows/code-checks.yml?query=branch%3Amain+is%3Asuccess>`_, select the "Docstring validation, typing, and other manual pre-commit hooks" job, then click on "Set up Conda" and "Environment info" to see which versions the pandas CI installs.
+    * Please be aware that the above commands will use the current python environment. If your python packages are older/newer than those installed by the pandas CI, the above commands might fail. This is often the case when the ``mypy`` or ``numpy`` versions do not match. Please see :ref:`how to setup the python environment <contributing.conda>` or select a `recently succeeded workflow <https://github.com/pandas-dev/pandas/actions/workflows/code-checks.yml?query=branch%3Amain+is%3Asuccess>`_, select the "Docstring validation, typing, and other manual pre-commit hooks" job, then click on "Set up Conda" and "Environment info" to see which versions the pandas CI installs.

 .. _contributing.ci:

doc/source/development/contributing_environment.rst (+11 -12)

@@ -43,17 +43,17 @@ and consult the ``Linux`` instructions below.

 **macOS**

-To use the :ref:`mamba <contributing.mamba>`-based compilers, you will need to install the
+To use the :ref:`conda <contributing.conda>`-based compilers, you will need to install the
 Developer Tools using ``xcode-select --install``.

 If you prefer to use a different compiler, general information can be found here:
 https://devguide.python.org/setup/#macos

 **Linux**

-For Linux-based :ref:`mamba <contributing.mamba>` installations, you won't have to install any
-additional components outside of the mamba environment. The instructions
-below are only needed if your setup isn't based on mamba environments.
+For Linux-based :ref:`conda <contributing.conda>` installations, you won't have to install any
+additional components outside of the conda environment. The instructions
+below are only needed if your setup isn't based on conda environments.

 Some Linux distributions will come with a pre-installed C compiler. To find out
 which compilers (and versions) are installed on your system::

@@ -82,19 +82,18 @@ Before we begin, please:
 * Make sure that you have :any:`cloned the repository <contributing.forking>`
 * ``cd`` to the pandas source directory you just created with the clone command

-.. _contributing.mamba:
+.. _contributing.conda:

-Option 1: using mamba (recommended)
+Option 1: using conda (recommended)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-* Install miniforge to get `mamba <https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html>`_
-* Make sure your mamba is up to date (``mamba update mamba``)
-* Create and activate the ``pandas-dev`` mamba environment using the following commands:
+* Install miniforge to get `conda <https://github.com/conda-forge/miniforge?tab=readme-ov-file#download>`_
+* Create and activate the ``pandas-dev`` conda environment using the following commands:

-.. code-block:: none
+.. code-block:: bash

-   mamba env create --file environment.yml
-   mamba activate pandas-dev
+   conda env create --file environment.yml
+   conda activate pandas-dev

 .. _contributing.pip:

pandas/core/arrays/sparse/array.py (+12)

@@ -708,6 +708,18 @@ def npoints(self) -> int:
         """
         The number of non- ``fill_value`` points.

+        This property returns the number of elements in the sparse series that are
+        not equal to the ``fill_value``. Sparse data structures store only the
+        non-``fill_value`` elements, reducing memory usage when the majority of
+        values are the same.
+
+        See Also
+        --------
+        Series.sparse.to_dense : Convert a Series from sparse values to dense.
+        Series.sparse.fill_value : Elements in ``data`` that are ``fill_value`` are
+            not stored.
+        Series.sparse.density : The percent of non- ``fill_value`` points, as decimal.
+
         Examples
         --------
         >>> from pandas.arrays import SparseArray
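The storage behaviour the new docstring describes can be checked directly. A minimal sketch (the input values and ``fill_value`` here are chosen purely for illustration):

```python
import pandas as pd
from pandas.arrays import SparseArray

# Only the two non-fill_value entries (1 and 2) are physically stored.
arr = SparseArray([0, 0, 1, 0, 2], fill_value=0)
print(arr.npoints)  # 2 -> number of non-fill_value points
print(arr.density)  # 0.4 -> npoints / len(arr)
```

``density`` is simply ``npoints`` expressed as a fraction of the array length, which is why the docstring cross-references it in See Also.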

pandas/core/groupby/generic.py (+174 -2)

@@ -67,7 +67,6 @@
 from pandas.core.groupby.groupby import (
     GroupBy,
     GroupByPlot,
-    _agg_template_frame,
     _transform_template,
 )
 from pandas.core.indexes.api import (

@@ -1647,8 +1646,181 @@ class DataFrameGroupBy(GroupBy[DataFrame]):
         """
     )

-    @doc(_agg_template_frame, examples=_agg_examples_doc, klass="DataFrame")
     def aggregate(self, func=None, *args, engine=None, engine_kwargs=None, **kwargs):
+        """
+        Aggregate using one or more operations.
+
+        The ``aggregate`` function allows the application of one or more aggregation
+        operations on groups of data within a DataFrameGroupBy object. It supports
+        various aggregation methods, including user-defined functions and predefined
+        functions such as 'sum', 'mean', etc.
+
+        Parameters
+        ----------
+        func : function, str, list, dict or None
+            Function to use for aggregating the data. If a function, must either
+            work when passed a DataFrame or when passed to DataFrame.apply.
+
+            Accepted combinations are:
+
+            - function
+            - string function name
+            - list of functions and/or function names, e.g. ``[np.sum, 'mean']``
+            - dict of index labels -> functions, function names or list of such.
+            - None, in which case ``**kwargs`` are used with Named Aggregation. Here the
+              output has one column for each element in ``**kwargs``. The name of the
+              column is keyword, whereas the value determines the aggregation used to
+              compute the values in the column.
+
+            Can also accept a Numba JIT function with
+            ``engine='numba'`` specified. Only passing a single function is supported
+            with this engine.
+
+            If the ``'numba'`` engine is chosen, the function must be
+            a user defined function with ``values`` and ``index`` as the
+            first and second arguments respectively in the function signature.
+            Each group's index will be passed to the user defined function
+            and optionally available for use.
+
+        *args
+            Positional arguments to pass to func.
+        engine : str, default None
+            * ``'cython'`` : Runs the function through C-extensions from cython.
+            * ``'numba'`` : Runs the function through JIT compiled code from numba.
+            * ``None`` : Defaults to ``'cython'`` or globally setting
+              ``compute.use_numba``
+
+        engine_kwargs : dict, default None
+            * For ``'cython'`` engine, there are no accepted ``engine_kwargs``
+            * For ``'numba'`` engine, the engine can accept ``nopython``, ``nogil``
+              and ``parallel`` dictionary keys. The values must either be ``True`` or
+              ``False``. The default ``engine_kwargs`` for the ``'numba'`` engine is
+              ``{'nopython': True, 'nogil': False, 'parallel': False}`` and will be
+              applied to the function
+
+        **kwargs
+            * If ``func`` is None, ``**kwargs`` are used to define the output names and
+              aggregations via Named Aggregation. See ``func`` entry.
+            * Otherwise, keyword arguments to be passed into func.
+
+        Returns
+        -------
+        DataFrame
+            Aggregated DataFrame based on the grouping and the applied aggregation
+            functions.
+
+        See Also
+        --------
+        DataFrame.groupby.apply : Apply function func group-wise
+            and combine the results together.
+        DataFrame.groupby.transform : Transforms the Series on each group
+            based on the given function.
+        DataFrame.aggregate : Aggregate using one or more operations.
+
+        Notes
+        -----
+        When using ``engine='numba'``, there will be no "fall back" behavior internally.
+        The group data and group index will be passed as numpy arrays to the JITed
+        user defined function, and no alternative execution attempts will be tried.
+
+        Functions that mutate the passed object can produce unexpected
+        behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
+        for more details.
+
+        .. versionchanged:: 1.3.0
+
+            The resulting dtype will reflect the return value of the passed ``func``,
+            see the examples below.
+
+        Examples
+        --------
+        >>> data = {
+        ...     "A": [1, 1, 2, 2],
+        ...     "B": [1, 2, 3, 4],
+        ...     "C": [0.362838, 0.227877, 1.267767, -0.562860],
+        ... }
+        >>> df = pd.DataFrame(data)
+        >>> df
+           A  B         C
+        0  1  1  0.362838
+        1  1  2  0.227877
+        2  2  3  1.267767
+        3  2  4 -0.562860
+
+        The aggregation is for each column.
+
+        >>> df.groupby("A").agg("min")
+           B         C
+        A
+        1  1  0.227877
+        2  3 -0.562860
+
+        Multiple aggregations
+
+        >>> df.groupby("A").agg(["min", "max"])
+            B             C
+          min max       min       max
+        A
+        1   1   2  0.227877  0.362838
+        2   3   4 -0.562860  1.267767
+
+        Select a column for aggregation
+
+        >>> df.groupby("A").B.agg(["min", "max"])
+           min  max
+        A
+        1    1    2
+        2    3    4
+
+        User-defined function for aggregation
+
+        >>> df.groupby("A").agg(lambda x: sum(x) + 2)
+           B         C
+        A
+        1  5  2.590715
+        2  9  2.704907
+
+        Different aggregations per column
+
+        >>> df.groupby("A").agg({"B": ["min", "max"], "C": "sum"})
+            B             C
+          min max       sum
+        A
+        1   1   2  0.590715
+        2   3   4  0.704907
+
+        To control the output names with different aggregations per column,
+        pandas supports "named aggregation"
+
+        >>> df.groupby("A").agg(
+        ...     b_min=pd.NamedAgg(column="B", aggfunc="min"),
+        ...     c_sum=pd.NamedAgg(column="C", aggfunc="sum"),
+        ... )
+           b_min     c_sum
+        A
+        1      1  0.590715
+        2      3  0.704907
+
+        - The keywords are the *output* column names
+        - The values are tuples whose first element is the column to select
+          and the second element is the aggregation to apply to that column.
+          Pandas provides the ``pandas.NamedAgg`` namedtuple with the fields
+          ``['column', 'aggfunc']`` to make it clearer what the arguments are.
+          As usual, the aggregation can be a callable or a string alias.
+
+        See :ref:`groupby.aggregate.named` for more.
+
+        .. versionchanged:: 1.3.0
+
+            The resulting dtype will reflect the return value of the aggregating
+            function.
+
+        >>> df.groupby("A")[["B"]].agg(lambda x: x.astype(float).min())
+             B
+        A
+        1  1.0
+        2  3.0
+        """
         relabeling, func, columns, order = reconstruct_func(func, **kwargs)
         func = maybe_mangle_lambdas(func)

pandas/core/groupby/groupby.py (-14)

@@ -366,15 +366,12 @@ class providing the base-class of operations.

 _agg_template_frame = """
 Aggregate using one or more operations.
-
 Parameters
 ----------
 func : function, str, list, dict or None
     Function to use for aggregating the data. If a function, must either
     work when passed a {klass} or when passed to {klass}.apply.
-
     Accepted combinations are:
-
     - function
     - string function name
     - list of functions and/or function names, e.g. ``[np.sum, 'mean']``

@@ -383,61 +380,50 @@ class providing the base-class of operations.
       output has one column for each element in ``**kwargs``. The name of the
       column is keyword, whereas the value determines the aggregation used to compute
       the values in the column.
-
     Can also accept a Numba JIT function with
     ``engine='numba'`` specified. Only passing a single function is supported
     with this engine.
-
     If the ``'numba'`` engine is chosen, the function must be
     a user defined function with ``values`` and ``index`` as the
     first and second arguments respectively in the function signature.
     Each group's index will be passed to the user defined function
     and optionally available for use.
-
 *args
     Positional arguments to pass to func.
 engine : str, default None
     * ``'cython'`` : Runs the function through C-extensions from cython.
     * ``'numba'`` : Runs the function through JIT compiled code from numba.
     * ``None`` : Defaults to ``'cython'`` or globally setting ``compute.use_numba``
-
 engine_kwargs : dict, default None
     * For ``'cython'`` engine, there are no accepted ``engine_kwargs``
     * For ``'numba'`` engine, the engine can accept ``nopython``, ``nogil``
      and ``parallel`` dictionary keys. The values must either be ``True`` or
      ``False``. The default ``engine_kwargs`` for the ``'numba'`` engine is
      ``{{'nopython': True, 'nogil': False, 'parallel': False}}`` and will be
      applied to the function
-
 **kwargs
    * If ``func`` is None, ``**kwargs`` are used to define the output names and
      aggregations via Named Aggregation. See ``func`` entry.
    * Otherwise, keyword arguments to be passed into func.
-
 Returns
 -------
 {klass}
-
 See Also
 --------
 {klass}.groupby.apply : Apply function func group-wise
     and combine the results together.
 {klass}.groupby.transform : Transforms the Series on each group
     based on the given function.
 {klass}.aggregate : Aggregate using one or more operations.
-
 Notes
 -----
 When using ``engine='numba'``, there will be no "fall back" behavior internally.
 The group data and group index will be passed as numpy arrays to the JITed
 user defined function, and no alternative execution attempts will be tried.
-
 Functions that mutate the passed object can produce unexpected
 behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
 for more details.
-
 .. versionchanged:: 1.3.0
-
     The resulting dtype will reflect the return value of the passed ``func``,
     see the examples below.
 {examples}"""
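The template above is an ordinary ``str.format``-style template (filled in via the ``@doc`` decorator that this commit drops from ``DataFrameGroupBy.aggregate``), which is why literal braces such as the default ``engine_kwargs`` dict appear doubled as ``{{...}}``. A minimal sketch of that substitution step, using a shortened stand-in for the real template:

```python
# Shortened stand-in for _agg_template_frame; the real template is far longer.
template = """Aggregate using one or more operations.

Returns
-------
{klass}

Default engine_kwargs: ``{{'nopython': True}}``
{examples}"""

# {klass} and {examples} are substituted; the doubled braces survive
# formatting as single literal braces in the rendered docstring.
doc = template.format(klass="DataFrame", examples=">>> df.groupby('A').agg('sum')")
print(doc)
```

Inlining the docstring directly into ``aggregate`` (as this commit does) removes the need for this substitution step for the DataFrame case.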

scripts/validate_unwanted_patterns.py (-1)

@@ -29,7 +29,6 @@
     "_shared_docs",
     "_new_Index",
     "_new_PeriodIndex",
-    "_agg_template_frame",
     "_pipe_template",
     "_apply_groupings_depr",
     "__main__",
