Commit 83f0499

DOC/TST: test for deprecation in .agg
additional doc updates
1 parent ca87568 commit 83f0499

9 files changed

+79
-64
lines changed

doc/source/basics.rst

+31-42
Original file line number | Diff line number | Diff line change
@@ -843,10 +843,11 @@ Aggregation API
843843
.. versionadded:: 0.20.0
844844

845845
The aggregation API allows one to express possibly multiple aggregation operations in a single concise way.
846-
This API is similar across pandas objects, :ref:`groupby aggregates <groupby.aggregate>`,
847-
:ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
846+
This API is similar across pandas objects, see :ref:`groupby API <groupby.aggregate>`, the
847+
:ref:`window functions API <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
848+
The entry point for aggregation is the method :meth:`~DataFrame.aggregate`, or the alias :meth:`~DataFrame.agg`.
848849

849-
We will use a similar starting frame from above.
850+
We will use a similar starting frame from above:
850851

851852
.. ipython:: python
852853
@@ -855,8 +856,8 @@ We will use a similar starting frame from above.
855856
tsdf.iloc[3:7] = np.nan
856857
tsdf
857858
858-
Using a single function is equivalent to ``.apply``; You can also pass named methods as strings.
859-
This will return a Series of the output.
859+
Using a single function is equivalent to :meth:`~DataFrame.apply`; you can also pass named methods as strings.
860+
These will return a ``Series`` of the aggregated output:
860861

861862
.. ipython:: python
862863
@@ -867,72 +868,68 @@ This will return a Series of the output.
867868
# these are equivalent to a ``.sum()`` because we are aggregating on a single function
868869
tsdf.sum()
869870
870-
On a Series this will result in a scalar value
871+
A single aggregation on a ``Series`` will result in a scalar value:
871872

872873
.. ipython:: python
873874
874875
tsdf.A.agg('sum')
875876
876877
877-
Aggregating multiple functions at once
878-
++++++++++++++++++++++++++++++++++++++
878+
Aggregating with multiple functions
879+
+++++++++++++++++++++++++++++++++++
879880

880-
You can pass arguments as a list. The results of each of the passed functions will be a row in the resultant DataFrame.
881+
You can pass multiple aggregation arguments as a list.
882+
The results of each of the passed functions will be a row in the resultant ``DataFrame``.
881883
These are naturally named from the aggregation function.
882884

883885
.. ipython:: python
884886
885887
tsdf.agg(['sum'])
886888
887-
Multiple functions yield multiple rows.
889+
Multiple functions yield multiple rows:
888890

889891
.. ipython:: python
890892
891893
tsdf.agg(['sum', 'mean'])
892894
893-
On a Series, multiple functions return a Series, indexed by the function names.
895+
On a ``Series``, multiple functions return a ``Series``, indexed by the function names:
894896

895897
.. ipython:: python
896898
897899
tsdf.A.agg(['sum', 'mean'])
898900
899-
900-
Aggregating with a dict of functions
901-
++++++++++++++++++++++++++++++++++++
902-
903-
Passing a dictionary of column name to function or list of functions, to ``DataFame.agg``
904-
allows you to customize which functions are applied to which columns.
901+
Passing a ``lambda`` function will yield a ``<lambda>`` named row:
905902

906903
.. ipython:: python
907904
908-
tsdf.agg({'A': 'mean', 'B': 'sum'})
905+
tsdf.A.agg(['sum', lambda x: x.mean()])
909906
910-
Passing a list-like will generate a DataFrame output. You will get a matrix-like output
911-
of all of the aggregators; some may be missing values.
907+
Passing a named function will yield that name for the row:
912908

913909
.. ipython:: python
914910
915-
tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
916-
917-
For a Series, you can pass a dict. You will get back a MultiIndex Series; The outer level will
918-
be the keys, the inner the name of the functions.
911+
def mymean(x):
912+
return x.mean()
919913
920-
.. ipython:: python
914+
tsdf.A.agg(['sum', mymean])
921915
922-
tsdf.A.agg({'foo': ['sum', 'mean']})
916+
Aggregating with a dict
917+
+++++++++++++++++++++++
923918

924-
Alternatively, using multiple dictionaries, you can have renamed elements with the aggregation
919+
Passing a dictionary of column names to a scalar or a list of scalars, to ``DataFrame.agg``
920+
allows you to customize which functions are applied to which columns.
925921

926922
.. ipython:: python
927923
928-
tsdf.A.agg({'foo': 'sum', 'bar': 'mean'})
924+
tsdf.agg({'A': 'mean', 'B': 'sum'})
929925
930-
Multiple keys will yield a MultiIndex Series. The outer level will be the keys, the inner
931-
the names of the functions.
926+
Passing a list-like will generate a ``DataFrame`` output. You will get a matrix-like output
927+
of all of the aggregators. The output will consist of all unique functions. Those that are
928+
not noted for a particular column will be ``NaN``:
932929

933930
.. ipython:: python
934931
935-
tsdf.A.agg({'foo': ['sum', 'mean'], 'bar': ['min', 'max', lambda x: x.sum()+1]})
932+
tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
936933
937934
.. _basics.aggregation.mixed_dtypes:
938935

@@ -980,7 +977,7 @@ Transform API
980977

981978
.. versionadded:: 0.20.0
982979

983-
The ``transform`` method returns an object that is indexed the same (same size)
980+
The :meth:`~DataFrame.transform` method returns an object that is indexed the same (same size)
984981
as the original. This API allows you to provide *multiple* operations at the same
985982
time rather than one-by-one. Its API is quite similar to the ``.agg`` API.
986983

@@ -1034,8 +1031,8 @@ resulting column names will be the transforming functions.
10341031
tsdf.A.transform([np.abs, lambda x: x+1])
10351032
10361033
1037-
Transforming with a dict of functions
1038-
+++++++++++++++++++++++++++++++++++++
1034+
Transforming with a dict
1035+
++++++++++++++++++++++++
10391036

10401037

10411038
Passing a dict of functions will allow selective transforming per column.
@@ -1051,14 +1048,6 @@ selective transforms.
10511048
10521049
tsdf.transform({'A': np.abs, 'B': [lambda x: x+1, 'sqrt']})
10531050
1054-
On a Series, passing a dict allows renaming as in ``.agg()``
1055-
1056-
.. ipython:: python
1057-
1058-
tsdf.A.transform({'foo': np.abs})
1059-
tsdf.A.transform({'foo': np.abs, 'bar': [lambda x: x+1, 'sqrt']})
1060-
1061-
10621051
.. _basics.elementwise:
10631052

10641053
Applying Elementwise Functions
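
Illustrative note (not part of the commit): the revised "Aggregating with a dict" text says the output contains every unique function, with ``NaN`` where a function was not requested for a column. A minimal sketch of that behaviour, assuming a pandas version where ``DataFrame.agg`` accepts a mix of a list and a single function name in the dict; the small frame ``df`` is made up for the example and is not the ``tsdf`` used in the docs.

```python
import pandas as pd

# Hypothetical toy frame; the docs use a random ``tsdf`` instead.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Column A gets two aggregations, column B gets one.
result = df.agg({"A": ["mean", "min"], "B": "sum"})

# Rows are labelled by function name; combinations that were not
# requested (e.g. "sum" for A) are filled with NaN.
print(result)
```

Because the rows are the union of all requested functions, checking for missing values with ``pd.isna`` is a reasonable way to tell "not requested" apart from a computed result.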

doc/source/computation.rst

+4-4
@@ -644,10 +644,10 @@ columns if none are selected.
644644

645645
.. _stats.aggregate.multifunc:
646646

647-
Applying multiple functions at once
648-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
647+
Applying multiple functions
648+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
649649

650-
With windowed Series you can also pass a list or dict of functions to do
650+
With windowed ``Series`` you can also pass a list of functions to do
651651
aggregation with, outputting a DataFrame:
652652

653653
.. ipython:: python
@@ -668,7 +668,7 @@ Applying different functions to DataFrame columns
668668
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
669669

670670
By passing a dict to ``aggregate`` you can apply a different aggregation to the
671-
columns of a DataFrame:
671+
columns of a ``DataFrame``:
672672

673673
.. ipython:: python
674674
:okexcept:

doc/source/groupby.rst

+1-1
@@ -440,7 +440,7 @@ Aggregation
440440

441441
Once the GroupBy object has been created, several methods are available to
442442
perform a computation on the grouped data. These operations are similar to the
443-
:ref:`aggregating API <basics.aggregate>`, :ref:`window functions <stats.aggregate>`,
443+
:ref:`aggregating API <basics.aggregate>`, :ref:`window functions API <stats.aggregate>`,
444444
and :ref:`resample API <timeseries.aggregate>`.
445445

446446
An obvious one is aggregation via the ``aggregate`` or equivalently ``agg`` method:

doc/source/timeseries.rst

+1-1
@@ -1524,7 +1524,7 @@ We can instead only resample those groups where we have points as follows:
15241524
Aggregation
15251525
~~~~~~~~~~~
15261526

1527-
Similar to the :ref:`aggregating API <basics.aggregate>`, :ref:`groupby aggregates <groupby.aggregate>`, and :ref:`window functions <stats.aggregate>`,
1527+
Similar to the :ref:`aggregating API <basics.aggregate>`, :ref:`groupby aggregates API <groupby.aggregate>`, and the :ref:`window functions API <stats.aggregate>`,
15281528
a ``Resampler`` can be selectively resampled.
15291529

15301530
Resampling a ``DataFrame``, the default will be to act on all columns with the same function.

pandas/core/base.py

+13-12
@@ -470,6 +470,15 @@ def _aggregate(self, arg, *args, **kwargs):
470470

471471
obj = self._selected_obj
472472

473+
def nested_renaming_depr(level=4):
474+
# deprecation of nested renaming
475+
# GH 15931
476+
warnings.warn(
477+
("using a dict with renaming "
478+
"is deprecated and will be removed in a future "
479+
"version"),
480+
FutureWarning, stacklevel=level)
481+
473482
# if we have a dict of any non-scalars
474483
# eg. {'A' : ['mean']}, normalize all to
475484
# be list-likes
@@ -498,14 +507,10 @@ def _aggregate(self, arg, *args, **kwargs):
498507
raise SpecificationError('cannot perform renaming '
499508
'for {0} with a nested '
500509
'dictionary'.format(k))
510+
nested_renaming_depr(4 + (_level or 0))
501511

502-
# deprecation of nested renaming
503-
# GH 15931
504-
warnings.warn(
505-
("using a dict with renaming "
506-
"is deprecated and will be removed in a future "
507-
"version"),
508-
FutureWarning, stacklevel=4)
512+
elif isinstance(obj, ABCSeries):
513+
nested_renaming_depr()
509514

510515
arg = new_arg
511516

@@ -515,11 +520,7 @@ def _aggregate(self, arg, *args, **kwargs):
515520
keys = list(compat.iterkeys(arg))
516521
if (isinstance(obj, ABCDataFrame) and
517522
len(obj.columns.intersection(keys)) != len(keys)):
518-
warnings.warn(
519-
("using a dict with renaming "
520-
"is deprecated and will be removed in a future "
521-
"version"),
522-
FutureWarning, stacklevel=4)
523+
nested_renaming_depr()
523524

524525
from pandas.tools.concat import concat
525526
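
Illustrative note (not part of the commit): the refactor above moves the repeated ``warnings.warn`` call into one helper and lets callers adjust ``stacklevel`` so the warning points at user code rather than library internals. A standard-library-only sketch of the same pattern follows. The function ``aggregate`` and the default levels are hypothetical and chosen for this shallow call stack; pandas uses ``4`` because its real call chain is deeper.

```python
import warnings


def nested_renaming_depr(level=2):
    # Stand-in for the helper in the diff. stacklevel=1 would blame this
    # function itself; each extra level moves one frame up the call stack.
    warnings.warn(
        "using a dict with renaming is deprecated and will be removed "
        "in a future version",
        FutureWarning,
        stacklevel=level,
    )


def aggregate(arg, _level=None):
    # Hypothetical entry point: nested internal calls can pass a larger
    # _level so the warning is still attributed to the user's call site.
    if any(isinstance(v, dict) for v in arg.values()):
        nested_renaming_depr(3 + (_level or 0))
    return arg


# Record warnings instead of printing them, similar in spirit to what
# a test helper such as assert_produces_warning checks.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    aggregate({"A": {"foo": "min"}})

first = caught[0]
print(first.category.__name__, "-", first.message)
```

Keeping the message and category in one helper means every deprecation site emits the same text, so a test can match on the string ``"using a dict with renaming"`` as the groupby test in this commit does.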

pandas/tests/frame/test_apply.py

+9
@@ -563,6 +563,15 @@ def test_demo(self):
563563
index=['max', 'min', 'sum'])
564564
tm.assert_frame_equal(result.reindex_like(expected), expected)
565565

566+
def test_agg_dict_nested_renaming_depr(self):
567+
568+
df = pd.DataFrame({'A': range(5), 'B': 5})
569+
570+
# nested renaming
571+
with tm.assert_produces_warning(FutureWarning):
572+
df.agg({'A': {'foo': 'min'},
573+
'B': {'bar': 'max'}})
574+
566575
def test_agg_reduce(self):
567576
# all reducers
568577
expected = zip_frames(self.frame.mean().to_frame(),

pandas/tests/groupby/test_aggregate.py

+4-2
@@ -310,12 +310,14 @@ def test_agg_dict_renaming_deprecation(self):
310310
'B': range(5),
311311
'C': range(5)})
312312

313-
with tm.assert_produces_warning(FutureWarning) as w:
313+
with tm.assert_produces_warning(FutureWarning,
314+
check_stacklevel=False) as w:
314315
df.groupby('A').agg({'B': {'foo': ['sum', 'max']},
315316
'C': {'bar': ['count', 'min']}})
316317
assert "using a dict with renaming" in str(w[0].message)
317318

318-
with tm.assert_produces_warning(FutureWarning):
319+
with tm.assert_produces_warning(FutureWarning,
320+
check_stacklevel=False):
319321
df.groupby('A')[['B', 'C']].agg({'ma': 'max'})
320322

321323
with tm.assert_produces_warning(FutureWarning) as w:

pandas/tests/groupby/test_value_counts.py

+1
@@ -7,6 +7,7 @@
77
from pandas import MultiIndex, DataFrame, Series, date_range
88

99

10+
@pytest.mark.slow
1011
@pytest.mark.parametrize("n,m", product((100, 1000), (5, 20)))
1112
def test_series_groupby_value_counts(n, m):
1213
np.random.seed(1234)

pandas/tests/series/test_apply.py

+15-2
@@ -139,6 +139,14 @@ def f(x):
139139
exp = pd.Series(['Asia/Tokyo'] * 25, name='XX')
140140
tm.assert_series_equal(result, exp)
141141

142+
def test_apply_dict_depr(self):
143+
144+
tsdf = pd.DataFrame(np.random.randn(10, 3),
145+
columns=['A', 'B', 'C'],
146+
index=pd.date_range('1/1/2000', periods=10))
147+
with tm.assert_produces_warning(FutureWarning):
148+
tsdf.A.agg({'foo': ['sum', 'mean']})
149+
142150

143151
class TestSeriesAggregate(TestData, tm.TestCase):
144152

@@ -225,7 +233,10 @@ def test_demo(self):
225233
expected = Series([0], index=['foo'], name='series')
226234
tm.assert_series_equal(result, expected)
227235

228-
result = s.agg({'foo': ['min', 'max']})
236+
# nested renaming
237+
with tm.assert_produces_warning(FutureWarning):
238+
result = s.agg({'foo': ['min', 'max']})
239+
229240
expected = DataFrame(
230241
{'foo': [0, 5]},
231242
index=['min', 'max']).unstack().rename('series')
@@ -234,7 +245,9 @@ def test_demo(self):
234245
def test_multiple_aggregators_with_dict_api(self):
235246

236247
s = Series(range(6), dtype='int64', name='series')
237-
result = s.agg({'foo': ['min', 'max'], 'bar': ['sum', 'mean']})
248+
# nested renaming
249+
with tm.assert_produces_warning(FutureWarning):
250+
result = s.agg({'foo': ['min', 'max'], 'bar': ['sum', 'mean']})
238251

239252
expected = DataFrame(
240253
{'foo': [5.0, np.nan, 0.0, np.nan],
