DOC/TST: test for deprecation in .agg

jreback · jreback · commit 83f0499ea2c3 · 2017-04-14T09:27:38.000-04:00
additional doc updates
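
The change to pandas/core/base.py moves the repeated warnings.warn call into one helper with a configurable stacklevel. A minimal stdlib-only sketch of that pattern follows. The names `_warn_renaming` and `aggregate` are hypothetical stand-ins, not the pandas implementation, and the stack depths are chosen for this toy call chain rather than copied from the diff:

```python
import warnings


def _warn_renaming(level=2):
    # Hypothetical stand-in for the nested helper in the diff: a single
    # place that owns the message, the category, and the stack depth.
    warnings.warn(
        "using a dict with renaming is deprecated and will be removed "
        "in a future version",
        FutureWarning, stacklevel=level)


def aggregate(spec):
    # Toy entry point (not pandas): warn when any value is itself a dict,
    # i.e. the {'A': {'foo': 'min'}} nested-renaming form.
    if any(isinstance(v, dict) for v in spec.values()):
        # stacklevel=3 skips _warn_renaming and aggregate, so the warning
        # is attributed to the code that called aggregate.
        _warn_renaming(level=3)
    return sorted(spec)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    aggregate({'A': {'foo': 'min'}})   # nested renaming: warns
    aggregate({'A': 'min'})            # plain dict: silent

messages = [str(w.message) for w in caught]
categories = [w.category for w in caught]
```

Keeping the depth as a parameter lets a deeper call path add to it, which is roughly what the `4 + (_level or 0)` call in the diff appears to do.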
diff --git a/doc/source/basics.rst b/doc/source/basics.rst
@@ -843,10 +843,11 @@ Aggregation API
 .. versionadded:: 0.20.0
 
 The aggregation API allows one to express possibly multiple aggregation operations in a single concise way.
-This API is similar across pandas objects, :ref:`groupby aggregates <groupby.aggregate>`,
-:ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
+This API is similar across pandas objects; see the :ref:`groupby API <groupby.aggregate>`, the
+:ref:`window functions API <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
+The entry point for aggregation is the method :meth:`~DataFrame.aggregate`, or the alias :meth:`~DataFrame.agg`.
 
-We will use a similar starting frame from above.
+We will use a similar starting frame from above:
 
 .. ipython:: python
 
@@ -855,8 +856,8 @@ We will use a similar starting frame from above.
    tsdf.iloc[3:7] = np.nan
    tsdf
 
-Using a single function is equivalent to ``.apply``; You can also pass named methods as strings.
-This will return a Series of the output.
+Using a single function is equivalent to :meth:`~DataFrame.apply`; you can also pass named methods as strings.
+These will return a ``Series`` of the aggregated output:
 
 .. ipython:: python
 
@@ -867,72 +868,68 @@ This will return a Series of the output.
    # these are equivalent to a ``.sum()`` because we are aggregating on a single function
    tsdf.sum()
 
-On a Series this will result in a scalar value
+A single aggregation on a ``Series`` will result in a scalar value:
 
 .. ipython:: python
 
    tsdf.A.agg('sum')
 
 
-Aggregating multiple functions at once
-++++++++++++++++++++++++++++++++++++++
+Aggregating with multiple functions
++++++++++++++++++++++++++++++++++++
 
-You can pass arguments as a list. The results of each of the passed functions will be a row in the resultant DataFrame.
+You can pass multiple aggregation arguments as a list.
+The results of each of the passed functions will be a row in the resultant ``DataFrame``.
 These are naturally named from the aggregation function.
 
 .. ipython:: python
 
    tsdf.agg(['sum'])
 
-Multiple functions yield multiple rows.
+Multiple functions yield multiple rows:
 
 .. ipython:: python
 
    tsdf.agg(['sum', 'mean'])
 
-On a Series, multiple functions return a Series, indexed by the function names.
+On a ``Series``, multiple functions return a ``Series``, indexed by the function names:
 
 .. ipython:: python
 
    tsdf.A.agg(['sum', 'mean'])
 
-
-Aggregating with a dict of functions
-++++++++++++++++++++++++++++++++++++
-
-Passing a dictionary of column name to function or list of functions, to ``DataFame.agg``
-allows you to customize which functions are applied to which columns.
+Passing a ``lambda`` function will yield a ``<lambda>`` named row:
 
 .. ipython:: python
 
-   tsdf.agg({'A': 'mean', 'B': 'sum'})
+   tsdf.A.agg(['sum', lambda x: x.mean()])
 
-Passing a list-like will generate a DataFrame output. You will get a matrix-like output
-of all of the aggregators; some may be missing values.
+Passing a named function will yield that name for the row:
 
 .. ipython:: python
 
-   tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
-
-For a Series, you can pass a dict. You will get back a MultiIndex Series; The outer level will
-be the keys, the inner the name of the functions.
+   def mymean(x):
+      return x.mean()
 
-.. ipython:: python
+   tsdf.A.agg(['sum', mymean])
 
-   tsdf.A.agg({'foo': ['sum', 'mean']})
+Aggregating with a dict
++++++++++++++++++++++++
 
-Alternatively, using multiple dictionaries, you can have renamed elements with the aggregation
+Passing a dictionary of column names to a scalar or a list of scalars, to ``DataFrame.agg``
+allows you to customize which functions are applied to which columns.
 
 .. ipython:: python
 
-    tsdf.A.agg({'foo': 'sum', 'bar': 'mean'})
+   tsdf.agg({'A': 'mean', 'B': 'sum'})
 
-Multiple keys will yield a MultiIndex Series. The outer level will be the keys, the inner
-the names of the functions.
+Passing a list-like will generate a ``DataFrame`` output. You will get a matrix-like output
+of all of the aggregators. The output will consist of all unique functions. Those that are
+not noted for a particular column will be ``NaN``:
 
 .. ipython:: python
 
-    tsdf.A.agg({'foo': ['sum', 'mean'], 'bar': ['min', 'max', lambda x: x.sum()+1]})
+   tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
 
 .. _basics.aggregation.mixed_dtypes:
 
@@ -980,7 +977,7 @@ Transform API
 
 .. versionadded:: 0.20.0
 
-The ``transform`` method returns an object that is indexed the same (same size)
+The :meth:`~DataFrame.transform` method returns an object that is indexed the same (same size)
 as the original. This API allows you to provide *multiple* operations at the same
 time rather than one-by-one. Its api is quite similar to the ``.agg`` API.
 
@@ -1034,8 +1031,8 @@ resulting column names will be the transforming functions.
    tsdf.A.transform([np.abs, lambda x: x+1])
 
 
-Transforming with a dict of functions
-+++++++++++++++++++++++++++++++++++++
+Transforming with a dict
+++++++++++++++++++++++++
 
 
 Passing a dict of functions will will allow selective transforming per column.
@@ -1051,14 +1048,6 @@ selective transforms.
 
    tsdf.transform({'A': np.abs, 'B': [lambda x: x+1, 'sqrt']})
 
-On a Series, passing a dict allows renaming as in ``.agg()``
-
-.. ipython:: python
-
-   tsdf.A.transform({'foo': np.abs})
-   tsdf.A.transform({'foo': np.abs, 'bar': [lambda x: x+1, 'sqrt']})
-
-
 .. _basics.elementwise:
 
 Applying Elementwise Functions
diff --git a/doc/source/computation.rst b/doc/source/computation.rst
@@ -644,10 +644,10 @@ columns if none are selected.
 
 .. _stats.aggregate.multifunc:
 
-Applying multiple functions at once
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Applying multiple functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-With windowed Series you can also pass a list or dict of functions to do
+With windowed ``Series`` you can also pass a list of functions to do
 aggregation with, outputting a DataFrame:
 
 .. ipython:: python
@@ -668,7 +668,7 @@ Applying different functions to DataFrame columns
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 By passing a dict to ``aggregate`` you can apply a different aggregation to the
-columns of a DataFrame:
+columns of a ``DataFrame``:
 
 .. ipython:: python
    :okexcept:
diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst
@@ -440,7 +440,7 @@ Aggregation
 
 Once the GroupBy object has been created, several methods are available to
 perform a computation on the grouped data. These operations are similar to the
-:ref:`aggregating API <basics.aggregate>`, :ref:`window functions <stats.aggregate>`,
+:ref:`aggregating API <basics.aggregate>`, :ref:`window functions API <stats.aggregate>`,
 and :ref:`resample API <timeseries.aggregate>`.
 
 An obvious one is aggregation via the ``aggregate`` or equivalently ``agg`` method:
diff --git a/doc/source/timeseries.rst b/doc/source/timeseries.rst
@@ -1524,7 +1524,7 @@ We can instead only resample those groups where we have points as follows:
 Aggregation
 ~~~~~~~~~~~
 
-Similar to the :ref:`aggregating API <basics.aggregate>`, :ref:`groupby aggregates <groupby.aggregate>`, and :ref:`window functions <stats.aggregate>`,
+Similar to the :ref:`aggregating API <basics.aggregate>`, :ref:`groupby aggregates API <groupby.aggregate>`, and the :ref:`window functions API <stats.aggregate>`,
 a ``Resampler`` can be selectively resampled.
 
 Resampling a ``DataFrame``, the default will be to act on all columns with the same function.
diff --git a/pandas/core/base.py b/pandas/core/base.py
@@ -470,6 +470,15 @@ def _aggregate(self, arg, *args, **kwargs):
 
             obj = self._selected_obj
 
+            def nested_renaming_depr(level=4):
+                # deprecation of nested renaming
+                # GH 15931
+                warnings.warn(
+                    ("using a dict with renaming "
+                     "is deprecated and will be removed in a future "
+                     "version"),
+                    FutureWarning, stacklevel=level)
+
             # if we have a dict of any non-scalars
             # eg. {'A' : ['mean']}, normalize all to
             # be list-likes
@@ -498,14 +507,10 @@ def _aggregate(self, arg, *args, **kwargs):
                             raise SpecificationError('cannot perform renaming '
                                                      'for {0} with a nested '
                                                      'dictionary'.format(k))
+                        nested_renaming_depr(4 + (_level or 0))
 
-                        # deprecation of nested renaming
-                        # GH 15931
-                        warnings.warn(
-                            ("using a dict with renaming "
-                             "is deprecated and will be removed in a future "
-                             "version"),
-                            FutureWarning, stacklevel=4)
+                    elif isinstance(obj, ABCSeries):
+                        nested_renaming_depr()
 
                 arg = new_arg
 
@@ -515,11 +520,7 @@ def _aggregate(self, arg, *args, **kwargs):
                 keys = list(compat.iterkeys(arg))
                 if (isinstance(obj, ABCDataFrame) and
                         len(obj.columns.intersection(keys)) != len(keys)):
-                    warnings.warn(
-                        ("using a dict with renaming "
-                         "is deprecated and will be removed in a future "
-                         "version"),
-                        FutureWarning, stacklevel=4)
+                    nested_renaming_depr()
 
             from pandas.tools.concat import concat
 
diff --git a/pandas/tests/frame/test_apply.py b/pandas/tests/frame/test_apply.py
@@ -563,6 +563,15 @@ def test_demo(self):
                              index=['max', 'min', 'sum'])
         tm.assert_frame_equal(result.reindex_like(expected), expected)
 
+    def test_agg_dict_nested_renaming_depr(self):
+
+        df = pd.DataFrame({'A': range(5), 'B': 5})
+
+        # nested renaming
+        with tm.assert_produces_warning(FutureWarning):
+            df.agg({'A': {'foo': 'min'},
+                    'B': {'bar': 'max'}})
+
     def test_agg_reduce(self):
         # all reducers
         expected = zip_frames(self.frame.mean().to_frame(),
diff --git a/pandas/tests/groupby/test_aggregate.py b/pandas/tests/groupby/test_aggregate.py
@@ -310,12 +310,14 @@ def test_agg_dict_renaming_deprecation(self):
                            'B': range(5),
                            'C': range(5)})
 
-        with tm.assert_produces_warning(FutureWarning) as w:
+        with tm.assert_produces_warning(FutureWarning,
+                                        check_stacklevel=False) as w:
             df.groupby('A').agg({'B': {'foo': ['sum', 'max']},
                                  'C': {'bar': ['count', 'min']}})
             assert "using a dict with renaming" in str(w[0].message)
 
-        with tm.assert_produces_warning(FutureWarning):
+        with tm.assert_produces_warning(FutureWarning,
+                                        check_stacklevel=False):
             df.groupby('A')[['B', 'C']].agg({'ma': 'max'})
 
         with tm.assert_produces_warning(FutureWarning) as w:
diff --git a/pandas/tests/groupby/test_value_counts.py b/pandas/tests/groupby/test_value_counts.py
@@ -7,6 +7,7 @@
 from pandas import MultiIndex, DataFrame, Series, date_range
 
 
+@pytest.mark.slow
 @pytest.mark.parametrize("n,m", product((100, 1000), (5, 20)))
 def test_series_groupby_value_counts(n, m):
     np.random.seed(1234)
diff --git a/pandas/tests/series/test_apply.py b/pandas/tests/series/test_apply.py
@@ -139,6 +139,14 @@ def f(x):
         exp = pd.Series(['Asia/Tokyo'] * 25, name='XX')
         tm.assert_series_equal(result, exp)
 
+    def test_apply_dict_depr(self):
+
+        tsdf = pd.DataFrame(np.random.randn(10, 3),
+                            columns=['A', 'B', 'C'],
+                            index=pd.date_range('1/1/2000', periods=10))
+        with tm.assert_produces_warning(FutureWarning):
+            tsdf.A.agg({'foo': ['sum', 'mean']})
+
 
 class TestSeriesAggregate(TestData, tm.TestCase):
 
@@ -225,7 +233,10 @@ def test_demo(self):
         expected = Series([0], index=['foo'], name='series')
         tm.assert_series_equal(result, expected)
 
-        result = s.agg({'foo': ['min', 'max']})
+        # nested renaming
+        with tm.assert_produces_warning(FutureWarning):
+            result = s.agg({'foo': ['min', 'max']})
+
         expected = DataFrame(
             {'foo': [0, 5]},
             index=['min', 'max']).unstack().rename('series')
@@ -234,7 +245,9 @@ def test_demo(self):
     def test_multiple_aggregators_with_dict_api(self):
 
         s = Series(range(6), dtype='int64', name='series')
-        result = s.agg({'foo': ['min', 'max'], 'bar': ['sum', 'mean']})
+        # nested renaming
+        with tm.assert_produces_warning(FutureWarning):
+            result = s.agg({'foo': ['min', 'max'], 'bar': ['sum', 'mean']})
 
         expected = DataFrame(
             {'foo': [5.0, np.nan, 0.0, np.nan],