diff --git a/doc/source/basics.rst b/doc/source/basics.rst index 74b3dbb83ea91..0b3f2cca55518 100644 --- a/doc/source/basics.rst +++ b/doc/source/basics.rst @@ -226,11 +226,11 @@ We can also do elementwise :func:`divmod`: Missing data / operations with fill values ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In Series and DataFrame, the arithmetic functions have the option of inputting
-a *fill_value*, namely a value to substitute when at most one of the values at
-a location are missing. For example, when adding two DataFrame objects, you may
-wish to treat NaN as 0 unless both DataFrames are missing that value, in which
-case the result will be NaN (you can later replace NaN with some other value
+In Series and DataFrame, the arithmetic functions have the option of inputting
+a *fill_value*, namely a value to substitute when at most one of the values at
+a location is missing. For example, when adding two DataFrame objects, you may
+wish to treat NaN as 0 unless both DataFrames are missing that value, in which
+case the result will be NaN (you can later replace NaN with some other value
using ``fillna`` if you wish). .. ipython:: python @@ -260,8 +260,8 @@ arithmetic operations described above: df.gt(df2) df2.ne(df) -These operations produce a pandas object of the same type as the left-hand-side
-input that is of dtype ``bool``. These ``boolean`` objects can be used in
+These operations produce a pandas object of the same type as the left-hand-side
+input that is of dtype ``bool``. These ``boolean`` objects can be used in
indexing operations, see the section on :ref:`Boolean indexing`. .. 
_basics.reductions: @@ -452,7 +452,7 @@ So, for instance, to reproduce :meth:`~DataFrame.combine_first` as above: Descriptive statistics ---------------------- -There exists a large number of methods for computing descriptive statistics and +There exists a large number of methods for computing descriptive statistics and other related operations on :ref:`Series `, :ref:`DataFrame `, and :ref:`Panel `. Most of these are aggregations (hence producing a lower-dimensional result) like @@ -540,7 +540,7 @@ will exclude NAs on Series input by default: np.mean(df['one']) np.mean(df['one'].values) -:meth:`Series.nunique` will return the number of unique non-NA values in a +:meth:`Series.nunique` will return the number of unique non-NA values in a Series: .. ipython:: python @@ -852,7 +852,7 @@ Aggregation API The aggregation API allows one to express possibly multiple aggregation operations in a single concise way. This API is similar across pandas objects, see :ref:`groupby API `, the :ref:`window functions API `, and the :ref:`resample API `. -The entry point for aggregation is :meth:`DataFrame.aggregate`, or the alias +The entry point for aggregation is :meth:`DataFrame.aggregate`, or the alias :meth:`DataFrame.agg`. We will use a similar starting frame from above: @@ -864,8 +864,8 @@ We will use a similar starting frame from above: tsdf.iloc[3:7] = np.nan tsdf -Using a single function is equivalent to :meth:`~DataFrame.apply`. You can also -pass named methods as strings. These will return a ``Series`` of the aggregated +Using a single function is equivalent to :meth:`~DataFrame.apply`. You can also +pass named methods as strings. These will return a ``Series`` of the aggregated output: .. ipython:: python @@ -887,7 +887,7 @@ Single aggregations on a ``Series`` this will return a scalar value: Aggregating with multiple functions +++++++++++++++++++++++++++++++++++ -You can pass multiple aggregation arguments as a list. 
+You can pass multiple aggregation arguments as a list. The results of each of the passed functions will be a row in the resulting ``DataFrame``. These are naturally named from the aggregation function. @@ -1430,7 +1430,7 @@ Series can also be used: df.rename(columns={'one': 'foo', 'two': 'bar'}, index={'a': 'apple', 'b': 'banana', 'd': 'durian'}) -If the mapping doesn't include a column/index label, it isn't renamed. Note that +If the mapping doesn't include a column/index label, it isn't renamed. Note that extra labels in the mapping don't throw an error. .. versionadded:: 0.21.0 @@ -1740,19 +1740,26 @@ description. Sorting ------- -There are two obvious kinds of sorting that you may be interested in: sorting -by label and sorting by actual values. +Pandas supports three kinds of sorting: sorting by index labels, +sorting by column values, and sorting by a combination of both. + +.. _basics.sort_index: By Index ~~~~~~~~ -The primary method for sorting axis -labels (indexes) are the ``Series.sort_index()`` and the ``DataFrame.sort_index()`` methods. +The :meth:`Series.sort_index` and :meth:`DataFrame.sort_index` methods are +used to sort a pandas object by its index levels. .. ipython:: python + df = pd.DataFrame({'one' : pd.Series(np.random.randn(3), index=['a', 'b', 'c']), + 'two' : pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd']), + 'three' : pd.Series(np.random.randn(3), index=['b', 'c', 'd'])}) + unsorted_df = df.reindex(index=['a', 'd', 'c', 'b'], columns=['three', 'two', 'one']) + unsorted_df # DataFrame unsorted_df.sort_index() @@ -1762,20 +1769,22 @@ labels (indexes) are the ``Series.sort_index()`` and the ``DataFrame.sort_index( # Series unsorted_df['three'].sort_index() +.. _basics.sort_values: + By Values ~~~~~~~~~ -The :meth:`Series.sort_values` and :meth:`DataFrame.sort_values` are the entry points for **value** sorting (i.e. the values in a column or row). 
-:meth:`DataFrame.sort_values` can accept an optional ``by`` argument for ``axis=0``
-which will use an arbitrary vector or a column name of the DataFrame to
-determine the sort order:
+The :meth:`Series.sort_values` method is used to sort a ``Series`` by its values. The
+:meth:`DataFrame.sort_values` method is used to sort a ``DataFrame`` by its column or row values.
+The optional ``by`` parameter to :meth:`DataFrame.sort_values` may be used to specify one or more columns
+to use to determine the sort order. .. ipython:: python df1 = pd.DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]}) df1.sort_values(by='two') -The ``by`` argument can take a list of column names, e.g.:
+The ``by`` parameter can take a list of column names, e.g.: .. ipython:: python @@ -1790,6 +1799,39 @@ argument: s.sort_values() s.sort_values(na_position='first') +.. _basics.sort_indexes_and_values:
+
+By Indexes and Values
+~~~~~~~~~~~~~~~~~~~~~
+
+.. versionadded:: 0.23.0
+
+Strings passed as the ``by`` parameter to :meth:`DataFrame.sort_values` may
+refer to either columns or index level names.
+
+.. ipython:: python
+
+   # Build MultiIndex
+   idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 2),
+                                    ('b', 2), ('b', 1), ('b', 1)])
+   idx.names = ['first', 'second']
+
+   # Build DataFrame
+   df_multi = pd.DataFrame({'A': np.arange(6, 0, -1)},
+                           index=idx)
+   df_multi
+
+Sort by 'second' (index) and 'A' (column):
+
+.. ipython:: python
+
+   df_multi.sort_values(by=['second', 'A'])
+
+.. note::
+
+   If a string matches both a column name and an index level name, then a
+   warning is issued and the column takes precedence. This will result in an
+   ambiguity error in a future version. .. _basics.searchsorted: @@ -1881,7 +1923,7 @@ The main types stored in pandas objects are ``float``, ``int``, ``bool``, ``int64`` and ``int32``. See :ref:`Series with TZ ` for more detail on ``datetime64[ns, tz]`` dtypes. 
-A convenient :attr:`~DataFrame.dtypes` attribute for DataFrame returns a Series +A convenient :attr:`~DataFrame.dtypes` attribute for DataFrame returns a Series with the data type of each column. .. ipython:: python @@ -1902,8 +1944,8 @@ On a ``Series`` object, use the :attr:`~Series.dtype` attribute. dft['A'].dtype -If a pandas object contains data with multiple dtypes *in a single column*, the -dtype of the column will be chosen to accommodate all of the data types +If a pandas object contains data with multiple dtypes *in a single column*, the +dtype of the column will be chosen to accommodate all of the data types (``object`` is the most general). .. ipython:: python @@ -1941,7 +1983,7 @@ defaults ~~~~~~~~ By default integer types are ``int64`` and float types are ``float64``, -*regardless* of platform (32-bit or 64-bit). +*regardless* of platform (32-bit or 64-bit). The following will all result in ``int64`` dtypes. .. ipython:: python diff --git a/doc/source/whatsnew/v0.23.0.txt b/doc/source/whatsnew/v0.23.0.txt index 5fd7c3e217928..42ea429aae1de 100644 --- a/doc/source/whatsnew/v0.23.0.txt +++ b/doc/source/whatsnew/v0.23.0.txt @@ -62,6 +62,32 @@ levels ` documentation section. left.merge(right, on=['key1', 'key2']) +.. _whatsnew_0230.enhancements.sort_by_columns_and_levels: + +Sorting by a combination of columns and index levels +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Strings passed to :meth:`DataFrame.sort_values` as the ``by`` parameter may +now refer to either column names or index level names. This enables sorting +``DataFrame`` instances by a combination of index levels and columns without +resetting indexes. See the :ref:`Sorting by Indexes and Values +` documentation section. +(:issue:`14353`) + +.. 
ipython:: python + + # Build MultiIndex + idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 2), + ('b', 2), ('b', 1), ('b', 1)]) + idx.names = ['first', 'second'] + + # Build DataFrame + df_multi = pd.DataFrame({'A': np.arange(6, 0, -1)}, + index=idx) + df_multi + + # Sort by 'second' (index) and 'A' (column) + df_multi.sort_values(by=['second', 'A']) .. _whatsnew_0230.enhancements.ran_inf: diff --git a/pandas/core/frame.py b/pandas/core/frame.py index 9acc82b50aabf..821db3c263885 100644 --- a/pandas/core/frame.py +++ b/pandas/core/frame.py @@ -113,7 +113,15 @@ axes_single_arg="{0 or 'index', 1 or 'columns'}", optional_by=""" by : str or list of str - Name or list of names which refer to the axis items.""", + Name or list of names to sort by. + + - if `axis` is 0 or `'index'` then `by` may contain index + levels and/or column labels + - if `axis` is 1 or `'columns'` then `by` may contain column + levels and/or index labels + + .. versionchanged:: 0.23.0 + Allow specifying index or column level names.""", versionadded_to_excel='', optional_labels="""labels : array-like, optional New labels / index to conform the axis specified by 'axis' to.""", @@ -3623,7 +3631,7 @@ def sort_values(self, by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last'): inplace = validate_bool_kwarg(inplace, 'inplace') axis = self._get_axis_number(axis) - other_axis = 0 if axis == 1 else 1 + stacklevel = 2 # Number of stack levels from df.sort_values if not isinstance(by, list): by = [by] @@ -3635,10 +3643,8 @@ def sort_values(self, by, axis=0, ascending=True, inplace=False, keys = [] for x in by: - k = self.xs(x, axis=other_axis).values - if k.ndim == 2: - raise ValueError('Cannot sort by duplicate column %s' % - str(x)) + k = self._get_label_or_level_values(x, axis=axis, + stacklevel=stacklevel) keys.append(k) indexer = lexsort_indexer(keys, orders=ascending, na_position=na_position) @@ -3647,17 +3653,9 @@ def sort_values(self, by, axis=0, ascending=True, 
inplace=False, from pandas.core.sorting import nargsort by = by[0] - k = self.xs(by, axis=other_axis).values - if k.ndim == 2: - - # try to be helpful - if isinstance(self.columns, MultiIndex): - raise ValueError('Cannot sort by column %s in a ' - 'multi-index you need to explicitly ' - 'provide all the levels' % str(by)) + k = self._get_label_or_level_values(by, axis=axis, + stacklevel=stacklevel) - raise ValueError('Cannot sort by duplicate column %s' % - str(by)) if isinstance(ascending, (tuple, list)): ascending = ascending[0] diff --git a/pandas/core/generic.py b/pandas/core/generic.py index 84799d12df0c4..09abe6d1faa38 100644 --- a/pandas/core/generic.py +++ b/pandas/core/generic.py @@ -69,7 +69,7 @@ args_transpose='axes to permute (int or label for object)', optional_by=""" by : str or list of str - Name or list of names which refer to the axis items.""") + Name or list of names to sort by""") def _single_replace(self, to_replace, method, inplace, limit): @@ -1156,7 +1156,7 @@ def _is_label_or_level_reference(self, key, axis=0): return (self._is_level_reference(key, axis=axis) or self._is_label_reference(key, axis=axis)) - def _check_label_or_level_ambiguity(self, key, axis=0): + def _check_label_or_level_ambiguity(self, key, axis=0, stacklevel=1): """ Check whether `key` matches both a level of the input `axis` and a label of the other axis and raise a ``FutureWarning`` if this is the @@ -1169,9 +1169,10 @@ def _check_label_or_level_ambiguity(self, key, axis=0): ---------- key: str or object label or level name - axis: int, default 0 Axis that levels are associated with (0 for index, 1 for columns) + stacklevel: int, default 1 + Stack level used when a FutureWarning is raised (see below). 
Returns ------- @@ -1216,12 +1217,12 @@ def _check_label_or_level_ambiguity(self, key, axis=0): label_article=label_article, label_type=label_type) - warnings.warn(msg, FutureWarning, stacklevel=2) + warnings.warn(msg, FutureWarning, stacklevel=stacklevel + 1) return True else: return False - def _get_label_or_level_values(self, key, axis=0): + def _get_label_or_level_values(self, key, axis=0, stacklevel=1): """ Return a 1-D array of values associated with `key`, a label or level from the given `axis`. @@ -1240,6 +1241,8 @@ def _get_label_or_level_values(self, key, axis=0): Label or level name. axis: int, default 0 Axis that levels are associated with (0 for index, 1 for columns) + stacklevel: int, default 1 + Stack level used when a FutureWarning is raised (see below). Returns ------- @@ -1251,6 +1254,9 @@ def _get_label_or_level_values(self, key, axis=0): if `key` matches neither a label nor a level ValueError if `key` matches multiple labels + FutureWarning + if `key` is ambiguous. This will become an ambiguity error in a + future version """ axis = self._get_axis_number(axis) @@ -1262,7 +1268,8 @@ def _get_label_or_level_values(self, key, axis=0): .format(type=type(self))) if self._is_label_reference(key, axis=axis): - self._check_label_or_level_ambiguity(key, axis=axis) + self._check_label_or_level_ambiguity(key, axis=axis, + stacklevel=stacklevel + 1) values = self.xs(key, axis=other_axes[0])._values elif self._is_level_reference(key, axis=axis): values = self.axes[axis].get_level_values(key)._values @@ -1271,11 +1278,22 @@ def _get_label_or_level_values(self, key, axis=0): # Check for duplicates if values.ndim > 1: + + if other_axes and isinstance( + self._get_axis(other_axes[0]), MultiIndex): + multi_message = ('\n' + 'For a multi-index, the label must be a ' + 'tuple with elements corresponding to ' + 'each level.') + else: + multi_message = '' + label_axis_name = 'column' if axis == 0 else 'index' raise ValueError(("The {label_axis_name} label '{key}' " - 
"is not unique") + "is not unique.{multi_message}") .format(key=key, - label_axis_name=label_axis_name)) + label_axis_name=label_axis_name, + multi_message=multi_message)) return values @@ -2956,7 +2974,7 @@ def add_suffix(self, suffix): Parameters ----------%(optional_by)s axis : %(axes_single_arg)s, default 0 - Axis to direct sorting + Axis to be sorted ascending : bool or list of bool, default True Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, must match the length of diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py index 285a347153a82..082b6e2a8b1a0 100644 --- a/pandas/core/groupby.py +++ b/pandas/core/groupby.py @@ -2972,7 +2972,9 @@ def is_in_obj(gpr): elif is_in_axis(gpr): # df.groupby('name') if gpr in obj: if validate: - obj._check_label_or_level_ambiguity(gpr) + stacklevel = 5 # Number of stack levels from df.groupby + obj._check_label_or_level_ambiguity( + gpr, stacklevel=stacklevel) in_axis, name, gpr = True, gpr, obj[gpr] exclusions.append(name) elif obj._is_level_reference(gpr): diff --git a/pandas/core/reshape/merge.py b/pandas/core/reshape/merge.py index 455c6f42ac74a..ad2a433b5632b 100644 --- a/pandas/core/reshape/merge.py +++ b/pandas/core/reshape/merge.py @@ -815,6 +815,7 @@ def _get_merge_keys(self): right_drop = [] left_drop = [] left, right = self.left, self.right + stacklevel = 5 # Number of stack levels from df.merge is_lkey = lambda x: isinstance( x, (np.ndarray, Series)) and len(x) == len(left) @@ -842,7 +843,8 @@ def _get_merge_keys(self): else: if rk is not None: right_keys.append( - right._get_label_or_level_values(rk)) + right._get_label_or_level_values( + rk, stacklevel=stacklevel)) join_names.append(rk) else: # work-around for merge_asof(right_index=True) @@ -852,7 +854,8 @@ def _get_merge_keys(self): if not is_rkey(rk): if rk is not None: right_keys.append( - right._get_label_or_level_values(rk)) + right._get_label_or_level_values( + rk, stacklevel=stacklevel)) else: # 
work-around for merge_asof(right_index=True) right_keys.append(right.index) @@ -865,7 +868,8 @@ def _get_merge_keys(self): else: right_keys.append(rk) if lk is not None: - left_keys.append(left._get_label_or_level_values(lk)) + left_keys.append(left._get_label_or_level_values( + lk, stacklevel=stacklevel)) join_names.append(lk) else: # work-around for merge_asof(left_index=True) @@ -877,7 +881,8 @@ def _get_merge_keys(self): left_keys.append(k) join_names.append(None) else: - left_keys.append(left._get_label_or_level_values(k)) + left_keys.append(left._get_label_or_level_values( + k, stacklevel=stacklevel)) join_names.append(k) if isinstance(self.right.index, MultiIndex): right_keys = [lev._values.take(lab) @@ -891,7 +896,8 @@ def _get_merge_keys(self): right_keys.append(k) join_names.append(None) else: - right_keys.append(right._get_label_or_level_values(k)) + right_keys.append(right._get_label_or_level_values( + k, stacklevel=stacklevel)) join_names.append(k) if isinstance(self.left.index, MultiIndex): left_keys = [lev._values.take(lab) diff --git a/pandas/tests/frame/test_sort_values_level_as_str.py b/pandas/tests/frame/test_sort_values_level_as_str.py new file mode 100644 index 0000000000000..3b4eadfce81cd --- /dev/null +++ b/pandas/tests/frame/test_sort_values_level_as_str.py @@ -0,0 +1,126 @@ +import numpy as np +import pytest + +from pandas import DataFrame, Index +from pandas.errors import PerformanceWarning +from pandas.util import testing as tm +from pandas.util.testing import assert_frame_equal + + +@pytest.fixture +def df_none(): + return DataFrame({ + 'outer': ['a', 'a', 'a', 'b', 'b', 'b'], + 'inner': [1, 2, 2, 2, 1, 1], + 'A': np.arange(6, 0, -1), + ('B', 5): ['one', 'one', 'two', 'two', 'one', 'one']}) + + +@pytest.fixture(params=[ + ['outer'], + ['outer', 'inner'] +]) +def df_idx(request, df_none): + levels = request.param + return df_none.set_index(levels) + + +@pytest.fixture(params=[ + 'inner', # index level + ['outer'], # list of index level + 
'A',                      # column
+    [('B', 5)],               # list of column
+    ['inner', 'outer'],       # two index levels
+    [('B', 5), 'outer'],      # index level and column
+    ['A', ('B', 5)],          # two columns
+    ['inner', 'outer', 'A']   # two index levels and a column
+])
+def sort_names(request):
+    return request.param
+
+
+@pytest.fixture(params=[True, False])
+def ascending(request):
+    return request.param
+
+
+def test_sort_index_level_and_column_label(
+        df_none, df_idx, sort_names, ascending):
+
+    # GH 14353
+
+    # Get index levels from df_idx
+    levels = df_idx.index.names
+
+    # Compute expected by sorting on columns and then setting the index
+    expected = df_none.sort_values(by=sort_names,
+                                   ascending=ascending,
+                                   axis=0).set_index(levels)
+
+    # Compute result by sorting on a mix of columns and index levels
+    result = df_idx.sort_values(by=sort_names,
+                                ascending=ascending,
+                                axis=0)
+
+    assert_frame_equal(result, expected)
+
+
+def test_sort_column_level_and_index_label(
+        df_none, df_idx, sort_names, ascending):
+
+    # GH 14353
+
+    # Get levels from df_idx
+    levels = df_idx.index.names
+
+    # Compute expected by sorting on axis=0, setting index levels, and then
+    # transposing. For some cases this will result in a frame with
+    # multiple column levels
+    expected = df_none.sort_values(by=sort_names,
+                                   ascending=ascending,
+                                   axis=0).set_index(levels).T
+
+    # Compute result by transposing and sorting on axis=1. 
+ result = df_idx.T.sort_values(by=sort_names, + ascending=ascending, + axis=1) + + if len(levels) > 1: + # Accessing multi-level columns that are not lexsorted raises a + # performance warning + with tm.assert_produces_warning(PerformanceWarning, + check_stacklevel=False): + assert_frame_equal(result, expected) + else: + assert_frame_equal(result, expected) + + +def test_sort_values_column_index_level_precedence(): + # GH 14353, when a string passed as the `by` parameter + # matches a column and an index level the column takes + # precedence + + # Construct DataFrame with index and column named 'idx' + idx = Index(np.arange(1, 7), name='idx') + df = DataFrame({'A': np.arange(11, 17), + 'idx': np.arange(6, 0, -1)}, + index=idx) + + # Sorting by 'idx' should sort by the idx column and raise a + # FutureWarning + with tm.assert_produces_warning(FutureWarning): + result = df.sort_values(by='idx') + + # This should be equivalent to sorting by the 'idx' index level in + # descending order + expected = df.sort_index(level='idx', ascending=False) + assert_frame_equal(result, expected) + + # Perform same test with MultiIndex + df_multi = df.set_index('A', append=True) + + with tm.assert_produces_warning(FutureWarning): + result = df_multi.sort_values(by='idx') + + expected = df_multi.sort_index(level='idx', ascending=False) + assert_frame_equal(result, expected) diff --git a/pandas/tests/frame/test_sorting.py b/pandas/tests/frame/test_sorting.py index a98439797dc28..5bd239f8a3034 100644 --- a/pandas/tests/frame/test_sorting.py +++ b/pandas/tests/frame/test_sorting.py @@ -455,26 +455,26 @@ def test_sort_index_duplicates(self): df = DataFrame([lrange(5, 9), lrange(4)], columns=['a', 'a', 'b', 'b']) - with tm.assert_raises_regex(ValueError, 'duplicate'): + with tm.assert_raises_regex(ValueError, 'not unique'): # use .sort_values #9816 with tm.assert_produces_warning(FutureWarning): df.sort_index(by='a') - with tm.assert_raises_regex(ValueError, 'duplicate'): + with 
tm.assert_raises_regex(ValueError, 'not unique'): df.sort_values(by='a') - with tm.assert_raises_regex(ValueError, 'duplicate'): + with tm.assert_raises_regex(ValueError, 'not unique'): # use .sort_values #9816 with tm.assert_produces_warning(FutureWarning): df.sort_index(by=['a']) - with tm.assert_raises_regex(ValueError, 'duplicate'): + with tm.assert_raises_regex(ValueError, 'not unique'): df.sort_values(by=['a']) - with tm.assert_raises_regex(ValueError, 'duplicate'): + with tm.assert_raises_regex(ValueError, 'not unique'): # use .sort_values #9816 with tm.assert_produces_warning(FutureWarning): # multi-column 'by' is separate codepath df.sort_index(by=['a', 'b']) - with tm.assert_raises_regex(ValueError, 'duplicate'): + with tm.assert_raises_regex(ValueError, 'not unique'): # multi-column 'by' is separate codepath df.sort_values(by=['a', 'b']) @@ -482,11 +482,11 @@ def test_sort_index_duplicates(self): # GH4370 df = DataFrame(np.random.randn(4, 2), columns=MultiIndex.from_tuples([('a', 0), ('a', 1)])) - with tm.assert_raises_regex(ValueError, 'levels'): + with tm.assert_raises_regex(ValueError, 'level'): # use .sort_values #9816 with tm.assert_produces_warning(FutureWarning): df.sort_index(by='a') - with tm.assert_raises_regex(ValueError, 'levels'): + with tm.assert_raises_regex(ValueError, 'level'): df.sort_values(by='a') # convert tuples to a list of tuples diff --git a/pandas/tests/generic/test_label_or_level_utils.py b/pandas/tests/generic/test_label_or_level_utils.py index 456cb48020500..1ad1b06aaefa2 100644 --- a/pandas/tests/generic/test_label_or_level_utils.py +++ b/pandas/tests/generic/test_label_or_level_utils.py @@ -175,8 +175,7 @@ def test_check_label_or_level_ambiguity_df(df_ambig, axis): # df_ambig has both an on-axis level and off-axis label named L1 # Therefore L1 is ambiguous with tm.assert_produces_warning(FutureWarning, - clear=True, - check_stacklevel=False) as w: + clear=True) as w: assert df_ambig._check_label_or_level_ambiguity('L1', 
axis=axis) warning_msg = w[0].message.args[0] @@ -245,7 +244,8 @@ def assert_label_values(frame, labels, axis): else: expected = frame.loc[label]._values - result = frame._get_label_or_level_values(label, axis=axis) + result = frame._get_label_or_level_values(label, axis=axis, + stacklevel=2) assert array_equivalent(expected, result) @@ -288,8 +288,7 @@ def test_get_label_or_level_values_df_ambig(df_ambig, axis): # df has both an on-axis level and off-axis label named L1 # Therefore L1 is ambiguous but will default to label - with tm.assert_produces_warning(FutureWarning, - check_stacklevel=False): + with tm.assert_produces_warning(FutureWarning): assert_label_values(df_ambig, ['L1'], axis=axis) # df has an on-axis level named L2 and it is not ambiguous diff --git a/pandas/tests/groupby/test_index_as_string.py b/pandas/tests/groupby/test_index_as_string.py index cee78eab3a636..9fe677664049e 100644 --- a/pandas/tests/groupby/test_index_as_string.py +++ b/pandas/tests/groupby/test_index_as_string.py @@ -99,7 +99,7 @@ def test_grouper_column_index_level_precedence(frame, frame['inner'] = [1, 1, 1, 1, 1, 1] # Performing a groupby with strings should produce warning - with tm.assert_produces_warning(FutureWarning, check_stacklevel=False): + with tm.assert_produces_warning(FutureWarning): result = frame.groupby(key_strs).mean() # Grouping with key Grouper should produce the same result and no warning diff --git a/pandas/tests/reshape/merge/test_merge_index_as_string.py b/pandas/tests/reshape/merge/test_merge_index_as_string.py index 4c638f8e441fa..09109e2692a24 100644 --- a/pandas/tests/reshape/merge/test_merge_index_as_string.py +++ b/pandas/tests/reshape/merge/test_merge_index_as_string.py @@ -200,14 +200,14 @@ def test_merge_index_column_precedence(df1, df2): # Merge left_df and right_df on 'outer' and 'inner' # 'outer' for left_df should refer to the 'outer' column, not the # 'outer' index level and a FutureWarning should be raised - with 
tm.assert_produces_warning(FutureWarning, check_stacklevel=False): + with tm.assert_produces_warning(FutureWarning): result = left_df.merge(right_df, on=['outer', 'inner']) # Check results assert_frame_equal(result, expected) # Perform the same using the left_on and right_on parameters - with tm.assert_produces_warning(FutureWarning, check_stacklevel=False): + with tm.assert_produces_warning(FutureWarning): result = left_df.merge(right_df, left_on=['outer', 'inner'], right_on=['outer', 'inner'])