CLN/DEPR: removed deprecated as_indexer arg from str.match() #22356

Closed
wants to merge 9 commits into from
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.24.0.txt
@@ -518,7 +518,7 @@ Removal of prior version deprecations/changes
- The ``LongPanel`` and ``WidePanel`` classes have been removed (:issue:`10892`)
- Several private functions were removed from the (non-public) module ``pandas.core.common`` (:issue:`22001`)
- Removal of the previously deprecated module ``pandas.core.datetools`` (:issue:`14105`, :issue:`14094`)
--
+- Removal of the previously deprecated ``as_indexer`` keyword completely from ``str.match()`` (:issue:`22356`, :issue:`6581`)

.. _whatsnew_0240.performance:

20 changes: 3 additions & 17 deletions pandas/core/strings.py
@@ -709,7 +709,7 @@ def rep(x, r):
return result


-def str_match(arr, pat, case=True, flags=0, na=np.nan, as_indexer=None):
+def str_match(arr, pat, case=True, flags=0, na=np.nan):
"""
Determine if each string matches a regular expression.

@@ -722,8 +722,6 @@ def str_match(arr, pat, case=True, flags=0, na=np.nan, as_indexer=None):
flags : int, default 0 (no flags)
re module flags, e.g. re.IGNORECASE
na : default NaN, fill value for missing values.
-as_indexer
-.. deprecated:: 0.21.0
Contributor commented:

so is this just wrong?

Contributor commented:

@HyunTruth can you check when this was added? If it was in 0.21.0, then we can't remove this yet; if it was a mistake or typo, then we could.

Contributor Author commented:

Sure, I'll check on it.

Contributor Author commented:

@jreback It seems to be a typo: the whatsnew for v0.20.0 states that

- The default behaviour of ``Series.str.match`` has changed from extracting
  groups to matching the pattern. The extracting behaviour was deprecated
  since pandas version 0.13.0 and can be done with the ``Series.str.extract``
  method (:issue:`5224`). As a consequence, the ``as_indexer`` keyword is
  ignored (no longer needed to specify the new behaviour) and is deprecated.
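
To illustrate the distinction the quoted note draws between matching and extracting, here is a plain-Python sketch. It is not pandas code: `match_flags` and `extract_groups` are hypothetical helpers over a list, and `None` stands in for a missing value, so it only approximates what ``Series.str.match`` and ``Series.str.extract`` do.

```python
import re

# Sample data borrowed from the test in this PR; None marks a missing value.
values = ["fooBAD__barBAD", None, "foo"]
pattern = re.compile(r".*(BAD[_]+).*(BAD)")


def match_flags(items, regex):
    """Return one boolean per item (missing stays missing), like 'match'."""
    return [None if s is None else bool(regex.match(s)) for s in items]


def extract_groups(items, regex):
    """Return the captured groups per item, which is what 'extract' is for."""
    out = []
    for s in items:
        m = regex.match(s) if s is not None else None
        out.append(m.groups() if m else None)
    return out


flags = match_flags(values, pattern)
groups = extract_groups(values, pattern)
print(flags)
print(groups)
```

Because matching always yields booleans, a keyword that switches between the two behaviours has nothing left to select, which is presumably why ``as_indexer`` could be ignored and then removed.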

Contributor commented:

ok then


Returns
-------
@@ -741,17 +739,6 @@ def str_match(arr, pat, case=True, flags=0, na=np.nan, as_indexer=None):

regex = re.compile(pat, flags=flags)

-if (as_indexer is False) and (regex.groups > 0):
-raise ValueError("as_indexer=False with a pattern with groups is no "
-"longer supported. Use '.str.extract(pat)' instead")
-elif as_indexer is not None:
-# Previously, this keyword was used for changing the default but
-# deprecated behaviour. This keyword is now no longer needed.
-warnings.warn("'as_indexer' keyword was specified but is ignored "
-"(match now returns a boolean indexer by default), "
-"and will be removed in a future version.",
-FutureWarning, stacklevel=3)

dtype = bool
f = lambda x: bool(regex.match(x))

@@ -2469,9 +2456,8 @@ def contains(self, pat, case=True, flags=0, na=np.nan, regex=True):
return self._wrap_result(result)

@copy(str_match)
-def match(self, pat, case=True, flags=0, na=np.nan, as_indexer=None):
-result = str_match(self._parent, pat, case=case, flags=flags, na=na,
-as_indexer=as_indexer)
+def match(self, pat, case=True, flags=0, na=np.nan):
+result = str_match(self._parent, pat, case=case, flags=flags, na=na)
return self._wrap_result(result)

@copy(str_replace)
15 changes: 0 additions & 15 deletions pandas/tests/test_strings.py
@@ -938,21 +938,6 @@ def test_match(self):
exp = Series([True, NA, False])
tm.assert_series_equal(result, exp)

-# test passing as_indexer still works but is ignored
-values = Series(['fooBAD__barBAD', NA, 'foo'])
-exp = Series([True, NA, False])
-with tm.assert_produces_warning(FutureWarning):
-result = values.str.match('.*BAD[_]+.*BAD', as_indexer=True)
-tm.assert_series_equal(result, exp)
-with tm.assert_produces_warning(FutureWarning):
-result = values.str.match('.*BAD[_]+.*BAD', as_indexer=False)
-tm.assert_series_equal(result, exp)
-with tm.assert_produces_warning(FutureWarning):
-result = values.str.match('.*(BAD[_]+).*(BAD)', as_indexer=True)
-tm.assert_series_equal(result, exp)
-pytest.raises(ValueError, values.str.match, '.*(BAD[_]+).*(BAD)',
-as_indexer=False)

# mixed
mixed = Series(['aBAD_BAD', NA, 'BAD_b_BAD', True, datetime.today(),
'foo', None, 1, 2.])
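
For readers comparing the behaviour before and after this change, the following stand-alone sketch may help. It is an assumption-laden illustration rather than pandas code: `match_old` and `match_new` are hypothetical list-based functions that mirror the removed branch and the new signature, and the warning message is shortened. The point it tries to show is that once the keyword is dropped from the signature, Python itself rejects `as_indexer` with a `TypeError` instead of emitting a `FutureWarning`.

```python
import re
import warnings


def match_old(items, pat, flags=0, as_indexer=None):
    """Hypothetical stand-in for the pre-change logic shown in the diff."""
    regex = re.compile(pat, flags=flags)
    if (as_indexer is False) and (regex.groups > 0):
        raise ValueError("as_indexer=False with a pattern with groups is no "
                         "longer supported. Use '.str.extract(pat)' instead")
    elif as_indexer is not None:
        # The keyword is accepted but ignored, with a deprecation warning.
        warnings.warn("'as_indexer' keyword was specified but is ignored",
                      FutureWarning, stacklevel=2)
    return [bool(regex.match(s)) for s in items]


def match_new(items, pat, flags=0):
    """Hypothetical stand-in for the post-change signature (no as_indexer)."""
    regex = re.compile(pat, flags=flags)
    return [bool(regex.match(s)) for s in items]


items = ["fooBAD__barBAD", "foo"]

# Old behaviour: the call succeeds and a FutureWarning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = match_old(items, ".*BAD[_]+.*BAD", as_indexer=True)
warned = any(issubclass(w.category, FutureWarning) for w in caught)

# New behaviour: the unknown keyword is rejected by the function signature.
try:
    match_new(items, ".*BAD[_]+.*BAD", as_indexer=True)
    rejected = False
except TypeError:
    rejected = True

print(old_result, warned, rejected)
```

Under these assumptions, the deleted test cases have nothing left to exercise: there is no warning path and no `ValueError` branch once the parameter is gone.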