Closed
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Suppose a DataFrame has many columns and I want to drop those whose names match a pattern, rather than listing each one by its full name.
>>> df = pd.DataFrame(np.array(([1, 2, 3], [4, 5, 6])),
...                   index=['mouse', 'rabbit'],
...                   columns=['one', 'two', 'three'])
>>> df
        one  two  three
mouse     1    2      3
rabbit    4    5      6
# Want to drop the columns containing 'o' (using the proposed `complement` argument)
>>> df.filter(like='o', complement=True)
        three
mouse       3
rabbit      6
Feature Description
def filter(
    self: NDFrameT,
    items=None,
    like: str | None = None,
    regex: str | None = None,
    axis=None,
    complement: bool = False,
):
    nkw = com.count_not_none(items, like, regex)
    if nkw > 1:
        raise TypeError(
            "Keyword arguments `items`, `like`, or `regex` "
            "are mutually exclusive"
        )
    if axis is None:
        axis = self._info_axis_name
    labels = self._get_axis(axis)
    if items is not None:
        name = self._get_axis_name(axis)
        if complement:
            # Keep every existing label that is NOT listed in `items`.
            keep = [r for r in labels if r not in items]
        else:
            # Existing behaviour: keep the requested items that exist.
            keep = [r for r in items if r in labels]
        return self.reindex(**{name: keep})
    elif like:
        def f(x) -> bool_t:
            assert like is not None  # needed for mypy
            return like in ensure_str(x)
    elif regex:
        matcher = re.compile(regex)
        def f(x) -> bool_t:
            return matcher.search(ensure_str(x)) is not None
    else:
        raise TypeError("Must pass either `items`, `like`, or `regex`")
    # Boolean mask over the axis labels; inverted when `complement` is set.
    values = labels.map(f)
    if complement:
        return self.loc(axis=axis)[~values]
    return self.loc(axis=axis)[values]
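The proposed behaviour can also be tried without patching pandas. Below is a minimal sketch written as a free function; the name `filter_labels` is hypothetical and not part of pandas, it only accepts `axis` as 0 or 1, and its `items` branch keeps the frame's own label order rather than the order of `items`.

```python
import re

import numpy as np
import pandas as pd


def filter_labels(obj, items=None, like=None, regex=None, axis=None, complement=False):
    """Hypothetical stand-in for ``obj.filter(..., complement=...)``."""
    if sum(arg is not None for arg in (items, like, regex)) != 1:
        raise TypeError("Pass exactly one of `items`, `like`, or `regex`")
    if axis is None:
        # Mirror DataFrame.filter's default: columns for a DataFrame, index otherwise.
        axis = 1 if isinstance(obj, pd.DataFrame) else 0
    labels = obj.axes[axis]

    if items is not None:
        wanted = set(items)
        mask = np.array([label in wanted for label in labels], dtype=bool)
    elif like is not None:
        mask = np.array([like in str(label) for label in labels], dtype=bool)
    else:
        matcher = re.compile(regex)
        mask = np.array(
            [matcher.search(str(label)) is not None for label in labels], dtype=bool
        )

    if complement:
        # Invert the selection: keep the labels that did NOT match.
        mask = ~mask
    return obj.loc[:, mask] if axis == 1 else obj.loc[mask]


df = pd.DataFrame(
    np.array([[1, 2, 3], [4, 5, 6]]),
    index=["mouse", "rabbit"],
    columns=["one", "two", "three"],
)
dropped = filter_labels(df, like="o", complement=True)  # only the 'three' column remains
```

Building a plain NumPy boolean mask first and inverting it once keeps the three selection modes symmetric, so `complement` needs no special case per branch.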
Alternative Solutions
- DataFrame.drop requires the full names of the columns to drop.
The following workarounds do the job, but they are somewhat cumbersome:
  - Use not in directly, e.g. df.loc[:, df.columns.map(lambda x: 'xxx' not in x)].
  - Use re to find the matching columns and then drop them.
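For comparison, here is a short sketch of how these workarounds look with the current pandas API, reusing the example frame from the problem description. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.array([[1, 2, 3], [4, 5, 6]]),
    index=["mouse", "rabbit"],
    columns=["one", "two", "three"],
)

# 1. Boolean mask from the string accessor on the column Index, inverted with ~.
by_mask = df.loc[:, ~df.columns.str.contains("o", regex=False)]

# 2. Let filter() find the matching columns, then drop them by their full names.
by_drop = df.drop(columns=df.filter(like="o").columns)

# 3. Negative lookahead, so filter(regex=...) keeps only names without an "o".
by_regex = df.filter(regex=r"^(?!.*o)")
```

All three select the same columns, but each needs either an explicit inversion or a less readable regex, which is the inconvenience a `complement` flag is meant to remove.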
Additional Context
No response