ENH: Add dropna argument to pd.DataFrame.value_counts() #41325

connesy · 2021-05-05T09:00:03Z

Is your feature request related to a problem?

With pd.Series.value_counts() it is possible to specify dropna=False, but that argument does not exist in pd.DataFrame.value_counts(). As a consequence, all rows that contain at least one NA element are dropped when using df.value_counts().

Describe the solution you'd like

It should be possible to call df.value_counts() with dropna=False and get a count for each unique row, including rows that have NAs in them.

API breaking implications

Like with pd.Series.value_counts() the default should be dropna=True. This will keep consistency between the two implementations, and leave current behavior unchanged.
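
As a rough sketch of the requested behavior, the same counts can be approximated with groupby, which already accepts dropna (pandas 1.1 and later). The helper name row_value_counts and the sample data are hypothetical and only illustrate the idea; this is not the proposed implementation.

```python
import numpy as np
import pandas as pd


def row_value_counts(df, dropna=True):
    """Hypothetical helper (not a pandas API): count unique rows of df."""
    # groupby's own dropna flag decides whether rows with NaN keys are kept.
    counts = df.groupby(list(df.columns), dropna=dropna).size()
    # Most frequent rows first, roughly like value_counts().
    return counts.sort_values(ascending=False)


# Illustrative data, not taken from the report above.
df = pd.DataFrame({"a": [1.0, 1.0, np.nan, 2.0], "b": [np.nan, np.nan, 3.0, 4.0]})

with_na = row_value_counts(df, dropna=False)  # every row is counted
without_na = row_value_counts(df)             # only rows without NaN are counted
```

With dropna=True as the default, the helper drops incomplete rows just as df.value_counts() does today, so existing callers would see no change.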

Describe alternatives you've considered

Additional context

>>> import pandas as pd
>>> s1 = pd.Series([1, 2, 3, pd.NA, 3])
>>> s2 = pd.Series([pd.NA, 1, pd.NA, 4, 2])
>>> s1.value_counts(dropna=False)
3.0    2
NaN    1
1.0    1
2.0    1
dtype: int64
>>> df = pd.DataFrame(zip(s1, s2), columns=['s1', 's2'])
>>> df
     s1    s2
0     1  <NA>
1     2     1
2     3  <NA>
3  <NA>     4
4     3     2
>>> df.value_counts()
s1  s2
2   1     1
3   2     1
dtype: int64

connesy added the Enhancement and Needs Triage labels May 5, 2021

connesy mentioned this issue May 5, 2021

ENH: Add dropna argument to DataFrame.value_counts() #41334

Merged

4 tasks

lithomas1 added the API - Consistency and Algos labels and removed the Needs Triage label May 5, 2021

jreback added this to the 1.3 milestone May 10, 2021

jreback closed this as completed in #41334 May 10, 2021
