DOC: update the DataFrame.count docstring #20221
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5592,22 +5592,68 @@ def corrwith(self, other, axis=0, drop=False): | |
|
||
def count(self, axis=0, level=None, numeric_only=False): | ||
""" | ||
Return Series with number of non-NA/null observations over requested | ||
axis. Works with non-floating point data as well (detects NaN and None) | ||
Count non-NA cells for each column or row. | ||
|
||
Return Series with number of non-NA observations over requested | ||
axis. Works with non-floating point data as well (detects `None`, | ||
`NaN` and `NaT`) | ||
Review comment: End with a |
||
|
||
Parameters | ||
---------- | ||
axis : {0 or 'index', 1 or 'columns'}, default 0 | ||
0 or 'index' for row-wise, 1 or 'columns' for column-wise | ||
level : int or level name, default None | ||
If the axis is a MultiIndex (hierarchical), count along a | ||
particular level, collapsing into a DataFrame | ||
If 0 or 'index' counts are generated for each column. | ||
If 1 or 'columns' counts are generated for each row. | ||
level : int or str, optional | ||
If the axis is a `MultiIndex` (hierarchical), count along a | ||
particular level, collapsing into a `DataFrame`. | ||
Review comment: Put backticks around the `level` parameter. |
||
A `str` specifies the level name. | ||
numeric_only : boolean, default False | ||
Include only float, int, boolean data | ||
Include only `float`, `int` or `boolean` data. | ||
|
||
Returns | ||
------- | ||
count : Series (or DataFrame if level specified) | ||
Series or DataFrame | ||
For each column/row the number of non-NA/null entries. | ||
If level is specified returns a `DataFrame`. | ||
|
||
See Also | ||
-------- | ||
Series.count: number of non-NA elements in a Series | ||
DataFrame.shape: number of DataFrame rows and columns (including NA | ||
elements) | ||
DataFrame.isnull: boolean same-sized DataFrame showing places of NA | ||
Review comment: Refer to `isna` instead. |
||
elements | ||
|
||
Examples | ||
-------- | ||
>>> df = pd.DataFrame({"Person": | ||
... ["John", "Myla", None, "John", "Myla"], | ||
... "Age": [24., np.nan, 21., 33, 26], | ||
Review comment: PEP8: indent one more space. Same with the line below.
Reply: For me flake complains if I change that. On my system flake doesn't check the examples, so I copy it into the code:
If I have it like this, flake only complains about `pd` not being defined:
Reply: Sorry, I misread. |
||
... "Single": [False, True, True, True, False]}) | ||
>>> df | ||
Person Age Single | ||
0 John 24.0 False | ||
1 Myla NaN True | ||
2 None 21.0 True | ||
3 John 33.0 True | ||
4 Myla 26.0 False | ||
>>> df.count() | ||
Review comment: Add a blank line between cases. |
||
Person 4 | ||
Age 4 | ||
Single 5 | ||
dtype: int64 | ||
>>> df.count(axis=1) | ||
0 3 | ||
1 2 | ||
2 2 | ||
3 3 | ||
4 3 | ||
dtype: int64 | ||
>>> df.set_index(["Person", "Single"]).count(level="Person") | ||
Age | ||
Person | ||
John 2 | ||
Myla 1 | ||
""" | ||
axis = self._get_axis_number(axis) | ||
if level is not None: | ||
|
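For readers trying the `level` example from the docstring above: to the best of my knowledge, the `level` argument of `DataFrame.count` was removed in pandas 2.0, so the call as written may fail on a recent install. A minimal sketch of the same per-level count, assuming `groupby(level=...)` as a stand-in (this is not part of the PR), could look like this:

```python
import numpy as np
import pandas as pd

# Same data as the docstring example in the diff above.
df = pd.DataFrame({
    "Person": ["John", "Myla", None, "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

# Assumption: groupby on the index level replaces count(level="Person").
# Rows whose "Person" key is missing are dropped by groupby by default.
by_person = (
    df.set_index(["Person", "Single"])
      .groupby(level="Person")
      .count()
)
print(by_person)
```

The result is a `DataFrame` with one row per `Person` value, which matches the shape the docstring describes for the `level` case.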
Review comment: One last change: maybe remove the first sentence, since this can return a DataFrame with `level`. I think just use the extended summary to say what counts as non-null data. The values `None`, `NaN`, `NaT`, and optionally `np.inf` (depending on `pandas.options.mode.use_inf_as_na`) are considered NA.

Reply: Do you mean the first sentence in the extended summary, i.e. "Return Series with number of non-NA observations over requested axis."? If I understand you right, I would change the entire summary (i.e. short and extended summary) to look like the following:

Reply: Yeah. Change np.inf to `numpy.inf` and put single backticks around pandas.options.mode.use_inf_as_na.
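To make the discussion about which values count as NA concrete, here is a small sketch. The column names and values are made up for illustration. It shows only the default behaviour, where `None`, `NaN` and `NaT` are treated as NA and `numpy.inf` is not; the `use_inf_as_na` option mentioned above is deliberately left untouched, since it may not be available in every pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NA marker of each kind in the middle row.
df = pd.DataFrame({
    "obj": ["a", None, "c"],
    "num": [1.0, np.nan, np.inf],
    "when": [pd.Timestamp("2018-03-01"), pd.NaT, pd.Timestamp("2018-03-03")],
})

per_column = df.count()      # non-NA cells in each column
per_row = df.count(axis=1)   # non-NA cells in each row

print(per_column)
print(per_row)
```

With default options, the middle row contributes nothing to any count, while the `numpy.inf` entry in the last row is counted as a regular value.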