DOC: update NDFrame.squeeze docstring #20269
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -699,18 +699,100 @@ def pop(self, item): | |
|
||
def squeeze(self, axis=None): | ||
""" | ||
Squeeze length 1 dimensions. | ||
Squeeze 1 dimensional axis objects into scalars. | ||
|
||
If the given axis consists of one dimensional objects, they are turned | ||
into scalars. In case no axis is specified, all axes are subject to | ||
squeezing. In any case, objects in axes that can't be squeezed are | ||
left unchanged. | ||
|
||
Parameters | ||
---------- | ||
axis : None, integer or string axis name, optional | ||
The axis to squeeze if 1-sized. | ||
axis : integer or string, optional | ||
Review comment: Convention here is
||
A specific axis to squeeze. | ||
|
||
.. versionadded:: 0.20.0 | ||
|
||
Returns | ||
------- | ||
scalar if 1-sized, else original object | ||
DataFrame, Series or scalar | ||
Review comment: Not sure if scalar is what we use when it can return anything contained in the values of the DataFrame. It's ok with me, just pointing it out in case someone knows of another terminology.
Reply: I couldn't find anything in the guidelines about it, but I'm taking other docs as a guide: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.sum.html
||
The projection after squeezing axis. | ||
|
||
See Also | ||
-------- | ||
Series.iloc : Integer-location based indexing for selecting scalars | ||
DataFrame.iloc : Integer-location based indexing for selecting Series | ||
Review comment: Does it make sense to add
Reply: It is somewhat, I think squeezing is most useful in slicing scenarios but perhaps someone might find that a direct conversion is what they really wanted. Will add it too.
Review comment: I don't think
Reply: hehe, that's a good reason to not add it... ;) not sure what I was thinking about, I think I got confused with
||
|
||
Examples | ||
-------- | ||
>>> primes = pd.Series([2, 3, 5, 7]) | ||
|
||
Slicing might produce a Series with a single value: | ||
|
||
>>> even_primes = primes[primes % 2 == 0] | ||
>>> even_primes | ||
0 2 | ||
dtype: int64 | ||
>>> even_primes.squeeze() | ||
2 | ||
|
||
Squeezing objects with more than 1 dimension does nothing: | ||
Review comment: technically speaking "squeezing
Reply: Sorry, "objects" became ambiguous. The objects inside that DataFrame axis would be 1D, that's what I meant. Will try to clarify.
||
|
||
>>> odd_primes = primes[primes % 2 == 1] | ||
>>> odd_primes | ||
1 3 | ||
2 5 | ||
3 7 | ||
dtype: int64 | ||
>>> odd_primes.squeeze() | ||
1 3 | ||
2 5 | ||
3 7 | ||
dtype: int64 | ||
|
||
Squeezing is even more effective when used with DataFrames. | ||
|
||
>>> df = pd.DataFrame([[1,2], [3, 4]], columns=['a', 'b']) | ||
Review comment: Minor edit - need space after first comma
||
>>> df | ||
a b | ||
0 1 2 | ||
1 3 4 | ||
|
||
Slicing a single column will produce a DataFrame with one of the | ||
axis having only 1 dimension: | ||
Review comment: axes instead of axis
||
|
||
>>> df_a = df[['a']] | ||
>>> df_a | ||
a | ||
0 1 | ||
1 3 | ||
|
||
Objects along the column are 1 dimensional, so they can be squeezed | ||
into scalars: | ||
|
||
>>> df_a.squeeze('columns') | ||
Review comment: I think the first example is great but this one could use a little work. With how it's presented it wouldn't make sense to use
Reply: Makes sense, in most cases squeeze will be necessary because of indirect slicing rather than specific axis selections.
||
0 1 | ||
1 3 | ||
Name: a, dtype: int64 | ||
|
||
Slicing a single row from a single column will produce a single | ||
scalar DataFrame: | ||
|
||
>>> df_0a = df[['a']].iloc[[0]] | ||
Review comment: Stylistically I think using a predicate rather than selecting a set of columns would be clearer, so something like
Reply: Yep, basically the same point as the one before. Will work on it.
||
>>> df_0a | ||
a | ||
0 1 | ||
|
||
Squeezing along the rows produces a single scalar Series: | ||
|
||
>>> df_0a.squeeze('rows') | ||
a 1 | ||
Name: 0, dtype: int64 | ||
|
||
Squeezing all axes will project directly into a scalar: | ||
|
||
>>> df_0a.squeeze() | ||
1 | ||
Review comment: I feel like this example using
Reply: I wanted to have examples with both Series and DataFrames because both classes share this docstring, so it would be a bit weird to read the docs from I'll try to merge both, because the Series example is more concrete and related to slicing (the most likely use case IMO), but the second covers both classes.
||
""" | ||
axis = (self._AXIS_NAMES if axis is None else | ||
(self._get_axis_number(axis),)) | ||
|
Review comment: I think it's a good description, but personally I find that being a bit less technical would make it easier to understand. For example "If applied in a 1x1 DataFrame, it returns the contained value. In a DataFrame with one column, it converts it to a Series...". Feel free to disagree, but IMO something like this is a bit faster to understand.
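The shape-based description suggested above can be checked with a short sketch. This is only an illustration, not part of the PR: the variable names (`one_cell`, `one_col`, `one_row`) are made up here, and the predicate-style selections are an assumption about the kind of indirect slicing discussed in the review.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

# 1x1 DataFrame (one row matched by a predicate, one column kept):
# squeeze returns the contained value.
one_cell = df.loc[df['b'] == 2, ['a']]
value = one_cell.squeeze()

# One column, several rows: squeeze converts it to a Series.
one_col = df[['a']]
col = one_col.squeeze()

# One row, several columns: squeeze gives a Series indexed by column labels.
one_row = df[df['a'] == 1]
row = one_row.squeeze()

# No axis of length 1: the DataFrame comes back unchanged.
same = df.squeeze()
```

In each case only the axes of length 1 disappear, which is why the result type depends on the shape of the input rather than on how it was selected.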
Reply: The hard part of describing squeeze is that it's too general, but perhaps I lost touch with practicality because I was thinking of squeezing N-dimensional objects, even though pandas deals mainly with 1D (Series) and 2D (Frames).
I hoped to make it concrete in the examples, but I'm afraid people will just give up reading after a paragraph like this.
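For the N-dimensional intuition mentioned in this reply, NumPy's `squeeze` is a convenient analogy. The sketch below is an aside under that assumption and does not come from the PR; the array name `cube` is hypothetical.

```python
import numpy as np

# A 3-D array with two axes of length 1.
cube = np.zeros((1, 3, 1))

# With no axis given, every length-1 axis is dropped.
all_squeezed = cube.squeeze()

# With an axis given, only that axis is dropped.
first_squeezed = cube.squeeze(axis=0)

# A 1x1 array squeezes down to a 0-dimensional array.
single = np.zeros((1, 1)).squeeze()

print(all_squeezed.shape, first_squeezed.shape, single.shape)
```

One difference worth keeping in mind: NumPy raises a `ValueError` when asked to squeeze an axis whose length is not 1, whereas the docstring under review says pandas leaves such axes unchanged.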