DOC: update the pandas.DataFrame.clip_lower docstring #20289
Changes from 1 commit
d1bb286
22a256f
5312dd0
ba4c362
13c9d82
4e2e0d7
e48d49b
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5716,24 +5716,81 @@ def clip_upper(self, threshold, axis=None, inplace=False): | |
|
||
def clip_lower(self, threshold, axis=None, inplace=False): | ||
""" | ||
Return copy of the input with values below given value(s) truncated. | ||
Return copy of the input with values below given value(s) trimmed. | ||
|
||
If an element value is below the threshold, the threshold value is | ||
Since threshold is an object here you can just say "value is below `threshold`..." (note backticks)
Also changing the wording a little to match #20212 as per @datapythonista suggestion. |
||
returned instead. | ||
|
||
Parameters | ||
---------- | ||
threshold : float or array_like | ||
array-like instead of array_like |
||
axis : int or string axis name, optional | ||
Lower value(s) to which the input value(s) will be trimmed. | ||
axis : {0 or 'index', 1 or 'columns'} | ||
|
||
Align object with threshold along the given axis. | ||
inplace : boolean, default False | ||
Whether to perform the operation in place on the data | ||
.. versionadded:: 0.21.0 | ||
.. versionadded:: 0.21.0. | ||
No period after versionadded. Also, does this render? I think it needs to be down one more line and aligned with text directly above
It rendered OK, but for consistency with other docstrings I am adding the extra line and shifting it to the left. Re the period, the validation script complains if it's not there. See discussion in #20227.
Hmm OK. @datapythonista @jorisvandenbossche another issue of where I think the script is incorrectly calling out a missing period - we shouldn't be adding this to the end of a
Yes, the script is wrong when versionadded is present. Please ignore the error in the validation. |
||
|
||
See Also | ||
-------- | ||
clip | ||
DataFrame.clip: Trim values at input threshold(s). | ||
Insert space before each colon |
||
DataFrame.clip_upper: Return copy of input with values above given | ||
value(s) trimmed. | ||
Series.clip_lower: Return copy of the input with values below given | ||
For comprehensiveness and consistency add the Series |
||
value(s) trimmed. | ||
|
||
Sorry if I forgot some discussion about it, but don't we want a
Yes good spot we definitely still want this - @adatasetaday can you add back in? I know we had some back and forth about |
||
Returns | ||
------- | ||
clipped : same type as input | ||
Would be nice to be explicit about return types here. I'd add the type (
I see other docstrings say "type of caller" or "same type as caller". Should I use one of these for consistency or is the plan change all to the wording you provided?
Good question. I would at the very least put the type declaration on the first line (I think it is only Series or DataFrames that can call this, so if that's right just put |
||
|
||
Examples | ||
Nice job here - I think the examples are very clear |
||
-------- | ||
>>> df = pd.DataFrame({'a': [0.740518, 0.450228, 0.710404, -0.771225], | ||
... 'b': [0.040507, -0.45121, 0.760925, 0.010624]}) | ||
Personally I find the data used in #20212 easier to understand. It requires a decent amount of concentration to see what the examples are doing with so many decimal numbers. |
||
|
||
>>> df | ||
a b | ||
0 0.740518 0.040507 | ||
1 0.450228 -0.451210 | ||
2 0.710404 0.760925 | ||
3 -0.771225 0.010624 | ||
|
||
Clip to a scalar value | ||
|
||
>>> df.clip_lower(0.2) | ||
a b | ||
0 0.740518 0.200000 | ||
1 0.450228 0.200000 | ||
2 0.710404 0.760925 | ||
3 0.200000 0.200000 | ||
|
||
Clip to an array along the index axis | ||
|
||
>>> df.clip_lower([0.2, 0.4, 0.6, 0.8], axis=0) | ||
a b | ||
0 0.740518 0.200000 | ||
1 0.450228 0.400000 | ||
2 0.710404 0.760925 | ||
3 0.800000 0.800000 | ||
|
||
Clip to an array column the index axis | ||
I believe for consistency here you meant to say "Clip to an array along the column axis" |
||
|
||
>>> df.clip_lower([0.5, 0.0], axis=1) | ||
a b | ||
0 0.740518 0.040507 | ||
1 0.500000 0.000000 | ||
2 0.710404 0.760925 | ||
3 0.500000 0.010624 | ||
|
||
Clip in place | ||
|
||
>>> df.clip_lower(0.2, inplace=True) | ||
>>> df | ||
a b | ||
0 0.740518 0.200000 | ||
1 0.450228 0.200000 | ||
2 0.710404 0.760925 | ||
3 0.200000 0.200000 | ||
""" | ||
return self._clip_with_one_bound(threshold, method=self.ge, | ||
axis=axis, inplace=inplace) | ||
|
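For readers on newer pandas releases: `clip_lower` was deprecated in 0.24 and removed in 1.0 in favour of `clip(lower=...)`. The docstring examples above can be reproduced roughly as follows. This is a minimal sketch, assuming a pandas version in which `DataFrame.clip` accepts a Series threshold together with `axis`; the variable names are illustrative and not part of the PR.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.740518, 0.450228, 0.710404, -0.771225],
                   'b': [0.040507, -0.45121, 0.760925, 0.010624]})

# Scalar threshold: every value below 0.2 is raised to 0.2.
scalar_clipped = df.clip(lower=0.2)

# One threshold per row (axis=0): the Series is aligned on the index.
row_thresholds = pd.Series([0.2, 0.4, 0.6, 0.8], index=df.index)
row_clipped = df.clip(lower=row_thresholds, axis=0)

# One threshold per column (axis=1): the Series is aligned on the columns.
col_thresholds = pd.Series([0.5, 0.0], index=df.columns)
col_clipped = df.clip(lower=col_thresholds, axis=1)

print(scalar_clipped)
print(row_clipped)
print(col_clipped)
```

Wrapping the thresholds in a Series with an explicit index is a choice made here to keep the alignment visible; the docstring under review passes plain lists instead.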
Let's use "caller" instead of "input". FWIW I guess it's not technically true that it returns a copy of the caller because `inplace` would modify directly. Any idea on how to make this statement more accurate?
Something along the lines of "Return a copy of the caller (or the caller itself if inplace == True) with values below given `threshold` trimmed."?
Hmm well I don't think you need to get in the discussion of return values since that's documented later in the docstring. First line is supposed to be very simple so I would mirror what's there for `clip` and say something like "Trim the caller's values below a given threshold".
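The distinction discussed in this thread (a new object by default, modification of the caller with `inplace=True`) can be checked directly. The sketch below uses `DataFrame.clip(lower=...)` rather than `clip_lower`, on the assumption that the reader has a pandas release where `clip_lower` is no longer available; the data and variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.5, -1.0], 'b': [0.1, 2.0]})

# Default behaviour: a new object comes back, the caller is left untouched.
clipped = df.clip(lower=0.2)
caller_unchanged = df['a'].tolist() == [0.5, -1.0]

# inplace=True: the caller itself is modified and the call returns None.
result = df.clip(lower=0.2, inplace=True)

print(clipped is df)      # a separate object
print(caller_unchanged)   # the first call did not modify df
print(result)             # None for the in-place call
print(df)
```

Since the in-place call returns `None` rather than the caller, a summary line that avoids the word "copy" altogether, as suggested above, sidesteps the question.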
Ok. Since Series.clip_lower uses the same docstring (the function is defined in generic.py) I thought that I should change the "See Also" description for Series.clip_lower to be the same. But when I tested the generation of the HTML I realized that if we use the "See Also" as it is right now, for Series.clip_lower we will not have a reference to DataFrame.clip_lower and it will have a reference to itself.
Does this make sense? Is there an elegant way to solve it?
Yes that makes sense. Typically in this case you could use the `@Substitution` decorator - you'll see some other functions using it in that same module.
That said, since your method is currently just inherited by `Series` and `DataFrame` and not necessarily overridden, I don't think there's a good way without overriding those implementations (if even just to call super) to implement Substitution. Unless you can think of a better way, I'd suggest you remove the `clip_lower` reference in See Also and open up a separate issue to try and make substitution work.
Nope, I can also only think of overriding in order for it to work. But is it worth the effort just to make the "See Also" complete? Would a reference to just clip_lower (without DataFrame. or Series. prefixes) be a good enough option?
I'll sleep through it to see if I can think of something better.
I would say just remove but if you come up with a better idea then by all means
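To make the trade-off in this thread concrete, here is a rough sketch of the "override just to call super" approach. It does not use pandas' own `Substitution` decorator; `substitute_doc`, `ToyGeneric`, `ToySeries`, and `ToyFrame` are hypothetical stand-ins invented for this illustration, and the docstring template is abbreviated.

```python
# Shared docstring template; %(other)s is filled in per subclass.
CLIP_LOWER_DOC = """
Trim values below a given threshold.

See Also
--------
%(other)s.clip_lower : Same operation on a %(other)s.
"""


def substitute_doc(template, **params):
    """Hypothetical helper: attach a %-formatted docstring to a function."""
    def decorator(func):
        func.__doc__ = template % params
        return func
    return decorator


class ToyGeneric:
    """Stand-in for the shared base class that holds the implementation."""

    def __init__(self, values):
        self.values = list(values)

    def clip_lower(self, threshold):
        """Trim values below a given threshold."""
        return type(self)(max(v, threshold) for v in self.values)


class ToySeries(ToyGeneric):
    # Overridden only so the docstring can point at the sibling class.
    @substitute_doc(CLIP_LOWER_DOC, other="ToyFrame")
    def clip_lower(self, threshold):
        return super().clip_lower(threshold)


class ToyFrame(ToyGeneric):
    @substitute_doc(CLIP_LOWER_DOC, other="ToySeries")
    def clip_lower(self, threshold):
        return super().clip_lower(threshold)


print(ToySeries.clip_lower.__doc__)
print(ToyFrame.clip_lower.__doc__)
```

Each subclass now documents a cross-reference to the other instead of to itself, at the cost of two pass-through overrides, which is the overhead the commenters are weighing against simply dropping the `clip_lower` entry from See Also.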