-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: add documentation to core.window.corr #20268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/window.py
Outdated
In the case of missing elements, only complete pairwise observations | ||
will be used.""") | ||
Calculate pairwise combinations of columns within a | ||
DataFrame. If other is not specified, defaults to True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put back ticks around parameters and built ins, so `other`, `True`, and `False` here.
pandas/core/window.py
Outdated
will be used.""") | ||
Calculate pairwise combinations of columns within a | ||
DataFrame. If other is not specified, defaults to True, | ||
otherwise defaults to False. Not relevant for Series. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:class:`~pandas.Series`
pandas/core/window.py
Outdated
|
||
Notes | ||
----- | ||
Other should be always be specified, except for DataFrame inputs with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`other` and :class:~`pandas.DataFrame`
pandas/core/window.py
Outdated
Function will return `NaN`s for correlations of equal valued sequences; | ||
this is the result of a 0/0 division error. | ||
|
||
When pairwise is set to `False`, only matching columns between self and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`self` and `other`
pandas/core/window.py
Outdated
%(name)s sample correlation | ||
Calculate %(name)s correlation. | ||
|
||
This function uses Pearson's definition of correlation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a link to Wikipedia or similar here?
pandas/core/window.py
Outdated
Notes | ||
----- | ||
Other should be always be specified, except for DataFrame inputs with | ||
pairwise set to `True`. All other input combinations will return all 1's. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pairwise" in single back-ticks
pandas/core/window.py
Outdated
Other should be always be specified, except for DataFrame inputs with | ||
pairwise set to `True`. All other input combinations will return all 1's. | ||
|
||
Function will return `NaN`s for correlations of equal valued sequences; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this. If the sequences are equally valued, like in the case of non specifying other
and pairwise=False
, the correlation of each column with itself should be all 1's. Am I wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true, but trivial. I'm rewording for clarity.
pandas/core/window.py
Outdated
other will be used. | ||
|
||
When pairwise is set to `True`, the output will be a MultiIndex DataFrame | ||
with the original index on the first level, and the "other" DataFrame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other in single back-ticks I believe
pandas/core/window.py
Outdated
with the original index on the first level, and the "other" DataFrame | ||
columns on the second level. | ||
|
||
In the case of missing elements, only complete pairwise observations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the correlation of "non-complete" elements will be set to NaN
? Can we write this in the explanation if so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct as currently implemented. I agree that this would be the desired behavior, but would require a separate pull request.
pandas/core/window.py
Outdated
>>> v1 = [3, 3, 3, 5, 8] | ||
>>> v2 = [3, 4, 4, 4, 8] | ||
>>> fmt = "{0:.6f}" # limit the printed precision to 6 digits | ||
>>> import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to import numpy, it's automatically imported for each docstring, see https://python-sprints.github.io/pandas/guide/pandas_docstring.html#conventions-for-the-examples
Hello @theandygross! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 08, 2018 at 14:31 Hours UTC |
pandas/core/window.py
Outdated
%(name)s sample correlation | ||
Calculate %(name)s correlation. | ||
|
||
This function uses Pearson's definition of correlation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move to Notes
pandas/core/window.py
Outdated
DataFrame. If `other` is not specified, defaults to `True`, | ||
otherwise defaults to `False`. Not relevant for :class:`~pandas.Series`. | ||
See notes. | ||
**kwargs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC we remove this. @TomAugspurger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is fine.
For the explanation, you can put "For compatibility with other %(name)s methods. Not used."
Codecov Report
@@ Coverage Diff @@
## master #20268 +/- ##
=========================================
Coverage ? 91.84%
=========================================
Files ? 153
Lines ? 49275
Branches ? 0
=========================================
Hits ? 45255
Misses ? 4020
Partials ? 0
Continue to review full report at Codecov.
|
Thanks @theandygross ! |
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):
scripts/validate_docstrings.py <your-function-or-method>
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single <your-function-or-method>
Please include the output of the validation script below between the "```" ticks:
If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.
Checklist for other PRs (remove this part if you are doing a PR for the pandas documentation sprint):
git diff upstream/master -u -- "*.py" | flake8 --diff