Add cookbook entry for triangular correlation matrix (closes #22840) #23032
Codecov Report
@@            Coverage Diff            @@
##             master   #23032   +/-   ##
=========================================
  Coverage          ?   92.19%
=========================================
  Files             ?      169
  Lines             ?    50911
  Branches          ?        0
=========================================
  Hits              ?    46939
  Misses            ?     3972
  Partials          ?        0
.. ipython:: python

   df = pd.DataFrame(np.random.random(size=(100, 5)))
Could you make the size a lot smaller (like (4, 4) max)? `corr_mat` would be huge otherwise.
I think `corr_mat` is only 5 x 5 here.
Ah yes, my mistake. Do you mind rendering `cookbook.rst` and seeing what this section looks like? 5 x 5 should be relatively compact, but I just want to make sure the rendering isn't truncating any columns.
Sure, will do
It looks like it's rendering well:
In [175]: df = pd.DataFrame(np.random.random(size=(100, 5)))

In [176]: corr_mat = df.corr()

In [177]: mask = np.tril(np.ones_like(corr_mat, dtype=np.bool), k=-1)

In [178]: corr_mat.where(mask)
Out[178]:
          0         1         2         3   4
0       NaN       NaN       NaN       NaN NaN
1  0.100443       NaN       NaN       NaN NaN
2  0.012441 -0.068965       NaN       NaN NaN
3  0.009641  0.078722 -0.067531       NaN NaN
4 -0.065089 -0.156980 -0.004463  0.075126 NaN
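
For anyone trying this outside the docs build, here is a minimal standalone sketch of the same masking idea. It is not the cookbook text itself. It makes a few assumptions that are not part of the PR: the builtin `bool` is used instead of `np.bool` (that alias is not available in every NumPy release), the mask is built from `corr_mat.shape` with `np.ones` rather than `np.ones_like`, a seeded generator is used only so the run is repeatable, and the upper-triangular variant with `np.triu` is added for comparison.

```python
import numpy as np
import pandas as pd

# Seeded generator is an assumption made only for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random(size=(100, 5)))
corr_mat = df.corr()

# Strictly lower-triangular mask: True below the diagonal, False elsewhere.
# Built from the shape (an assumption; the PR uses np.ones_like on corr_mat).
lower_mask = np.tril(np.ones(corr_mat.shape, dtype=bool), k=-1)
lower = corr_mat.where(lower_mask)  # entries where the mask is False become NaN

# Strictly upper-triangular counterpart, using np.triu with k=1.
upper_mask = np.triu(np.ones(corr_mat.shape, dtype=bool), k=1)
upper = corr_mat.where(upper_mask)

print(lower)
print(upper)
```

With `k=-1` (or `k=1`) the diagonal is masked out as well, which matches the output above where the diagonal is `NaN`; using `k=0` would keep the diagonal of ones.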
doc/source/cookbook.rst
Outdated
@@ -1226,6 +1226,17 @@ Computation
Correlation
***********

Often it's useful to obtain the lower (or upper) triangular form of a correlation matrix calculated from `DataFrame.corr`. This can be achieved by passing a boolean mask to `where` as follows:
I think you can add a `func:` in front of `DataFrame.corr` and double backticks around `where`.
LGTM.
@mroeschke Just did a rebase onto master, good to merge?
Thanks @dsaxton. Please have a look at the rendered docs at http://pandas-docs.github.io/pandas-docs-travis/ once they are built, and if there are any issues, please comment or open a PR to fix them.
NP, thank you!
git diff upstream/master -u -- "*.py" | flake8 --diff