Add cookbook entry for triangular correlation matrix (closes #22840) #23032

Merged
merged 1 commit into pandas-dev:master from corr-cb2
Nov 3, 2018

Conversation

dsaxton
Member

@dsaxton dsaxton commented Oct 8, 2018

@codecov

codecov bot commented Oct 8, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@e0e948d). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #23032   +/-   ##
=========================================
  Coverage          ?   92.19%           
=========================================
  Files             ?      169           
  Lines             ?    50911           
  Branches          ?        0           
=========================================
  Hits              ?    46939           
  Misses            ?     3972           
  Partials          ?        0
Flag Coverage Δ
#multiple 90.61% <ø> (?)
#single 42.3% <ø> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e0e948d...6fe75a6. Read the comment docs.


.. ipython:: python

   df = pd.DataFrame(np.random.random(size=(100, 5)))
Member

Could you make the size a lot smaller (like (4, 4) max)? corr_mat would be huge otherwise.

Member Author

I think corr_mat is only 5 x 5 here

Member

Ah yes, my mistake. Do you mind rendering cookbook.rst and seeing what this section looks like? 5 x 5 should be relatively compact but just want to make sure the rendering isn't truncating any columns.

Member Author

Sure, will do

Member Author

It looks like it's rendering well:

In [175]: df = pd.DataFrame(np.random.random(size=(100, 5)))

In [176]: corr_mat = df.corr()

In [177]: mask = np.tril(np.ones_like(corr_mat, dtype=np.bool), k=-1)

In [178]: corr_mat.where(mask)
Out[178]: 
          0         1         2         3   4
0       NaN       NaN       NaN       NaN NaN
1  0.100443       NaN       NaN       NaN NaN
2  0.012441 -0.068965       NaN       NaN NaN
3  0.009641  0.078722 -0.067531       NaN NaN
4 -0.065089 -0.156980 -0.004463  0.075126 NaN
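
For anyone who wants to reproduce the session above outside the docs build, here is a minimal self-contained sketch. The seeded generator `rng` and the name `lower` are assumptions added for illustration (the original uses unseeded `np.random.random`), and the builtin `bool` is used as the dtype in place of `np.bool`, which should behave the same here.

```python
import numpy as np
import pandas as pd

# Seeded generator so repeated runs give the same numbers
# (the seed value 0 is arbitrary and not part of the PR).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random(size=(100, 5)))

corr_mat = df.corr()

# Boolean array that is True strictly below the diagonal;
# k=-1 excludes the diagonal itself.
mask = np.tril(np.ones_like(corr_mat, dtype=bool), k=-1)

# Entries where the mask is False are replaced by NaN.
lower = corr_mat.where(mask)
print(lower)
```

With 5 columns, the result keeps the 10 entries below the diagonal and sets the remaining 15 (diagonal plus upper triangle) to NaN, matching the shape of the output shown above.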

@@ -1226,6 +1226,17 @@ Computation
Correlation
***********

Often it's useful to obtain the lower (or upper) triangular form of a correlation matrix calculated from `DataFrame.corr`. This can be achieved by passing a boolean mask to `where` as follows:
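
The sentence above mentions the upper triangular form as well. As a hedged sketch of that variant (not part of the PR diff), the same `where` approach should work with `np.triu` in place of `np.tril`. The names `rng`, `upper_mask`, `upper`, and `upper_with_diag` are hypothetical and chosen only for this example.

```python
import numpy as np
import pandas as pd

# Seeded generator for repeatability (arbitrary seed, an assumption).
rng = np.random.default_rng(0)
corr_mat = pd.DataFrame(rng.random(size=(100, 5))).corr()

# True strictly above the diagonal; k=1 excludes the diagonal itself.
upper_mask = np.triu(np.ones_like(corr_mat, dtype=bool), k=1)
upper = corr_mat.where(upper_mask)

# With the default k=0 the diagonal is kept as well.
upper_with_diag = corr_mat.where(np.triu(np.ones_like(corr_mat, dtype=bool)))

print(upper)
```

Because a correlation matrix is symmetric, either triangle carries the same pairwise values; which one to keep is mostly a matter of how the result will be displayed.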
Member

I think you can add a :func: role in front of DataFrame.corr and put double backticks around where.

Member

@mroeschke mroeschke left a comment

LGTM.

@dsaxton
Member Author

dsaxton commented Nov 3, 2018

@mroeschke Just did a rebase onto master, good to merge?

@jreback jreback added this to the 0.24.0 milestone Nov 3, 2018
@jreback jreback merged commit 528ce15 into pandas-dev:master Nov 3, 2018
@jreback
Contributor

jreback commented Nov 3, 2018

thanks @dsaxton

pls have a look at the rendered docs http://pandas-docs.github.io/pandas-docs-travis/ when built and if any issues pls comment / issue a PR to fix.

@dsaxton
Member Author

dsaxton commented Nov 3, 2018

NP, thank you!

@dsaxton dsaxton deleted the corr-cb2 branch November 3, 2018 14:40
thoo added a commit to thoo/pandas that referenced this pull request Nov 3, 2018
…xamples

* repo_org/master: (66 commits)
  CLN: doc string (pandas-dev#23469)
  DOC: Add cookbook entry for triangular correlation matrix (GH22840) (pandas-dev#23032)
  add number of Errors, Warnings to scripts/validate_docstrings.py (pandas-dev#23150)
  BUG: Allow freq conversion from dt64 to period (pandas-dev#23460)
  ENH: Add FrozenList.union and .difference (pandas-dev#23394)
  REF: cython cleanup, typing, optimizations (pandas-dev#23464)
  strictness and checks for Timedelta _simple_new (pandas-dev#23433)
  Fixing flake8 problems new to flake8 3.6.0 (pandas-dev#23472)
  DOC: Updating the docstring of Series.dot  (pandas-dev#22890)
  TST: Fixturize series/test_analytics.py (pandas-dev#22755)
  BUG/ENH: Handle NonexistentTimeError in date rounding (pandas-dev#23406)
  PERF: speed up concat on Series by making _get_axis_number() a classmethod (pandas-dev#23404)
  REF: Remove DatetimelikeArrayMixin._shallow_copy (pandas-dev#23430)
  REF: strictness/simplification in DatetimeArray/Index _simple_new (pandas-dev#23431)
  REF: cython cleanup, typing, optimizations (pandas-dev#23456)
  TST: tweak Hypothesis configuration and idioms (pandas-dev#23441)
  BUG: fix HDFStore.append with all empty strings error (GH12242) (pandas-dev#23435)
  TST: Skip 32bit failing IntervalTree tests (pandas-dev#23442)
  BUG: Deprecate nthreads argument (pandas-dev#23112)
  style: fix import format at pandas/core/reshape (pandas-dev#23387)
  ...
Development

Successfully merging this pull request may close these issues.

ENH: Allow option to return lower triangular correlation matrix in DataFrame.corr
4 participants