add visualization of k means clustering as excel format #2104
Hey @beqakd, TravisCI finished with status TravisBuddy Request Identifier: 9d0eb2c0-acac-11ea-8245-836d9da6cb4c
Hey @beqakd, TravisCI finished with status TravisBuddy Request Identifier: 640c3620-acc8-11ea-8245-836d9da6cb4c
Hey @beqakd, TravisCI finished with status TravisBuddy Request Identifier: e4967240-ace3-11ea-8245-836d9da6cb4c
The error is not from my PR. The style issues are fixed.
machine_learning/k_means_clust.py
Outdated
@@ -202,3 +204,127 @@ def kmeans(
        verbose=True,
    )
    plot_heterogeneity(heterogeneity, k)


def ReportGenerator(df, ClusteringVariables, FillMissingReport=None):
Type hints? Doctests? Function and variable names need to be snake_case
-- See CONTRIBUTING.md.
Type hints are okay, but I can't write doctests. It returns a pandas DataFrame; can you suggest some ideas for how I can do it? We can test manually to be sure that it works.
Is it impossible to examine various elements of a dataframe to ensure that those elements contain expected values?
It's not, but I won't be able to check all of them. I will do thorough testing with different approaches. Thanks!
Just a few sanity checks will be good enough for our purposes. Thx.
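A minimal sketch of what such sanity checks could look like for a function that returns a DataFrame. The helper `cluster_sizes` and the column name `Cluster` are hypothetical, chosen only for illustration; they are not taken from the PR. The doctest inspects the shape, the column names, and one column's values rather than comparing the whole printed frame:

```python
import doctest

import pandas as pd


def cluster_sizes(df: pd.DataFrame, cluster_column: str = "Cluster") -> pd.DataFrame:
    """
    Count the rows in each cluster (hypothetical helper, not part of the PR).

    >>> data = pd.DataFrame({"spend": [1.0, 3.0, 10.0], "Cluster": [0, 0, 1]})
    >>> report = cluster_sizes(data)
    >>> report.shape
    (2, 2)
    >>> list(report.columns)
    ['Cluster', 'size']
    >>> report["size"].tolist()
    [2, 1]
    """
    # groupby(...).size() yields a Series indexed by cluster label
    sizes = df.groupby(cluster_column).size()
    # Name the counts and move the cluster label back into a regular column
    return sizes.rename("size").reset_index()


if __name__ == "__main__":
    doctest.testmod()
```

Checking a handful of elements this way keeps the doctest stable even if pandas changes how it prints a DataFrame.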
machine_learning/k_means_clust.py
Outdated
    """
    Function generates easy-erading clustering report. It takes 2 arguments as an input:
    DataFrame - dataframe with predicted cluester column;
    FillMissingReport - dcitionary of rules how we are going to fill missing
- FillMissingReport - dcitionary of rules how we are going to fill missing
+ FillMissingReport - dictionary of rules how we are going to fill missing
Hey @beqakd, TravisCI finished with status TravisBuddy Request Identifier: ccec7b60-b21c-11ea-b4e4-a33918451e6b
Hey @beqakd, TravisCI finished with status TravisBuddy Request Identifier: ba40dc30-b21d-11ea-b4e4-a33918451e6b
Co-authored-by: Christian Clauss <[email protected]>
Co-authored-by: Christian Clauss <[email protected]>
Thanks for your work here!
Thanks a lot!
…s#2104)
* add visualization of kmneas clust as excel format
* style changes
* style changes
* Add doctest and typehint!
* style change
* Update machine_learning/k_means_clust.py
Co-authored-by: Christian Clauss <[email protected]>
* Update machine_learning/k_means_clust.py
Co-authored-by: Christian Clauss <[email protected]>
Co-authored-by: Christian Clauss <[email protected]>
Describe your change:
Add a feature that converts a DataFrame containing a cluster-number column into an Excel-style report. The report is easy to read and includes summary statistics that help navigate the data (mean, mean_with_zero, median, max, min, etc.).
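As a rough illustration of the idea, and not the PR's actual implementation, a per-cluster summary can be built with a pandas `groupby` followed by `agg`. The function name `summarize_clusters`, the column name `Cluster`, and the sample data are all assumptions made for this sketch. It covers only mean, median, max, and min; `mean_with_zero` is left out because its definition is not given here.

```python
import pandas as pd


def summarize_clusters(
    df: pd.DataFrame, cluster_column: str = "Cluster"
) -> pd.DataFrame:
    """Return one row per (feature, statistic) and one column per cluster."""
    stats = ["mean", "median", "max", "min"]
    # Aggregating with a list of names gives (feature, statistic) MultiIndex columns
    summary = df.groupby(cluster_column).agg(stats)
    # Transpose so that clusters become columns, which reads well in a spreadsheet
    return summary.T


# Hypothetical sample data with an already-predicted cluster column
data = pd.DataFrame(
    {"spend": [1.0, 3.0, 10.0], "visits": [2, 4, 6], "Cluster": [0, 0, 1]}
)
report = summarize_clusters(data)
print(report)
```

The resulting frame could then be written out with `report.to_excel(...)`, which needs an Excel writer engine such as openpyxl to be installed.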
Checklist: