From d691ba0db0df7d70a88099704f3e6481eecb17a9 Mon Sep 17 00:00:00 2001
From: Ben Nathanson
Date: Sat, 23 May 2020 10:14:26 -0400
Subject: [PATCH] DOC: Tighten ecosystem->data science per #242

Proposed edit per @rgommers comment in #242:
"Still would like to make this tab a little more compact."
---
 layouts/partials/data-science.html | 84 ++++++------------------------
 1 file changed, 15 insertions(+), 69 deletions(-)

diff --git a/layouts/partials/data-science.html b/layouts/partials/data-science.html
index af6bf16569..167cadf8f7 100644
--- a/layouts/partials/data-science.html
+++ b/layouts/partials/data-science.html
@@ -8,74 +8,20 @@

- NumPy lies at the core of a rich ecosystem of data science libraries.
+ NumPy lies at the core of a rich ecosystem of data science libraries:

-

- Data science is the analysis of massive amounts of data
- to gain insight. A typical workflow might be:
-
-

-

-
- -
-
-

- Pandas helps in data discovery and handling,
- Intake helps with
- data access and distribution, while
- Beautiful Soup
- is widely used for web-scraping and gathering data sets.
- Seaborn is well known for
- exploratory data analysis (EDA);
- scikit-learn and
- SciPy (statistical computing) serve some
- of the backbone processes required for machine learning (regression methods,
- classification, clustering, model validation and selection).
- Statistical data exploration, estimation of various statistical models,
- and conducting statistical tests are some of the functions offered by
- statsmodels.
-

-
-
- Diagram of three overlapping circle. The circles labeled 'Mathematics', 'Computer Science' and 'Domain Expertise'. In the middle of the diagram, which has the three circles overlapping it, is an area labeled 'Data Science'.
+
-
-

- Effective data analytics requires deep knowledge of the data domain (e.g.,
- retail, healthcare, marketing, finance, social media, automation, sales, travel,
- etc.) as well as other core disciplines of data science, data engineering, and
- data visualization. Tools such as MLFlow address
- experiment hyperparameter and result tracking needs, while
- DVC provides data version control for data science
- and machine learning workflows.
-

-