Add index feature #26

GloriaWYY · 2022-07-19T02:24:10Z

Because the Python and R versions of the book use different functions, I need to modify several index names; for example, tidymodels has to be changed to scikit-learn. In some cases there is no one-to-one match between functions, so some index entries are combined into a single entry.

Below, I document the index entries that I have changed, for easier tracking:

classification 1

glimpse -> info
DELETE: \index{factor!as_factor}, \index{levels}\index{factor!levels}
group_by -> groupby, summarize -> count
mutate -> assign
\index{tidymodels}\index{parsnip} -> scikit-learn
tidymodels!model specification -> scikit-learn; model instance
tidymodels!engine -> scikit-learn; KNeighborsClassifier
tidymodels!model formula -> scikit-learn; X & y
tidymodels!predict -> scikit-learn; predict
recipe -> pipeline
recipe!step_scale, recipe!step_center -> scikit-learn; StandardScaler
ADD: scikit-learn; ColumnTransformer
tidymodels!prep -> scikit-learn; fit
tidymodels!bake -> scikit-learn; transform
DELETE: \index{recipe!all_predictors}
recipe!step_upsample -> scikit-learn; resample
DELETE: tidymodels!add_recipe, tidymodels!add_model
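
For reviewers less familiar with the Python side, here is a minimal sketch of how the classification 1 entries above fit together in scikit-learn. The data frame and its column names are made up for illustration and are not taken from the book.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; column names are illustrative only.
toy = pd.DataFrame({
    "perimeter": [1.0, 1.2, 0.9, 3.1, 3.3, 2.9],
    "concavity": [10.0, 11.0, 9.5, 30.0, 31.5, 29.0],
    "label": ["B", "B", "B", "M", "M", "M"],
})

X = toy[["perimeter", "concavity"]]  # predictors (takes the place of the model formula)
y = toy["label"]                     # response

# StandardScaler both centers and scales, so it covers step_center and step_scale.
preprocessor = ColumnTransformer(
    [("scale", StandardScaler(), ["perimeter", "concavity"])]
)

# fit + transform correspond roughly to prep + bake.
scaled = preprocessor.fit(X).transform(X)

# The pipeline bundles preprocessing and the model instance.
knn_pipeline = Pipeline([
    ("preprocess", preprocessor),
    ("knn", KNeighborsClassifier(n_neighbors=3)),
])
knn_pipeline.fit(X, y)

new_obs = pd.DataFrame({"perimeter": [1.1], "concavity": [10.5]})
prediction = knn_pipeline.predict(new_obs)
```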

classification 2

\index{seed!set.seed} -> seed; numpy.random.seed
\index{sample!function} -> sample; numpy.random.choice
tidymodels -> scikit-learn
\index{tidymodels!initial_split} -> scikit-learn; train_test_split
glimpse -> info
\index{tidymodels!vfold_cv}\index{cross-validation!vfold_cv} -> cross-validation; cross_validate, scikit-learn; cross_validate
DELETE tidymodels!fit_resamples
DELETE \index{tidymodels!collect_metrics}\index{cross-validation!collect_metrics}
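
As a rough illustration of the classification 2 replacements, the sketch below sets a seed, splits the data, and runs cross-validation. The data are randomly generated placeholders, and the choice of 5 folds and 3 neighbors is an assumption for the example only.

```python
import numpy as np
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Replaces set.seed: seed NumPy's global random number generator.
np.random.seed(1)

# Placeholder data: 60 observations, 2 predictors, binary label.
X = np.random.normal(size=(60, 2))
y = (X[:, 0] > 0).astype(int)

# Replaces initial_split: hold out 25% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, stratify=y
)

# Replaces vfold_cv + fit_resamples + collect_metrics in a single call.
cv_results = cross_validate(
    KNeighborsClassifier(n_neighbors=3), X_train, y_train, cv=5
)
mean_accuracy = cv_results["test_score"].mean()
```

`cross_validate` returns a dictionary, so the per-fold scores can be wrapped in a `pandas.DataFrame` if a table is easier to read.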

regression 1

DELETE \index{seed!set.seed}
\index{ggplot!geom_point} -> altair; mark_circle
\index{slice_sample} -> pandas.DataFrame; sample
\index{mutate}\index{slice}\index{arrange}\index{abs} -> pandas.DataFrame; assign, head, pandas.DataFrame; sort_values, abs
\index{tidymodels}\index{recipe}\index{workflow} -> scikit-learn, scikit-learn; pipeline, scikit-learn; make_pipeline, scikit-learn; make_column_transformer
DELETE \index{cross-validation!collect_metrics} -> ADD scikit-learn; GridSearchCV
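
A minimal sketch of the regression 1 entries, assuming a K-nearest neighbors regressor tuned over a small grid. The data are synthetic, and the column names, grid values, and scoring metric are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

np.random.seed(0)

# Synthetic placeholder data: price grows roughly linearly with size.
houses = pd.DataFrame({"sqft": np.linspace(500, 3000, 40)})
houses["price"] = 100 * houses["sqft"] + np.random.normal(0, 5000, size=40)

# make_column_transformer and make_pipeline name their steps automatically
# (lowercased class names), which is what the parameter grid keys refer to.
pipeline = make_pipeline(
    make_column_transformer((StandardScaler(), ["sqft"])),
    KNeighborsRegressor(),
)
param_grid = {"kneighborsregressor__n_neighbors": [1, 3, 5, 7]}

# GridSearchCV runs cross-validation for every value in the grid.
grid = GridSearchCV(
    pipeline, param_grid, cv=5, scoring="neg_root_mean_squared_error"
)
grid.fit(houses[["sqft"]], houses["price"])

best_k = grid.best_params_["kneighborsregressor__n_neighbors"]
results = pd.DataFrame(grid.cv_results_)
```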

regression 2

tidymodels -> scikit-learn
\index{seed!set.seed} -> scikit-learn; random_state
\index{regression!multivariable linear}\index{regression!multivariable linear equation|see{plane equation}} ->
regression; multivariable linear, regression; multivariable linear equation
see: multivariable linear equation; plane equation

inference

DELETE \index{seed!set.seed}
\index{pull}\index{sum}\index{nrow} -> pandas.DataFrame; df[], count, len
\index{rep_sample_n} -> pandas.DataFrame; sample
DELETE \index{infer}
DELETE \index{rep_sample_n!reps argument}\index{rep_sample_n!size argument}
\index{bootstrap!in R}\index{rep_sample_n!bootstrap} -> bootstrap; in Python, scikit-learn; resample (bootstrap)
\index{quantile} -> numpy; percentile
\index{pull}\index{select} -> pandas.DataFrame; df[]
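
The inference replacements can be sketched as follows. Note that this example draws bootstrap samples with `pandas.DataFrame.sample(replace=True)` rather than scikit-learn's `resample`, purely to keep the sketch short; the population values are placeholders.

```python
import numpy as np
import pandas as pd

# Placeholder population of 100 values.
population = pd.DataFrame({"price": np.arange(1, 101, dtype=float)})

# Replaces rep_sample_n with a single sample of size 40.
one_sample = population.sample(n=40, random_state=0)

# Bootstrap: resample the sample with replacement, same size, many times.
boot_means = [
    one_sample.sample(frac=1, replace=True, random_state=i)["price"].mean()
    for i in range(200)
]

# Replaces quantile: 95% percentile interval for the mean.
lower, upper = np.percentile(boot_means, [2.5, 97.5])

# Replaces pull / nrow: select a column with df[] and count rows with len.
n_obs = len(one_sample["price"])
```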

wrangling

ADD pandas; data frame
DELETE \index{vector}\index{atomic vector|see{vector}}\index{c function}
\index{data types}\index{character}\index{chr|see{character}}\index{integer}\index{int|see{integer}}\index{double}\index{dbl|see{double}}\index{logical}\index{lgl|see{logical}}\index{factor}\index{fct|see{factor}} -> data types, string, integer, floating point number, boolean, list, set, dictionary, tuple, none
class -> type
DELETE \index{tibble}
\index{pivot_longer} -> pandas.DataFrame; melt
\index{pivot_wider} -> pandas.DataFrame; pivot
\index{separate} -> pandas.Series; str.split
DELETE \index{select!helpers}
\index{select!starts_with} -> pandas.Series; str.startswith
\index{select!contains} -> pandas.Series; str.contains
\index{mutate} -> pandas.DataFrame; df[]
DELETE \index{pipe}\index{aaapipesymb@\vert{}>|see{pipe}}
ADD chaining methods
\index{NA|see{missing data}} -> see: NaN; missing data
\index{group_by} -> pandas.DataFrame; groupby
DELETE \index{across} \index{map} \index{map!map_* functions}
ADD pandas.DataFrame; apply
DELETE \index{rowwise}
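
A short sketch of several wrangling replacements (melt, pivot, str.split, groupby, and method chaining). The table below is invented for the example.

```python
import pandas as pd

# Invented wide-format table.
wide = pd.DataFrame({
    "region": ["A", "B"],
    "2016": [10, 20],
    "2021": [15, 25],
})

# Replaces pivot_longer.
long = wide.melt(id_vars="region", var_name="year", value_name="count")

# Replaces pivot_wider.
back = long.pivot(index="region", columns="year", values="count").reset_index()

# Replaces separate: split one string column into two.
ranges = pd.Series(["10/20", "30/40"]).str.split("/", expand=True)

# Replaces the pipe: chain methods with "." instead.
totals = (
    long.groupby("region")["count"]
    .sum()
    .reset_index()
)
```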

intro

\index{library} -> import
\index{tidyverse} -> pandas
\index{filter}\index{select} -> pandas.DataFrame; df[], pandas.DataFrame; loc[]
\index{arrange}\index{slice} -> pandas.DataFrame; sort_values, pandas.DataFrame; iloc[]
\index{ggplot} -> altair
\index{aaaplussymb@$+$|see{ggplot (add layer)}} -> see: .; chaining methods
\index{plot!layers} -> plot; labels
\index{reorder} -> altair; sort
\index{aaaquestionmark@?|see{documentation}}\index{help|see{documentation}}\index{documentation} ->
documentation
see: help; documentation
see: doc; documentation
tidyverse -> pandas
\index{warning} -> Error
\index{read function!skip argument} -> read function; skiprows argument
\index{read function!delim argument} -> read function; sep argument
\index{rename} -> pandas.DataFrame; rename
\index{read function!col_names argument} -> read function; names argument
DELETE \index{readxl}
ADD SQLAlchemy, SQLAlchemy; create_engine, database; SQLAlchemy
\index{database!tbl} -> database; select, SQLAlchemy; select
\index{database!collect} -> database; fetchall, SQLAlchemy; fetchall
\index{database!show_query} -> database; show query, SQLAlchemy; query.compile
filter -> database; filter data, SQLAlchemy; where
\index{nrow} -> pandas.DataFrame; shape
\index{tail} -> pandas.DataFrame; tail
\index{write function!write_csv} -> write function; to_csv, pandas.DataFrame; to_csv
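
For the pandas entries in this list, a minimal sketch of the reading, filtering, sorting, and writing replacements is given here. The file contents are invented and read from an in-memory string so the example is self-contained; the SQLAlchemy entries are not covered.

```python
import io
import pandas as pd

# Invented file contents: one metadata line, ";" as delimiter, no header row.
raw = io.StringIO(
    "metadata line to skip\n"
    "city_a;2500000\n"
    "city_b;2700000\n"
    "city_c;990000\n"
)

# skiprows, sep, and names replace the skip, delim, and col_names arguments.
cities = pd.read_csv(raw, skiprows=1, sep=";", names=["city", "population"])

# Replaces filter + select.
large = cities.loc[cities["population"] > 1_000_000, ["city", "population"]]

# Replaces arrange + slice.
top_two = cities.sort_values(by="population", ascending=False).iloc[0:2]

# Replaces nrow: shape gives (rows, columns).
n_rows, n_cols = cities.shape

# Replaces write_csv; with no path, to_csv returns the CSV text.
csv_text = cities.to_csv(index=False)
```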

clustering

\index{seed!set.seed} -> seed; numpy.random.seed
DELETE \index{mutate}
\index{ggplot}\index{ggplot!geom_point} -> altair, altair; mark_circle
ADD scikit-learn; KMeans
DELETE \index{broom}\index{broom}\index{augment}
ADD K-means; inertia_, K-means; cluster_centers_, K-means; labels_, K-means; predict
\index{K-means!restart, nstart} -> K-means; init argument
\index{WSSD!total} -> WSSD; total, K-means; inertia_
see: WSSD; K-means inertia
mutate -> pandas.DataFrame; assign
DELETE \index{rowwise}\index{glance}
ADD pandas.DataFrame; iloc[]
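
The K-means attributes added above can be seen in a small sketch. The six points are made up, and `n_init` and `random_state` are set explicitly only so the example behaves the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up points forming two well-separated groups.
points = np.array(
    [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float
)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

labels = kmeans.labels_            # cluster assignment for each point
centers = kmeans.cluster_centers_  # one row per cluster center
total_wssd = kmeans.inertia_       # total within-cluster sum of squared distances

# predict assigns new observations to the nearest fitted center.
new_label = kmeans.predict(np.array([[0.5, 0.5]]))[0]
```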

viz

ggplot -> altair
\index{ggplot!aesthetic mapping}\index{ggplot!geometric object} -> altair; geometric object, altair; geometric encoding, geometric object, geometric encoding
DELETE \index{ggplot!aes}\index{ggplot!geom_point}
\index{ggplot!geom_line} -> altair; mark_line
\index{ggplot!xlab,ylab}\index{ggplot!theme} -> altair; alt.X, altair; alt.Y, altair; configure_axis
\index{ggplot!scales} -> altair; alt.Scale
\index{ggplot!geom_point} -> altair; mark_circle
filter -> pandas.DataFrame; loc[]
\index{ggplot!logarithmic scaling} -> altair; logarithmic scaling
\index{mutate}\index{select} -> pandas.DataFrame; assign, pandas.DataFrame; [[]]
DELETE \index{color palette}
\index{ggplot!reorder} -> altair; sort
\index{ggplot!geom_vline} -> altair; mark_rule
\index{factor}\index{factor!usage in ggplot} -> nominal, altair; :N
\index{ggplot!facet_grid} -> altair; facet
\index{ggplot!add layer} -> altair; +
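
The right-hand sides throughout this list use `;` to separate an entry from its sub-entry and `see:` for cross-references, where the LaTeX index uses `!` and `|see{}`. Assuming that reading of the notation is right, the syntax change alone (not the R-to-Python renaming) can be sketched with a small hypothetical helper; `latex_to_semicolon` is not part of the book's tooling.

```python
import re

# Matches \index{entry} and \index{entry|see{target}}; entries containing
# nested braces (e.g. symbol entries) are deliberately not handled.
INDEX_PATTERN = re.compile(r"\\index\{([^{}|]+)(?:\|see\{([^{}]+)\})?\}")


def latex_to_semicolon(text: str) -> list[str]:
    """Convert LaTeX \\index commands to 'entry; sub-entry' strings."""
    converted = []
    for entry, see_target in INDEX_PATTERN.findall(text):
        if see_target:
            converted.append(f"see: {entry}; {see_target}")
        else:
            converted.append(entry.replace("!", "; "))
    return converted
```

For example, `latex_to_semicolon(r"\index{seed!set.seed}")` gives `["seed; set.seed"]`; the function name on the right still has to be replaced by hand.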


GloriaWYY · 2022-07-28T23:30:00Z

Changes in citations

classification 1

[@parsnip][@recipes] -> {cite:p}scikit-learn

classification 2

[@wickham2016r] -> {cite:p}mckinney2012python

regression 2

Need to replace Modern Dive [@moderndive]; see issue regression2 #20

wrangling

DELETE [@dplyr] [@tidyselect] [@wickham2016r]
ADD {cite:p}mckinney2012python

intro

[@tidyverse; @wickham2019tidverse] -> {cite:p}reback2020pandas,mckinney-proc-scipy-2010
[@tidyversestyleguide] -> {cite:p}pep8-style-guide

reading

The content from this section onward has not been edited (it is still the R version), since Trevor said the instructor team will decide at a later stage whether it will be kept.
[@wickham2016r] -> {cite:p}mckinney2012python
DELETE [readr documentation] [@here] [readxl documentation][@rio]
ADD pandas documentation

viz

[@ggplot] -> {cite:p}altair
DELETE [@wickham2016r]
ADD {cite:p}mckinney2012python

trevorcampbell · 2022-08-10T19:37:52Z

This issue is done in PR #30 -- but I'll leave it open for now as a checklist for when we review/edit in detail.

trevorcampbell added the "1st edition" label (Planned for inclusion in 1st print edition) on Sep 17, 2023

trevorcampbell mentioned this issue Nov 16, 2023

Index Update #316

Merged

trevorcampbell closed this as completed in #316 Nov 17, 2023
