diff --git a/doc/source/getting_started/comparison/comparison_with_sas.rst b/doc/source/getting_started/comparison/comparison_with_sas.rst
index eb11b75027909..b97efe31b8b29 100644
--- a/doc/source/getting_started/comparison/comparison_with_sas.rst
+++ b/doc/source/getting_started/comparison/comparison_with_sas.rst
@@ -308,8 +308,8 @@ Sorting in SAS is accomplished via ``PROC SORT``
 String processing
 -----------------
 
-Length
-~~~~~~
+Finding length of string
+~~~~~~~~~~~~~~~~~~~~~~~~
 
 SAS determines the length of a character string with the
 `LENGTHN <https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002284668.htm>`__
@@ -327,8 +327,8 @@ functions. ``LENGTHN`` excludes trailing blanks and ``LENGTHC`` includes trailin
 .. include:: includes/length.rst
 
 
-Find
-~~~~
+Finding position of substring
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 SAS determines the position of a character in a string with the
 `FINDW <https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002978282.htm>`__ function.
@@ -342,19 +342,11 @@ you supply as the second argument.
    put(FINDW(sex,'ale'));
    run;
 
-Python determines the position of a character in a string with the
-``find`` function.  ``find`` searches for the first position of the
-substring.  If the substring is found, the function returns its
-position.  Keep in mind that Python indexes are zero-based and
-the function will return -1 if it fails to find the substring.
-
-.. ipython:: python
-
-   tips["sex"].str.find("ale").head()
+.. include:: includes/find_substring.rst
 
 
-Substring
-~~~~~~~~~
+Extracting substring by position
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 SAS extracts a substring from a string based on its position with the
 `SUBSTR <https://www2.sas.com/proceedings/sugi25/25/cc/25p088.pdf>`__ function.
@@ -366,17 +358,11 @@ SAS extracts a substring from a string based on its position with the
    put(substr(sex,1,1));
    run;
 
-With pandas you can use ``[]`` notation to extract a substring
-from a string by position locations.  Keep in mind that Python
-indexes are zero-based.
+.. include:: includes/extract_substring.rst
 
-.. ipython:: python
 
-   tips["sex"].str[0:1].head()
-
-
-Scan
-~~~~
+Extracting nth word
+~~~~~~~~~~~~~~~~~~~
 
 The SAS `SCAN <https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000214639.htm>`__
 function returns the nth word from a string. The first argument is the string you want to parse and the
@@ -394,20 +380,11 @@ second argument specifies which word you want to extract.
    ;;;
    run;
 
-Python extracts a substring from a string based on its text
-by using regular expressions. There are much more powerful
-approaches, but this just shows a simple approach.
-
-.. ipython:: python
-
-   firstlast = pd.DataFrame({"String": ["John Smith", "Jane Cook"]})
-   firstlast["First_Name"] = firstlast["String"].str.split(" ", expand=True)[0]
-   firstlast["Last_Name"] = firstlast["String"].str.rsplit(" ", expand=True)[0]
-   firstlast
+.. include:: includes/nth_word.rst
 
 
-Upcase, lowcase, and propcase
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Changing case
+~~~~~~~~~~~~~
 
 The SAS `UPCASE <https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245965.htm>`__
 `LOWCASE <https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245912.htm>`__ and
@@ -427,27 +404,13 @@ functions change the case of the argument.
    ;;;
    run;
 
-The equivalent Python functions are ``upper``, ``lower``, and ``title``.
+.. include:: includes/case.rst
 
-.. ipython:: python
-
-   firstlast = pd.DataFrame({"String": ["John Smith", "Jane Cook"]})
-   firstlast["string_up"] = firstlast["String"].str.upper()
-   firstlast["string_low"] = firstlast["String"].str.lower()
-   firstlast["string_prop"] = firstlast["String"].str.title()
-   firstlast
 
 Merging
 -------
 
-The following tables will be used in the merge examples
-
-.. ipython:: python
-
-   df1 = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": np.random.randn(4)})
-   df1
-   df2 = pd.DataFrame({"key": ["B", "D", "D", "E"], "value": np.random.randn(4)})
-   df2
+.. include:: includes/merge_setup.rst
 
 In SAS, data must be explicitly sorted before merging.  Different
 types of joins are accomplished using the ``in=`` dummy
@@ -473,39 +436,13 @@ input frames.
        if a or b then output outer_join;
    run;
 
-pandas DataFrames have a :meth:`~DataFrame.merge` method, which provides
-similar functionality.  Note that the data does not have
-to be sorted ahead of time, and different join
-types are accomplished via the ``how`` keyword.
-
-.. ipython:: python
-
-   inner_join = df1.merge(df2, on=["key"], how="inner")
-   inner_join
-
-   left_join = df1.merge(df2, on=["key"], how="left")
-   left_join
-
-   right_join = df1.merge(df2, on=["key"], how="right")
-   right_join
-
-   outer_join = df1.merge(df2, on=["key"], how="outer")
-   outer_join
+.. include:: includes/merge.rst
 
 
 Missing data
 ------------
 
-Like SAS, pandas has a representation for missing data - which is the
-special float value ``NaN`` (not a number).  Many of the semantics
-are the same, for example missing data propagates through numeric
-operations, and is ignored by default for aggregations.
-
-.. ipython:: python
-
-   outer_join
-   outer_join["value_x"] + outer_join["value_y"]
-   outer_join["value_x"].sum()
+.. include:: includes/missing_intro.rst
 
 One difference is that missing data cannot be compared to its sentinel value.
 For example, in SAS you could do this to filter missing values.
@@ -522,25 +459,7 @@ For example, in SAS you could do this to filter missing values.
        if value_x ^= .;
    run;
 
-Which doesn't work in pandas.  Instead, the ``pd.isna`` or ``pd.notna`` functions
-should be used for comparisons.
-
-.. ipython:: python
-
-   outer_join[pd.isna(outer_join["value_x"])]
-   outer_join[pd.notna(outer_join["value_x"])]
-
-pandas also provides a variety of methods to work with missing data - some of
-which would be challenging to express in SAS. For example, there are methods to
-drop all rows with any missing values, replacing missing values with a specified
-value, like the mean, or forward filling from previous rows. See the
-:ref:`missing data documentation<missing_data>` for more.
-
-.. ipython:: python
-
-   outer_join.dropna()
-   outer_join.fillna(method="ffill")
-   outer_join["value_x"].fillna(outer_join["value_x"].mean())
+.. include:: includes/missing.rst
 
 
 GroupBy
@@ -549,7 +468,7 @@ GroupBy
 Aggregation
 ~~~~~~~~~~~
 
-SAS's PROC SUMMARY can be used to group by one or
+SAS's ``PROC SUMMARY`` can be used to group by one or
 more key variables and compute aggregations on
 numeric columns.
 
@@ -561,14 +480,7 @@ numeric columns.
        output out=tips_summed sum=;
    run;
 
-pandas provides a flexible ``groupby`` mechanism that
-allows similar aggregations.  See the :ref:`groupby documentation<groupby>`
-for more details and examples.
-
-.. ipython:: python
-
-   tips_summed = tips.groupby(["sex", "smoker"])[["total_bill", "tip"]].sum()
-   tips_summed.head()
+.. include:: includes/groupby.rst
 
 
 Transformation
@@ -597,16 +509,7 @@ example, to subtract the mean for each observation by smoker group.
        if a and b;
    run;
 
-
-pandas ``groupby`` provides a ``transform`` mechanism that allows
-these type of operations to be succinctly expressed in one
-operation.
-
-.. ipython:: python
-
-   gb = tips.groupby("smoker")["total_bill"]
-   tips["adj_total_bill"] = tips["total_bill"] - gb.transform("mean")
-   tips.head()
+.. include:: includes/transform.rst
 
 
 By group processing
diff --git a/doc/source/getting_started/comparison/comparison_with_stata.rst b/doc/source/getting_started/comparison/comparison_with_stata.rst
index d1ad18bddb0a7..ca536e7273870 100644
--- a/doc/source/getting_started/comparison/comparison_with_stata.rst
+++ b/doc/source/getting_started/comparison/comparison_with_stata.rst
@@ -311,15 +311,7 @@ first position of the substring you supply as the second argument.
 
    generate str_position = strpos(sex, "ale")
 
-Python determines the position of a character in a string with the
-:func:`find` function.  ``find`` searches for the first position of the
-substring.  If the substring is found, the function returns its
-position.  Keep in mind that Python indexes are zero-based and
-the function will return -1 if it fails to find the substring.
-
-.. ipython:: python
-
-   tips["sex"].str.find("ale").head()
+.. include:: includes/find_substring.rst
 
 
 Extracting substring by position
@@ -331,13 +323,7 @@ Stata extracts a substring from a string based on its position with the :func:`s
 
    generate short_sex = substr(sex, 1, 1)
 
-With pandas you can use ``[]`` notation to extract a substring
-from a string by position locations.  Keep in mind that Python
-indexes are zero-based.
-
-.. ipython:: python
-
-   tips["sex"].str[0:1].head()
+.. include:: includes/extract_substring.rst
 
 
 Extracting nth word
@@ -358,16 +344,7 @@ second argument specifies which word you want to extract.
    generate first_name = word(name, 1)
    generate last_name = word(name, -1)
 
-Python extracts a substring from a string based on its text
-by using regular expressions. There are much more powerful
-approaches, but this just shows a simple approach.
-
-.. ipython:: python
-
-   firstlast = pd.DataFrame({"string": ["John Smith", "Jane Cook"]})
-   firstlast["First_Name"] = firstlast["string"].str.split(" ", expand=True)[0]
-   firstlast["Last_Name"] = firstlast["string"].str.rsplit(" ", expand=True)[0]
-   firstlast
+.. include:: includes/nth_word.rst
 
 
 Changing case
@@ -390,27 +367,13 @@ change the case of ASCII and Unicode strings, respectively.
    generate title = strproper(string)
    list
 
-The equivalent Python functions are ``upper``, ``lower``, and ``title``.
-
-.. ipython:: python
+.. include:: includes/case.rst
 
-   firstlast = pd.DataFrame({"string": ["John Smith", "Jane Cook"]})
-   firstlast["upper"] = firstlast["string"].str.upper()
-   firstlast["lower"] = firstlast["string"].str.lower()
-   firstlast["title"] = firstlast["string"].str.title()
-   firstlast
 
 Merging
 -------
 
-The following tables will be used in the merge examples
-
-.. ipython:: python
-
-   df1 = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": np.random.randn(4)})
-   df1
-   df2 = pd.DataFrame({"key": ["B", "D", "D", "E"], "value": np.random.randn(4)})
-   df2
+.. include:: includes/merge_setup.rst
 
 In Stata, to perform a merge, one data set must be in memory
 and the other must be referenced as a file name on disk. In
@@ -465,38 +428,13 @@ or the intersection of the two by using the values created in the
    restore
    merge 1:n key using df2.dta
 
-pandas DataFrames have a :meth:`DataFrame.merge` method, which provides
-similar functionality. Note that different join
-types are accomplished via the ``how`` keyword.
-
-.. ipython:: python
-
-   inner_join = df1.merge(df2, on=["key"], how="inner")
-   inner_join
-
-   left_join = df1.merge(df2, on=["key"], how="left")
-   left_join
-
-   right_join = df1.merge(df2, on=["key"], how="right")
-   right_join
-
-   outer_join = df1.merge(df2, on=["key"], how="outer")
-   outer_join
+.. include:: includes/merge.rst
 
 
 Missing data
 ------------
 
-Like Stata, pandas has a representation for missing data -- the
-special float value ``NaN`` (not a number).  Many of the semantics
-are the same; for example missing data propagates through numeric
-operations, and is ignored by default for aggregations.
-
-.. ipython:: python
-
-   outer_join
-   outer_join["value_x"] + outer_join["value_y"]
-   outer_join["value_x"].sum()
+.. include:: includes/missing_intro.rst
 
 One difference is that missing data cannot be compared to its sentinel value.
 For example, in Stata you could do this to filter missing values.
@@ -508,30 +446,7 @@ For example, in Stata you could do this to filter missing values.
    * Keep non-missing values
    list if value_x != .
 
-This doesn't work in pandas.  Instead, the :func:`pd.isna` or :func:`pd.notna` functions
-should be used for comparisons.
-
-.. ipython:: python
-
-   outer_join[pd.isna(outer_join["value_x"])]
-   outer_join[pd.notna(outer_join["value_x"])]
-
-pandas also provides a variety of methods to work with missing data -- some of
-which would be challenging to express in Stata. For example, there are methods to
-drop all rows with any missing values, replacing missing values with a specified
-value, like the mean, or forward filling from previous rows. See the
-:ref:`missing data documentation<missing_data>` for more.
-
-.. ipython:: python
-
-   # Drop rows with any missing value
-   outer_join.dropna()
-
-   # Fill forwards
-   outer_join.fillna(method="ffill")
-
-   # Impute missing values with the mean
-   outer_join["value_x"].fillna(outer_join["value_x"].mean())
+.. include:: includes/missing.rst
 
 
 GroupBy
@@ -548,14 +463,7 @@ numeric columns.
 
    collapse (sum) total_bill tip, by(sex smoker)
 
-pandas provides a flexible ``groupby`` mechanism that
-allows similar aggregations.  See the :ref:`groupby documentation<groupby>`
-for more details and examples.
-
-.. ipython:: python
-
-   tips_summed = tips.groupby(["sex", "smoker"])[["total_bill", "tip"]].sum()
-   tips_summed.head()
+.. include:: includes/groupby.rst
 
 
 Transformation
@@ -570,16 +478,7 @@ For example, to subtract the mean for each observation by smoker group.
    bysort sex smoker: egen group_bill = mean(total_bill)
    generate adj_total_bill = total_bill - group_bill
 
-
-pandas ``groupby`` provides a ``transform`` mechanism that allows
-these type of operations to be succinctly expressed in one
-operation.
-
-.. ipython:: python
-
-   gb = tips.groupby("smoker")["total_bill"]
-   tips["adj_total_bill"] = tips["total_bill"] - gb.transform("mean")
-   tips.head()
+.. include:: includes/transform.rst
 
 
 By group processing
diff --git a/doc/source/getting_started/comparison/includes/case.rst b/doc/source/getting_started/comparison/includes/case.rst
new file mode 100644
index 0000000000000..c00a830bc8511
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/case.rst
@@ -0,0 +1,10 @@
+The equivalent pandas methods are :meth:`Series.str.upper`, :meth:`Series.str.lower`, and
+:meth:`Series.str.title`.
+
+.. ipython:: python
+
+   firstlast = pd.DataFrame({"string": ["John Smith", "Jane Cook"]})
+   firstlast["upper"] = firstlast["string"].str.upper()
+   firstlast["lower"] = firstlast["string"].str.lower()
+   firstlast["title"] = firstlast["string"].str.title()
+   firstlast
diff --git a/doc/source/getting_started/comparison/includes/extract_substring.rst b/doc/source/getting_started/comparison/includes/extract_substring.rst
new file mode 100644
index 0000000000000..78eee286ad467
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/extract_substring.rst
@@ -0,0 +1,7 @@
+With pandas you can use ``[]`` notation to extract a substring
+from a string by position. Keep in mind that Python
+indexes are zero-based.
+
+.. ipython:: python
+
+   tips["sex"].str[0:1].head()
diff --git a/doc/source/getting_started/comparison/includes/find_substring.rst b/doc/source/getting_started/comparison/includes/find_substring.rst
new file mode 100644
index 0000000000000..ee940b64f5cae
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/find_substring.rst
@@ -0,0 +1,8 @@
+You can find the position of a substring in a column of strings with the :meth:`Series.str.find`
+method. ``find`` searches for the first position of the substring. If the substring is found, the
+method returns its position. If not found, it returns ``-1``. Keep in mind that Python indexes are
+zero-based.
+
+.. ipython:: python
+
+   tips["sex"].str.find("ale").head()
diff --git a/doc/source/getting_started/comparison/includes/groupby.rst b/doc/source/getting_started/comparison/includes/groupby.rst
new file mode 100644
index 0000000000000..caa9f6ec9c9b8
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/groupby.rst
@@ -0,0 +1,7 @@
+pandas provides a flexible ``groupby`` mechanism that allows similar aggregations. See the
+:ref:`groupby documentation<groupby>` for more details and examples.
+
+.. ipython:: python
+
+   tips_summed = tips.groupby(["sex", "smoker"])[["total_bill", "tip"]].sum()
+   tips_summed.head()
diff --git a/doc/source/getting_started/comparison/includes/length.rst b/doc/source/getting_started/comparison/includes/length.rst
index 9581c661c0170..5a0c803e9eff2 100644
--- a/doc/source/getting_started/comparison/includes/length.rst
+++ b/doc/source/getting_started/comparison/includes/length.rst
@@ -1,4 +1,4 @@
-Python determines the length of a character string with the ``len`` function.
+You can find the length of a character string with :meth:`Series.str.len`.
 In Python 3, all strings are Unicode strings. ``len`` includes trailing blanks.
 Use ``len`` and ``rstrip`` to exclude trailing blanks.
 
diff --git a/doc/source/getting_started/comparison/includes/merge.rst b/doc/source/getting_started/comparison/includes/merge.rst
new file mode 100644
index 0000000000000..b8e3f54fd132b
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/merge.rst
@@ -0,0 +1,17 @@
+pandas DataFrames have a :meth:`~DataFrame.merge` method, which provides similar functionality. The
+data does not have to be sorted ahead of time, and different join types are accomplished via the
+``how`` keyword.
+
+.. ipython:: python
+
+   inner_join = df1.merge(df2, on=["key"], how="inner")
+   inner_join
+
+   left_join = df1.merge(df2, on=["key"], how="left")
+   left_join
+
+   right_join = df1.merge(df2, on=["key"], how="right")
+   right_join
+
+   outer_join = df1.merge(df2, on=["key"], how="outer")
+   outer_join
diff --git a/doc/source/getting_started/comparison/includes/merge_setup.rst b/doc/source/getting_started/comparison/includes/merge_setup.rst
new file mode 100644
index 0000000000000..f115cd58f7a94
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/merge_setup.rst
@@ -0,0 +1,8 @@
+The following tables will be used in the merge examples:
+
+.. ipython:: python
+
+   df1 = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": np.random.randn(4)})
+   df1
+   df2 = pd.DataFrame({"key": ["B", "D", "D", "E"], "value": np.random.randn(4)})
+   df2
diff --git a/doc/source/getting_started/comparison/includes/missing.rst b/doc/source/getting_started/comparison/includes/missing.rst
new file mode 100644
index 0000000000000..8e6ba95e98036
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/missing.rst
@@ -0,0 +1,24 @@
+This doesn't work in pandas. Instead, the :func:`pandas.isna` or :func:`pandas.notna` functions
+should be used for comparisons.
+
+.. ipython:: python
+
+   outer_join[pd.isna(outer_join["value_x"])]
+   outer_join[pd.notna(outer_join["value_x"])]
+
+pandas also provides a variety of methods to work with missing data, some of
+which would be challenging to express in SAS or Stata. For example, there are methods to
+drop all rows with any missing values, replace missing values with a specified
+value such as the mean, or forward fill from previous rows. See the
+:ref:`missing data documentation<missing_data>` for more.
+
+.. ipython:: python
+
+   # Drop rows with any missing value
+   outer_join.dropna()
+
+   # Fill forwards
+   outer_join.fillna(method="ffill")
+
+   # Impute missing values with the mean
+   outer_join["value_x"].fillna(outer_join["value_x"].mean())
diff --git a/doc/source/getting_started/comparison/includes/missing_intro.rst b/doc/source/getting_started/comparison/includes/missing_intro.rst
new file mode 100644
index 0000000000000..ed97f639f3f3d
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/missing_intro.rst
@@ -0,0 +1,9 @@
+Both have a representation for missing data; in pandas it is the special float value ``NaN`` (not a
+number). Many of the semantics are the same; for example, missing data propagates through numeric
+operations and is ignored by default for aggregations.
+
+.. ipython:: python
+
+   outer_join
+   outer_join["value_x"] + outer_join["value_y"]
+   outer_join["value_x"].sum()
diff --git a/doc/source/getting_started/comparison/includes/nth_word.rst b/doc/source/getting_started/comparison/includes/nth_word.rst
new file mode 100644
index 0000000000000..7af0285005d5b
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/nth_word.rst
@@ -0,0 +1,9 @@
+The simplest way to extract words in pandas is to split the strings by spaces, then reference the
+word by index. Note there are more powerful approaches should you need them.
+
+.. ipython:: python
+
+   firstlast = pd.DataFrame({"String": ["John Smith", "Jane Cook"]})
+   firstlast["First_Name"] = firstlast["String"].str.split(" ", expand=True)[0]
+   firstlast["Last_Name"] = firstlast["String"].str.rsplit(" ", expand=True)[1]
+   firstlast
diff --git a/doc/source/getting_started/comparison/includes/sorting.rst b/doc/source/getting_started/comparison/includes/sorting.rst
index 23f11ff485474..0840c9dd554b7 100644
--- a/doc/source/getting_started/comparison/includes/sorting.rst
+++ b/doc/source/getting_started/comparison/includes/sorting.rst
@@ -1,5 +1,4 @@
-pandas objects have a :meth:`DataFrame.sort_values` method, which
-takes a list of columns to sort by.
+pandas has a :meth:`DataFrame.sort_values` method, which takes a list of columns to sort by.
 
 .. ipython:: python
 
diff --git a/doc/source/getting_started/comparison/includes/transform.rst b/doc/source/getting_started/comparison/includes/transform.rst
new file mode 100644
index 0000000000000..0aa5b5b298cf7
--- /dev/null
+++ b/doc/source/getting_started/comparison/includes/transform.rst
@@ -0,0 +1,8 @@
+pandas provides a :ref:`groupby.transform` mechanism that allows these types of operations to be
+succinctly expressed in one operation.
+
+.. ipython:: python
+
+   gb = tips.groupby("smoker")["total_bill"]
+   tips["adj_total_bill"] = tips["total_bill"] - gb.transform("mean")
+   tips.head()