From 6565ac6ccde0e0de339999f61c652986ee9b9a05 Mon Sep 17 00:00:00 2001
From: Aidan Feldman
Date: Mon, 4 Jan 2021 03:40:17 -0500
Subject: [PATCH] DOC: minor tweaks to formatting on SQL comparison page

Adding code formatting, missing punctuation, etc. No changes of substance.
---
 .../comparison/comparison_with_sql.rst        | 53 ++++++++++---------
 .../comparison/includes/filtering.rst         |  2 +-
 2 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/doc/source/getting_started/comparison/comparison_with_sql.rst b/doc/source/getting_started/comparison/comparison_with_sql.rst
index 52799442d6118..75d26354ddfa5 100644
--- a/doc/source/getting_started/comparison/comparison_with_sql.rst
+++ b/doc/source/getting_started/comparison/comparison_with_sql.rst
@@ -69,31 +69,31 @@ Filtering in SQL is done via a WHERE clause.
 
 .. include:: includes/filtering.rst
 
-Just like SQL's OR and AND, multiple conditions can be passed to a DataFrame using | (OR) and &
-(AND).
+Just like SQL's ``OR`` and ``AND``, multiple conditions can be passed to a DataFrame using ``|``
+(``OR``) and ``&`` (``AND``).
+
+Tips of more than $5 at Dinner meals:
 
 .. code-block:: sql
 
-    -- tips of more than $5.00 at Dinner meals
     SELECT *
     FROM tips
     WHERE time = 'Dinner' AND tip > 5.00;
 
 .. ipython:: python
 
-    # tips of more than $5.00 at Dinner meals
     tips[(tips["time"] == "Dinner") & (tips["tip"] > 5.00)]
 
+Tips by parties of at least 5 diners OR bill total was more than $45:
+
 .. code-block:: sql
 
-    -- tips by parties of at least 5 diners OR bill total was more than $45
     SELECT *
     FROM tips
     WHERE size >= 5 OR total_bill > 45;
 
 .. ipython:: python
 
-    # tips by parties of at least 5 diners OR bill total was more than $45
     tips[(tips["size"] >= 5) | (tips["total_bill"] > 45)]
 
 NULL checking is done using the :meth:`~pandas.Series.notna` and :meth:`~pandas.Series.isna`
@@ -134,7 +134,7 @@ Getting items where ``col1`` IS NOT NULL can be done with :meth:`~pandas.Series.
 
 GROUP BY
 --------
-In pandas, SQL's GROUP BY operations are performed using the similarly named
+In pandas, SQL's ``GROUP BY`` operations are performed using the similarly named
 :meth:`~pandas.DataFrame.groupby` method. :meth:`~pandas.DataFrame.groupby` typically refers to a
 process where we'd like to split a dataset into groups, apply some function (typically aggregation)
 , and then combine the groups together.
@@ -162,7 +162,7 @@ The pandas equivalent would be:
 Notice that in the pandas code we used :meth:`~pandas.core.groupby.DataFrameGroupBy.size` and not
 :meth:`~pandas.core.groupby.DataFrameGroupBy.count`. This is because
 :meth:`~pandas.core.groupby.DataFrameGroupBy.count` applies the function to each column, returning
-the number of ``not null`` records within each.
+the number of ``NOT NULL`` records within each.
 
 .. ipython:: python
 
@@ -223,10 +223,10 @@ Grouping by more than one column is done by passing a list of columns to the
 
 JOIN
 ----
-JOINs can be performed with :meth:`~pandas.DataFrame.join` or :meth:`~pandas.merge`. By default,
-:meth:`~pandas.DataFrame.join` will join the DataFrames on their indices. Each method has
-parameters allowing you to specify the type of join to perform (LEFT, RIGHT, INNER, FULL) or the
-columns to join on (column names or indices).
+``JOIN``\s can be performed with :meth:`~pandas.DataFrame.join` or :meth:`~pandas.merge`. By
+default, :meth:`~pandas.DataFrame.join` will join the DataFrames on their indices. Each method has
+parameters allowing you to specify the type of join to perform (``LEFT``, ``RIGHT``, ``INNER``,
+``FULL``) or the columns to join on (column names or indices).
 
 .. ipython:: python
 
@@ -235,7 +235,7 @@ columns to join on (column names or indices).
 
 Assume we have two database tables of the same name and structure as our DataFrames.
 
-Now let's go over the various types of JOINs.
+Now let's go over the various types of ``JOIN``\s.
 
 INNER JOIN
 ~~~~~~~~~~
@@ -261,9 +261,11 @@ column with another DataFrame's index.
 
 LEFT OUTER JOIN
 ~~~~~~~~~~~~~~~
+
+Show all records from ``df1``.
+
 .. code-block:: sql
 
-    -- show all records from df1
     SELECT *
     FROM df1
     LEFT OUTER JOIN df2
@@ -271,14 +273,15 @@ LEFT OUTER JOIN
 
 .. ipython:: python
 
-    # show all records from df1
     pd.merge(df1, df2, on="key", how="left")
 
 RIGHT JOIN
 ~~~~~~~~~~
+
+Show all records from ``df2``.
+
 .. code-block:: sql
 
-    -- show all records from df2
     SELECT *
     FROM df1
     RIGHT OUTER JOIN df2
@@ -286,17 +289,17 @@ RIGHT JOIN
 
 .. ipython:: python
 
-    # show all records from df2
     pd.merge(df1, df2, on="key", how="right")
 
 FULL JOIN
 ~~~~~~~~~
-pandas also allows for FULL JOINs, which display both sides of the dataset, whether or not the
-joined columns find a match. As of writing, FULL JOINs are not supported in all RDBMS (MySQL).
+pandas also allows for ``FULL JOIN``\s, which display both sides of the dataset, whether or not the
+joined columns find a match. As of writing, ``FULL JOIN``\s are not supported in all RDBMS (MySQL).
+
+Show all records from both tables.
 
 .. code-block:: sql
 
-    -- show all records from both tables
     SELECT *
     FROM df1
     FULL OUTER JOIN df2
@@ -304,13 +307,13 @@ joined columns find a match. As of writing, FULL JOINs are not supported in all
 
 .. ipython:: python
 
-    # show all records from both frames
     pd.merge(df1, df2, on="key", how="outer")
 
 
 UNION
 -----
-UNION ALL can be performed using :meth:`~pandas.concat`.
+
+``UNION ALL`` can be performed using :meth:`~pandas.concat`.
 
 .. ipython:: python
 
@@ -342,7 +345,7 @@ UNION ALL can be performed using :meth:`~pandas.concat`.
 
     pd.concat([df1, df2])
 
-SQL's UNION is similar to UNION ALL, however UNION will remove duplicate rows.
+SQL's ``UNION`` is similar to ``UNION ALL``, however ``UNION`` will remove duplicate rows.
 
 .. code-block:: sql
 
@@ -444,7 +447,7 @@ the same using ``rank(method='first')`` function
 Let's find tips with (rank < 3) per gender group for (tips < 2).
 Notice that when using ``rank(method='min')`` function
 ``rnk_min`` remains the same for the same ``tip``
-(as Oracle's RANK() function)
+(as Oracle's ``RANK()`` function)
 
 .. ipython:: python
 
@@ -477,7 +480,7 @@ DELETE
     DELETE FROM tips
     WHERE tip > 9;
 
-In pandas we select the rows that should remain, instead of deleting them
+In pandas we select the rows that should remain instead of deleting them:
 
 .. ipython:: python
 
diff --git a/doc/source/getting_started/comparison/includes/filtering.rst b/doc/source/getting_started/comparison/includes/filtering.rst
index 861a93d92c2c2..8ddf7c0d2fa39 100644
--- a/doc/source/getting_started/comparison/includes/filtering.rst
+++ b/doc/source/getting_started/comparison/includes/filtering.rst
@@ -1,5 +1,5 @@
 DataFrames can be filtered in multiple ways; the most intuitive of which is using
-:ref:`boolean indexing `
+:ref:`boolean indexing `.
 
 .. ipython:: python
 