DOC: Fix typos and improve parts of docs (#33793)

eu42 · web-flow · commit 0ede9478ac58 · 2020-05-10T17:32:32.000-07:00
* Fix typos and improve parts of docs

* Apply review suggestions
diff --git a/doc/source/getting_started/intro_tutorials/02_read_write.rst b/doc/source/getting_started/intro_tutorials/02_read_write.rst
@@ -23,7 +23,7 @@
                     <div class="card-body">
                         <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 -  PassengerId: Id of every passenger.
@@ -61,7 +61,7 @@ How do I read and write tabular data?
     <ul class="task-bullet">
         <li>
 
-I want to analyse the titanic passenger data, available as a CSV file.
+I want to analyze the Titanic passenger data, available as a CSV file.
 
 .. ipython:: python
 
@@ -134,7 +134,7 @@ strings (``object``).
     <ul class="task-bullet">
         <li>
 
-My colleague requested the titanic data as a spreadsheet.
+My colleague requested the Titanic data as a spreadsheet.
 
 .. ipython:: python
 
diff --git a/doc/source/getting_started/intro_tutorials/03_subset_data.rst b/doc/source/getting_started/intro_tutorials/03_subset_data.rst
@@ -330,7 +330,7 @@ When using the column names, row labels or a condition expression, use
 the ``loc`` operator in front of the selection brackets ``[]``. For both
 the part before and after the comma, you can use a single label, a list
 of labels, a slice of labels, a conditional expression or a colon. Using
-a colon specificies you want to select all rows or columns.
+a colon specifies you want to select all rows or columns.
 
 .. raw:: html
 
diff --git a/doc/source/getting_started/intro_tutorials/06_calculate_statistics.rst b/doc/source/getting_started/intro_tutorials/06_calculate_statistics.rst
@@ -23,7 +23,7 @@
                     <div class="card-body">
                         <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 -  PassengerId: Id of every passenger.
@@ -72,7 +72,7 @@ Aggregating statistics
     <ul class="task-bullet">
         <li>
 
-What is the average age of the titanic passengers?
+What is the average age of the Titanic passengers?
 
 .. ipython:: python
 
@@ -95,7 +95,7 @@ across rows by default.
     <ul class="task-bullet">
         <li>
 
-What is the median age and ticket fare price of the titanic passengers?
+What is the median age and ticket fare price of the Titanic passengers?
 
 .. ipython:: python
 
@@ -148,7 +148,7 @@ Aggregating statistics grouped by category
     <ul class="task-bullet">
         <li>
 
-What is the average age for male versus female titanic passengers?
+What is the average age for male versus female Titanic passengers?
 
 .. ipython:: python
 
diff --git a/doc/source/getting_started/intro_tutorials/07_reshape_table_layout.rst b/doc/source/getting_started/intro_tutorials/07_reshape_table_layout.rst
@@ -23,7 +23,7 @@
                     <div class="card-body">
                         <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 -  PassengerId: Id of every passenger.
@@ -122,7 +122,7 @@ Sort table rows
     <ul class="task-bullet">
         <li>
 
-I want to sort the titanic data according to the age of the passengers.
+I want to sort the Titanic data according to the age of the passengers.
 
 .. ipython:: python
 
@@ -138,7 +138,7 @@ I want to sort the titanic data according to the age of the passengers.
     <ul class="task-bullet">
         <li>
 
-I want to sort the titanic data according to the cabin class and age in descending order.
+I want to sort the Titanic data according to the cabin class and age in descending order.
 
 .. ipython:: python
 
@@ -282,7 +282,7 @@ For more information about :meth:`~DataFrame.pivot_table`, see the user guide se
    </div>
 
 .. note::
-    If case you are wondering, :meth:`~DataFrame.pivot_table` is indeed directly linked
+    In case you are wondering, :meth:`~DataFrame.pivot_table` is indeed directly linked
     to :meth:`~DataFrame.groupby`. The same result can be derived by grouping on both
     ``parameter`` and ``location``:
 
@@ -338,7 +338,7 @@ newly created column.
 
 The solution is the short version on how to apply :func:`pandas.melt`. The method
 will *melt* all columns NOT mentioned in ``id_vars`` together into two
-columns: A columns with the column header names and a column with the
+columns: A column with the column header names and a column with the
 values itself. The latter column gets by default the name ``value``.
 
 The :func:`pandas.melt` method can be defined in more detail:
@@ -357,8 +357,8 @@ The result in the same, but in more detail defined:
 
 -  ``value_vars`` defines explicitly which columns to *melt* together
 -  ``value_name`` provides a custom column name for the values column
-   instead of the default columns name ``value``
--  ``var_name`` provides a custom column name for the columns collecting
+   instead of the default column name ``value``
+-  ``var_name`` provides a custom column name for the column collecting
    the column header names. Otherwise it takes the index name or a
    default ``variable``
 
@@ -383,7 +383,7 @@ Conversion from wide to long format with :func:`pandas.melt` is explained in the
         <h4>REMEMBER</h4>
 
 -  Sorting by one or more columns is supported by ``sort_values``
--  The ``pivot`` function is purely restructering of the data,
+-  The ``pivot`` function is purely restructuring of the data,
    ``pivot_table`` supports aggregations
 -  The reverse of ``pivot`` (long to wide format) is ``melt`` (wide to
    long format)
diff --git a/doc/source/getting_started/intro_tutorials/08_combine_dataframes.rst b/doc/source/getting_started/intro_tutorials/08_combine_dataframes.rst
@@ -305,7 +305,7 @@ More information on join/merge of tables is provided in the user guide section o
     <div class="shadow gs-callout gs-callout-remember">
         <h4>REMEMBER</h4>
 
--  Multiple tables can be concatenated both column as row wise using
+-  Multiple tables can be concatenated both column-wise and row-wise using
    the ``concat`` function.
 -  For database-like merging/joining of tables, use the ``merge``
    function.
diff --git a/doc/source/getting_started/intro_tutorials/09_timeseries.rst b/doc/source/getting_started/intro_tutorials/09_timeseries.rst
@@ -78,7 +78,7 @@ provide any datetime operations (e.g. extract the year, day of the
 week,…). By applying the ``to_datetime`` function, pandas interprets the
 strings and convert these to datetime (i.e. ``datetime64[ns, UTC]``)
 objects. In pandas we call these datetime objects similar to
-``datetime.datetime`` from the standard library a :class:`pandas.Timestamp`.
+``datetime.datetime`` from the standard library as :class:`pandas.Timestamp`.
 
 .. raw:: html
 
@@ -99,7 +99,7 @@ objects. In pandas we call these datetime objects similar to
 Why are these :class:`pandas.Timestamp` objects useful? Let’s illustrate the added
 value with some example cases.
 
-   What is the start and end date of the time series data set working
+   What is the start and end date of the time series data set we are working
    with?
 
 .. ipython:: python
@@ -214,7 +214,7 @@ Plot the typical :math:`NO_2` pattern during the day of our time series of all s
 
 Similar to the previous case, we want to calculate a given statistic
 (e.g. mean :math:`NO_2`) **for each hour of the day** and we can use the
-split-apply-combine approach again. For this case, the datetime property ``hour``
+split-apply-combine approach again. For this case, we use the datetime property ``hour``
 of pandas ``Timestamp``, which is also accessible by the ``dt`` accessor.
 
 .. raw:: html
diff --git a/doc/source/getting_started/intro_tutorials/10_text_data.rst b/doc/source/getting_started/intro_tutorials/10_text_data.rst
@@ -23,7 +23,7 @@
                     <div class="card-body">
                         <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 -  PassengerId: Id of every passenger.
@@ -102,7 +102,7 @@ Create a new column ``Surname`` that contains the surname of the Passengers by e
 
 Using the :meth:`Series.str.split` method, each of the values is returned as a list of
 2 elements. The first element is the part before the comma and the
-second element the part after the comma.
+second element is the part after the comma.
 
 .. ipython:: python
 
@@ -135,7 +135,7 @@ More information on extracting parts of strings is available in the user guide s
     <ul class="task-bullet">
         <li>
 
-Extract the passenger data about the Countess on board of the Titanic.
+Extract the passenger data about the Countesses on board of the Titanic.
 
 .. ipython:: python
 
@@ -145,24 +145,24 @@ Extract the passenger data about the Countess on board of the Titanic.
 
     titanic[titanic["Name"].str.contains("Countess")]
 
-(*Interested in her story? See*\ `Wikipedia <https://en.wikipedia.org/wiki/No%C3%ABl_Leslie,_Countess_of_Rothes>`__\ *!*)
+(*Interested in her story? See* `Wikipedia <https://en.wikipedia.org/wiki/No%C3%ABl_Leslie,_Countess_of_Rothes>`__\ *!*)
 
 The string method :meth:`Series.str.contains` checks for each of the values in the
 column ``Name`` if the string contains the word ``Countess`` and returns
 for each of the values ``True`` (``Countess`` is part of the name) of
-``False`` (``Countess`` is notpart of the name). This output can be used
+``False`` (``Countess`` is not part of the name). This output can be used
 to subselect the data using conditional (boolean) indexing introduced in
 the :ref:`subsetting of data tutorial <10min_tut_03_subset>`. As there was
-only 1 Countess on the Titanic, we get one row as a result.
+only one Countess on the Titanic, we get one row as a result.
 
 .. raw:: html
 
         </li>
     </ul>
 
 .. note::
-    More powerful extractions on strings is supported, as the
-    :meth:`Series.str.contains` and :meth:`Series.str.extract` methods accepts `regular
+    More powerful extractions on strings are supported, as the
+    :meth:`Series.str.contains` and :meth:`Series.str.extract` methods accept `regular
     expressions <https://docs.python.org/3/library/re.html>`__, but out of
     scope of this tutorial.
 
@@ -182,7 +182,7 @@ More information on extracting parts of strings is available in the user guide s
     <ul class="task-bullet">
         <li>
 
-Which passenger of the titanic has the longest name?
+Which passenger of the Titanic has the longest name?
 
 .. ipython:: python
 
@@ -220,7 +220,7 @@ we can do a selection using the ``loc`` operator, introduced in the
     <ul class="task-bullet">
         <li>
 
-In the ‘Sex’ columns, replace values of ‘male’ by ‘M’ and all ‘female’ values by ‘F’
+In the "Sex" column, replace values of "male" by "M" and values of "female" by "F"
 
 .. ipython:: python
 
diff --git a/doc/source/user_guide/10min.rst b/doc/source/user_guide/10min.rst
@@ -664,7 +664,7 @@ Convert the raw grades to a categorical data type.
     df["grade"]
 
 Rename the categories to more meaningful names (assigning to
-:meth:`Series.cat.categories` is inplace!).
+:meth:`Series.cat.categories` is in place!).
 
 .. ipython:: python
 
diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
@@ -68,7 +68,7 @@ the ``.array`` property
    s.index.array
 
 :attr:`~Series.array` will always be an :class:`~pandas.api.extensions.ExtensionArray`.
-The exact details of what an :class:`~pandas.api.extensions.ExtensionArray` is and why pandas uses them is a bit
+The exact details of what an :class:`~pandas.api.extensions.ExtensionArray` is and why pandas uses them are a bit
 beyond the scope of this introduction. See :ref:`basics.dtypes` for more.
 
 If you know you need a NumPy array, use :meth:`~Series.to_numpy`
@@ -518,7 +518,7 @@ data (``True`` by default):
 
 Combined with the broadcasting / arithmetic behavior, one can describe various
 statistical procedures, like standardization (rendering data zero mean and
-standard deviation 1), very concisely:
+standard deviation of 1), very concisely:
 
 .. ipython:: python
 
@@ -700,7 +700,7 @@ By default all columns are used but a subset can be selected using the ``subset`
     frame = pd.DataFrame(data)
     frame.value_counts()
 
-Similarly, you can get the most frequently occurring value(s) (the mode) of the values in a Series or DataFrame:
+Similarly, you can get the most frequently occurring value(s), i.e. the mode, of the values in a Series or DataFrame:
 
 .. ipython:: python
 
@@ -1022,7 +1022,7 @@ Mixed dtypes
 ++++++++++++
 
 When presented with mixed dtypes that cannot aggregate, ``.agg`` will only take the valid
-aggregations. This is similar to how groupby ``.agg`` works.
+aggregations. This is similar to how ``.groupby.agg`` works.
 
 .. ipython:: python
 
@@ -1041,7 +1041,7 @@ aggregations. This is similar to how groupby ``.agg`` works.
 Custom describe
 +++++++++++++++
 
-With ``.agg()`` is it possible to easily create a custom describe function, similar
+With ``.agg()`` it is possible to easily create a custom describe function, similar
 to the built in :ref:`describe function <basics.describe>`.
 
 .. ipython:: python
@@ -1083,7 +1083,8 @@ function name or a user defined function.
    tsdf.transform('abs')
    tsdf.transform(lambda x: x.abs())
 
-Here :meth:`~DataFrame.transform` received a single function; this is equivalent to a ufunc application.
+Here :meth:`~DataFrame.transform` received a single function; this is equivalent to a `ufunc
+<https://numpy.org/doc/stable/reference/ufuncs.html>`__ application.
 
 .. ipython:: python
 
@@ -1457,7 +1458,7 @@ for altering the ``Series.name`` attribute.
 
 .. versionadded:: 0.24.0
 
-The methods :meth:`~DataFrame.rename_axis` and :meth:`~Series.rename_axis`
+The methods :meth:`DataFrame.rename_axis` and :meth:`Series.rename_axis`
 allow specific names of a `MultiIndex` to be changed (as opposed to the
 labels).