CLN/DOC: Remove trailing whitespace from .rst files in doc folder #7317

Merged · 1 commit · Jun 3, 2014
60 changes: 30 additions & 30 deletions doc/source/comparison_with_sql.rst
@@ -3,11 +3,11 @@

Comparison with SQL
********************
Since many potential pandas users have some familiarity with
`SQL <http://en.wikipedia.org/wiki/SQL>`_, this page is meant to provide some examples of how
various SQL operations would be performed using pandas.

If you're new to pandas, you might want to first read through :ref:`10 Minutes to Pandas<10min>`
to familiarize yourself with the library.

As is customary, we import pandas and numpy as follows:
@@ -17,8 +17,8 @@ As is customary, we import pandas and numpy as follows:
import pandas as pd
import numpy as np

Most of the examples will utilize the ``tips`` dataset found within pandas tests. We'll read
the data into a DataFrame called `tips` and assume we have a database table of the same name and
structure.

.. ipython:: python
@@ -44,7 +44,7 @@ With pandas, column selection is done by passing a list of column names to your

tips[['total_bill', 'tip', 'smoker', 'time']].head(5)

Calling the DataFrame without the list of column names would display all columns (akin to SQL's
``*``).
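
As a short sketch of that point (using a tiny hypothetical stand-in frame rather than the real ``tips`` data):

```python
import pandas as pd

# Tiny stand-in for the tips dataset (hypothetical values, for illustration only)
tips = pd.DataFrame({'total_bill': [16.99, 10.34, 21.01],
                     'tip': [1.01, 1.66, 3.50],
                     'time': ['Dinner', 'Dinner', 'Lunch']})

tips.head(5)                         # every column, akin to SELECT * FROM tips LIMIT 5
tips[['total_bill', 'tip']].head(5)  # explicit column list
```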

WHERE
@@ -58,14 +58,14 @@ Filtering in SQL is done via a WHERE clause.
WHERE time = 'Dinner'
LIMIT 5;

DataFrames can be filtered in multiple ways; the most intuitive of which is using
`boolean indexing <http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing>`_.

.. ipython:: python

tips[tips['time'] == 'Dinner'].head(5)

The above statement is simply passing a ``Series`` of True/False objects to the DataFrame,
returning all rows with True.

.. ipython:: python
@@ -74,7 +74,7 @@ returning all rows with True.
is_dinner.value_counts()
tips[is_dinner].head(5)

Just like SQL's OR and AND, multiple conditions can be passed to a DataFrame using | (OR) and &
(AND).

.. code-block:: sql
@@ -101,16 +101,16 @@ Just like SQL's OR and AND, multiple conditions can be passed to a DataFrame usi
# tips by parties of at least 5 diners OR bill total was more than $45
tips[(tips['size'] >= 5) | (tips['total_bill'] > 45)]
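
SQL's NOT has a pandas counterpart as well: the ``~`` operator negates a boolean mask. A sketch with a tiny stand-in frame (not the real ``tips`` data):

```python
import pandas as pd

# Hypothetical stand-in rows for the tips dataset
tips = pd.DataFrame({'time': ['Dinner', 'Lunch', 'Dinner'],
                     'size': [2, 6, 4]})

# WHERE NOT time = 'Dinner'  ->  negate the boolean mask with ~
tips[~(tips['time'] == 'Dinner')]
```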

NULL checking is done using the :meth:`~pandas.Series.notnull` and :meth:`~pandas.Series.isnull`
methods.

.. ipython:: python

frame = pd.DataFrame({'col1': ['A', 'B', np.NaN, 'C', 'D'],
'col2': ['F', np.NaN, 'G', 'H', 'I']})
frame

Assume we have a table of the same structure as our DataFrame above. We can see only the records
where ``col2`` IS NULL with the following query:

.. code-block:: sql
@@ -138,12 +138,12 @@ Getting items where ``col1`` IS NOT NULL can be done with :meth:`~pandas.Series.

GROUP BY
--------
In pandas, SQL's GROUP BY operations are performed using the similarly named
:meth:`~pandas.DataFrame.groupby` method. :meth:`~pandas.DataFrame.groupby` typically refers to a
process where we'd like to split a dataset into groups, apply some function (typically
aggregation), and then combine the groups together.

A common SQL operation would be getting the count of records in each group throughout a dataset.
For instance, a query getting us the number of tips left by sex:

.. code-block:: sql
@@ -163,23 +163,23 @@ The pandas equivalent would be:

tips.groupby('sex').size()

Notice that in the pandas code we used :meth:`~pandas.DataFrameGroupBy.size` and not
:meth:`~pandas.DataFrameGroupBy.count`. This is because :meth:`~pandas.DataFrameGroupBy.count`
applies the function to each column, returning the number of ``not null`` records within each.

.. ipython:: python

tips.groupby('sex').count()

Alternatively, we could have applied the :meth:`~pandas.DataFrameGroupBy.count` method to an
individual column:

.. ipython:: python

tips.groupby('sex')['total_bill'].count()

Multiple functions can also be applied at once. For instance, say we'd like to see how tip amount
differs by day of the week - :meth:`~pandas.DataFrameGroupBy.agg` allows you to pass a dictionary
to your grouped DataFrame, indicating which functions to apply to specific columns.

.. code-block:: sql
@@ -198,7 +198,7 @@ to your grouped DataFrame, indicating which functions to apply to specific colum

tips.groupby('day').agg({'tip': np.mean, 'day': np.size})

Grouping by more than one column is done by passing a list of columns to the
:meth:`~pandas.DataFrame.groupby` method.

.. code-block:: sql
@@ -207,7 +207,7 @@ Grouping by more than one column is done by passing a list of columns to the
FROM tip
GROUP BY smoker, day;
/*
smoker day
No Fri 4 2.812500
Sat 45 3.102889
Sun 57 3.167895
@@ -226,16 +226,16 @@

JOIN
----
JOINs can be performed with :meth:`~pandas.DataFrame.join` or :meth:`~pandas.merge`. By default,
:meth:`~pandas.DataFrame.join` will join the DataFrames on their indices. Each method has
parameters allowing you to specify the type of join to perform (LEFT, RIGHT, INNER, FULL) or the
columns to join on (column names or indices).

.. ipython:: python

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
'value': np.random.randn(4)})
df2 = pd.DataFrame({'key': ['B', 'D', 'D', 'E'],
'value': np.random.randn(4)})

Assume we have two database tables of the same name and structure as our DataFrames.
@@ -256,7 +256,7 @@ INNER JOIN
# merge performs an INNER JOIN by default
pd.merge(df1, df2, on='key')

:meth:`~pandas.merge` also offers parameters for cases when you'd like to join one DataFrame's
column with another DataFrame's index.

.. ipython:: python
@@ -296,7 +296,7 @@ RIGHT JOIN

FULL JOIN
~~~~~~~~~
pandas also allows for FULL JOINs, which display both sides of the dataset, whether or not the
joined columns find a match. As of this writing, FULL JOINs are not supported in all RDBMS (e.g., MySQL).
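
In pandas, a FULL JOIN is ``merge`` with ``how='outer'``. A sketch, redefining the small ``df1``/``df2`` frames from above so the snippet stands alone:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': np.random.randn(4)})
df2 = pd.DataFrame({'key': ['B', 'D', 'D', 'E'], 'value': np.random.randn(4)})

# FULL (outer) JOIN: keep rows from both sides, matched or not
pd.merge(df1, df2, on='key', how='outer')
```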

.. code-block:: sql
@@ -364,7 +364,7 @@ SQL's UNION is similar to UNION ALL, however UNION will remove duplicate rows.
Los Angeles 5
*/

In pandas, you can use :meth:`~pandas.concat` in conjunction with
:meth:`~pandas.DataFrame.drop_duplicates`.
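
A minimal sketch of the ``concat`` + ``drop_duplicates`` combination, using hypothetical city/rank frames in the spirit of the example:

```python
import pandas as pd

# Hypothetical frames standing in for the two tables being UNIONed
df1 = pd.DataFrame({'city': ['Chicago', 'San Francisco', 'New York City'],
                    'rank': [1, 2, 3]})
df2 = pd.DataFrame({'city': ['Chicago', 'Boston', 'Los Angeles'],
                    'rank': [1, 4, 5]})

# UNION (distinct): concatenate, then drop the duplicate (Chicago, 1) row
pd.concat([df1, df2]).drop_duplicates()
```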

.. ipython:: python
@@ -377,4 +377,4 @@ UPDATE


DELETE
------
2 changes: 1 addition & 1 deletion doc/source/computation.rst
@@ -244,7 +244,7 @@ accept the following arguments:
is min for ``rolling_min``, max for ``rolling_max``, median for
``rolling_median``, and mean for all other rolling functions. See
:meth:`DataFrame.resample`'s how argument for more information.

These functions can be applied to ndarrays or Series objects:

.. ipython:: python
2 changes: 1 addition & 1 deletion doc/source/gotchas.rst
@@ -100,7 +100,7 @@ index, not membership among the values.
2 in s
'b' in s

If this behavior is surprising, keep in mind that using ``in`` on a Python
dictionary tests keys, not values, and Series are dict-like.
To test for membership in the values, use the method :func:`~pandas.Series.isin`:
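
A minimal sketch of the distinction (hypothetical Series, not the document's own example):

```python
import pandas as pd

s = pd.Series(range(5), index=list('abcde'))

'b' in s       # membership among the *index* labels, so this is True
2 in s         # also checks the index: False, since 2 is not a label
s.isin([2])    # element-wise membership among the *values*
```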

4 changes: 2 additions & 2 deletions doc/source/indexing.rst
@@ -216,9 +216,9 @@ new column.
sa
dfa.A = list(range(len(dfa.index))) # ok if A already exists
dfa
dfa['A'] = list(range(len(dfa.index)))  # use this form to create a new column
dfa

.. warning::

- You can use this access only if the index element is a valid python identifier, e.g. ``s.1`` is not allowed.
3 changes: 1 addition & 2 deletions doc/source/missing_data.rst
@@ -598,7 +598,7 @@ You can also operate on the DataFrame in place

.. warning::

When replacing multiple ``bool`` or ``datetime64`` objects, the first
argument to ``replace`` (``to_replace``) must match the type of the value
being replaced. For example,
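
A minimal sketch of the constraint, with hypothetical values:

```python
import pandas as pd

s = pd.Series([True, False, True])

# to_replace (True) matches the bool type of the values being replaced
s.replace(True, False)
```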

@@ -669,4 +669,3 @@ However, these can be filled in using **fillna** and it will work fine:

reindexed[crit.fillna(False)]
reindexed[crit.fillna(True)]

6 changes: 3 additions & 3 deletions doc/source/overview.rst
@@ -99,7 +99,7 @@ resources for development through the end of 2011, and continues to contribute
bug reports today.

Since January 2012, `Lambda Foundry <http://www.lambdafoundry.com>`__, has
been providing development resources, as well as commercial support,
training, and consulting for pandas.

pandas is only made possible by a group of people around the world like you
@@ -114,8 +114,8 @@ collection of developers focused on the improvement of Python's data
libraries. The core team that coordinates development can be found on `Github
<http://github.com/pydata>`__. If you're interested in contributing, please
visit the `project website <http://pandas.pydata.org>`__.

License
-------

.. literalinclude:: ../../LICENSE
4 changes: 2 additions & 2 deletions doc/source/r_interface.rst
@@ -22,9 +22,9 @@ rpy2 / R interface
If your computer has R and rpy2 (> 2.2) installed (which will be left to the
reader), you will be able to leverage the below functionality. On Windows,
doing this is quite an ordeal at the moment, but users on Unix-like systems
should find it quite easy. rpy2 evolves in time, and is currently reaching
its release 2.3, while the current interface is
designed for the 2.2.x series. We recommend using 2.2.x over other series
unless you are prepared to fix parts of the code, yet the rpy2-2.3.0
introduces improvements such as a better R-Python bridge memory management
layer so it might be a good idea to bite the bullet and submit patches for
3 changes: 1 addition & 2 deletions doc/source/reshaping.rst
@@ -266,7 +266,7 @@ It takes a number of arguments
- ``values``: a column or a list of columns to aggregate
- ``index``: a column, Grouper, array which has the same length as data, or list of them.
Keys to group by on the pivot table index. If an array is passed, it is used in the same manner as column values.
- ``columns``: a column, Grouper, array which has the same length as data, or list of them.
Keys to group by on the pivot table column. If an array is passed, it is used in the same manner as column values.
- ``aggfunc``: function to use for aggregation, defaulting to ``numpy.mean``
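
As a short sketch of these arguments working together (hypothetical frame, not the document's own example):

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration
df = pd.DataFrame({'A': ['one', 'one', 'two', 'two'],
                   'C': ['x', 'y', 'x', 'y'],
                   'D': [1.0, 2.0, 3.0, 4.0]})

# index/columns supply the grouping keys; values + aggfunc fill the cells
pd.pivot_table(df, values='D', index=['A'], columns=['C'], aggfunc=np.sum)
```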

@@ -456,4 +456,3 @@ handling of NaN:

pd.factorize(x, sort=True)
np.unique(x, return_inverse=True)[::-1]

2 changes: 1 addition & 1 deletion doc/source/rplot.rst
@@ -45,7 +45,7 @@ We import the rplot API:
Examples
--------

RPlot is a flexible API for producing Trellis plots. These plots allow you to arrange data in a rectangular grid by values of certain attributes.

.. ipython:: python

4 changes: 2 additions & 2 deletions doc/source/tutorials.rst
@@ -22,8 +22,8 @@ are examples with real-world data, and all the bugs and weirdness that
that entails.

Here are links to the v0.1 release. For an up-to-date table of contents, see the `pandas-cookbook GitHub
repository <http://github.com/jvns/pandas-cookbook>`_. To run the examples in this tutorial, you'll need to
clone the GitHub repository and get IPython Notebook running.
See `How to use this cookbook <https://github.com/jvns/pandas-cookbook#how-to-use-this-cookbook>`_.

- `A quick tour of the IPython Notebook: <http://nbviewer.ipython.org/github/jvns/pandas-cookbook/blob/v0.1/cookbook/A%20quick%20tour%20of%20IPython%20Notebook.ipynb>`_