Commit 726755f

Merging upstream/master into my dev branch
# Conflicts:
#   doc/source/whatsnew/v0.23.2.txt
2 parents 53aca81 + 506935c commit 726755f

67 files changed, +1448 -983 lines changed

.gitignore

+1
Original file line number | Diff line number | Diff line change
@@ -110,3 +110,4 @@ doc/source/styled.xlsx
110110
doc/source/templates/
111111
env/
112112
doc/source/savefig/
113+
*my-dev-test.py

asv_bench/benchmarks/categoricals.py

+7-3
@@ -202,7 +202,11 @@ class Contains(object):
202202
def setup(self):
203203
N = 10**5
204204
self.ci = tm.makeCategoricalIndex(N)
205-
self.cat = self.ci.categories[0]
205+
self.c = self.ci.values
206+
self.key = self.ci.categories[0]
206207

207-
def time_contains(self):
208-
self.cat in self.ci
208+
def time_categorical_index_contains(self):
209+
self.key in self.ci
210+
211+
def time_categorical_contains(self):
212+
self.key in self.c

doc/source/api.rst

+3-3
@@ -1200,9 +1200,9 @@ Attributes and underlying data
12001200
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12011201
**Axes**
12021202

1203-
* **items**: axis 0; each item corresponds to a DataFrame contained inside
1204-
* **major_axis**: axis 1; the index (rows) of each of the DataFrames
1205-
* **minor_axis**: axis 2; the columns of each of the DataFrames
1203+
* **items**: axis 0; each item corresponds to a DataFrame contained inside
1204+
* **major_axis**: axis 1; the index (rows) of each of the DataFrames
1205+
* **minor_axis**: axis 2; the columns of each of the DataFrames
12061206

12071207
.. autosummary::
12081208
:toctree: generated/

doc/source/basics.rst

+21-22
@@ -50,9 +50,8 @@ Attributes and the raw ndarray(s)
5050

5151
pandas objects have a number of attributes enabling you to access the metadata
5252

53-
* **shape**: gives the axis dimensions of the object, consistent with ndarray
54-
* Axis labels
55-
53+
* **shape**: gives the axis dimensions of the object, consistent with ndarray
54+
* Axis labels
5655
* **Series**: *index* (only axis)
5756
* **DataFrame**: *index* (rows) and *columns*
5857
* **Panel**: *items*, *major_axis*, and *minor_axis*
@@ -131,9 +130,9 @@ Flexible binary operations
131130
With binary operations between pandas data structures, there are two key points
132131
of interest:
133132

134-
* Broadcasting behavior between higher- (e.g. DataFrame) and
135-
lower-dimensional (e.g. Series) objects.
136-
* Missing data in computations.
133+
* Broadcasting behavior between higher- (e.g. DataFrame) and
134+
lower-dimensional (e.g. Series) objects.
135+
* Missing data in computations.
137136

138137
We will demonstrate how to manage these issues independently, though they can
139138
be handled simultaneously.
@@ -462,10 +461,10 @@ produce an object of the same size. Generally speaking, these methods take an
462461
**axis** argument, just like *ndarray.{sum, std, ...}*, but the axis can be
463462
specified by name or integer:
464463

465-
- **Series**: no axis argument needed
466-
- **DataFrame**: "index" (axis=0, default), "columns" (axis=1)
467-
- **Panel**: "items" (axis=0), "major" (axis=1, default), "minor"
468-
(axis=2)
464+
* **Series**: no axis argument needed
465+
* **DataFrame**: "index" (axis=0, default), "columns" (axis=1)
466+
* **Panel**: "items" (axis=0), "major" (axis=1, default), "minor"
467+
(axis=2)
469468

470469
For example:
471470

@@ -1187,11 +1186,11 @@ It is used to implement nearly all other features relying on label-alignment
11871186
functionality. To *reindex* means to conform the data to match a given set of
11881187
labels along a particular axis. This accomplishes several things:
11891188

1190-
* Reorders the existing data to match a new set of labels
1191-
* Inserts missing value (NA) markers in label locations where no data for
1192-
that label existed
1193-
* If specified, **fill** data for missing labels using logic (highly relevant
1194-
to working with time series data)
1189+
* Reorders the existing data to match a new set of labels
1190+
* Inserts missing value (NA) markers in label locations where no data for
1191+
that label existed
1192+
* If specified, **fill** data for missing labels using logic (highly relevant
1193+
to working with time series data)
11951194

11961195
Here is a simple example:
11971196

@@ -1911,10 +1910,10 @@ the axis indexes, since they are immutable) and returns a new object. Note that
19111910
**it is seldom necessary to copy objects**. For example, there are only a
19121911
handful of ways to alter a DataFrame *in-place*:
19131912

1914-
* Inserting, deleting, or modifying a column.
1915-
* Assigning to the ``index`` or ``columns`` attributes.
1916-
* For homogeneous data, directly modifying the values via the ``values``
1917-
attribute or advanced indexing.
1913+
* Inserting, deleting, or modifying a column.
1914+
* Assigning to the ``index`` or ``columns`` attributes.
1915+
* For homogeneous data, directly modifying the values via the ``values``
1916+
attribute or advanced indexing.
19181917

19191918
To be clear, no pandas method has the side effect of modifying your data;
19201919
almost every method returns a new object, leaving the original object
@@ -2112,22 +2111,22 @@ Because the data was transposed the original inference stored all columns as obj
21122111
The following functions are available for one dimensional object arrays or scalars to perform
21132112
hard conversion of objects to a specified type:
21142113

2115-
- :meth:`~pandas.to_numeric` (conversion to numeric dtypes)
2114+
* :meth:`~pandas.to_numeric` (conversion to numeric dtypes)
21162115

21172116
.. ipython:: python
21182117
21192118
m = ['1.1', 2, 3]
21202119
pd.to_numeric(m)
21212120
2122-
- :meth:`~pandas.to_datetime` (conversion to datetime objects)
2121+
* :meth:`~pandas.to_datetime` (conversion to datetime objects)
21232122

21242123
.. ipython:: python
21252124
21262125
import datetime
21272126
m = ['2016-07-09', datetime.datetime(2016, 3, 2)]
21282127
pd.to_datetime(m)
21292128
2130-
- :meth:`~pandas.to_timedelta` (conversion to timedelta objects)
2129+
* :meth:`~pandas.to_timedelta` (conversion to timedelta objects)
21312130

21322131
.. ipython:: python
21332132

doc/source/categorical.rst

+5-5
@@ -542,11 +542,11 @@ Comparisons
542542

543543
Comparing categorical data with other objects is possible in three cases:
544544

545-
* Comparing equality (``==`` and ``!=``) to a list-like object (list, Series, array,
546-
...) of the same length as the categorical data.
547-
* All comparisons (``==``, ``!=``, ``>``, ``>=``, ``<``, and ``<=``) of categorical data to
548-
another categorical Series, when ``ordered==True`` and the `categories` are the same.
549-
* All comparisons of a categorical data to a scalar.
545+
* Comparing equality (``==`` and ``!=``) to a list-like object (list, Series, array,
546+
...) of the same length as the categorical data.
547+
* All comparisons (``==``, ``!=``, ``>``, ``>=``, ``<``, and ``<=``) of categorical data to
548+
another categorical Series, when ``ordered==True`` and the `categories` are the same.
549+
* All comparisons of a categorical data to a scalar.
550550

551551
All other comparisons, especially "non-equality" comparisons of two categoricals with different
552552
categories or a categorical with any list-like object, will raise a ``TypeError``.

doc/source/comparison_with_r.rst

+5-5
@@ -18,11 +18,11 @@ was started to provide a more detailed look at the `R language
1818
party libraries as they relate to ``pandas``. In comparisons with R and CRAN
1919
libraries, we care about the following things:
2020

21-
- **Functionality / flexibility**: what can/cannot be done with each tool
22-
- **Performance**: how fast are operations. Hard numbers/benchmarks are
23-
preferable
24-
- **Ease-of-use**: Is one tool easier/harder to use (you may have to be
25-
the judge of this, given side-by-side code comparisons)
21+
* **Functionality / flexibility**: what can/cannot be done with each tool
22+
* **Performance**: how fast are operations. Hard numbers/benchmarks are
23+
preferable
24+
* **Ease-of-use**: Is one tool easier/harder to use (you may have to be
25+
the judge of this, given side-by-side code comparisons)
2626

2727
This page is also here to offer a bit of a translation guide for users of these
2828
R packages.

doc/source/computation.rst

+23-23
@@ -344,20 +344,20 @@ The weights used in the window are specified by the ``win_type`` keyword.
344344
The list of recognized types are the `scipy.signal window functions
345345
<https://docs.scipy.org/doc/scipy/reference/signal.html#window-functions>`__:
346346

347-
- ``boxcar``
348-
- ``triang``
349-
- ``blackman``
350-
- ``hamming``
351-
- ``bartlett``
352-
- ``parzen``
353-
- ``bohman``
354-
- ``blackmanharris``
355-
- ``nuttall``
356-
- ``barthann``
357-
- ``kaiser`` (needs beta)
358-
- ``gaussian`` (needs std)
359-
- ``general_gaussian`` (needs power, width)
360-
- ``slepian`` (needs width).
347+
* ``boxcar``
348+
* ``triang``
349+
* ``blackman``
350+
* ``hamming``
351+
* ``bartlett``
352+
* ``parzen``
353+
* ``bohman``
354+
* ``blackmanharris``
355+
* ``nuttall``
356+
* ``barthann``
357+
* ``kaiser`` (needs beta)
358+
* ``gaussian`` (needs std)
359+
* ``general_gaussian`` (needs power, width)
360+
* ``slepian`` (needs width).
361361

362362
.. ipython:: python
363363
@@ -537,10 +537,10 @@ Binary Window Functions
537537
two ``Series`` or any combination of ``DataFrame/Series`` or
538538
``DataFrame/DataFrame``. Here is the behavior in each case:
539539

540-
- two ``Series``: compute the statistic for the pairing.
541-
- ``DataFrame/Series``: compute the statistics for each column of the DataFrame
540+
* two ``Series``: compute the statistic for the pairing.
541+
* ``DataFrame/Series``: compute the statistics for each column of the DataFrame
542542
with the passed Series, thus returning a DataFrame.
543-
- ``DataFrame/DataFrame``: by default compute the statistic for matching column
543+
* ``DataFrame/DataFrame``: by default compute the statistic for matching column
544544
names, returning a DataFrame. If the keyword argument ``pairwise=True`` is
545545
passed then computes the statistic for each pair of columns, returning a
546546
``MultiIndexed DataFrame`` whose ``index`` are the dates in question (see :ref:`the next section
@@ -741,10 +741,10 @@ Aside from not having a ``window`` parameter, these functions have the same
741741
interfaces as their ``.rolling`` counterparts. Like above, the parameters they
742742
all accept are:
743743

744-
- ``min_periods``: threshold of non-null data points to require. Defaults to
744+
* ``min_periods``: threshold of non-null data points to require. Defaults to
745745
minimum needed to compute statistic. No ``NaNs`` will be output once
746746
``min_periods`` non-null data points have been seen.
747-
- ``center``: boolean, whether to set the labels at the center (default is False).
747+
* ``center``: boolean, whether to set the labels at the center (default is False).
748748

749749
.. _stats.moments.expanding.note:
750750
.. note::
@@ -903,12 +903,12 @@ of an EW moment:
903903
One must specify precisely one of **span**, **center of mass**, **half-life**
904904
and **alpha** to the EW functions:
905905

906-
- **Span** corresponds to what is commonly called an "N-day EW moving average".
907-
- **Center of mass** has a more physical interpretation and can be thought of
906+
* **Span** corresponds to what is commonly called an "N-day EW moving average".
907+
* **Center of mass** has a more physical interpretation and can be thought of
908908
in terms of span: :math:`c = (s - 1) / 2`.
909-
- **Half-life** is the period of time for the exponential weight to reduce to
909+
* **Half-life** is the period of time for the exponential weight to reduce to
910910
one half.
911-
- **Alpha** specifies the smoothing factor directly.
911+
* **Alpha** specifies the smoothing factor directly.
912912

913913
Here is an example for a univariate time series:
914914
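
The four decay parameters listed in the hunk above (span, center of mass, half-life, alpha) all reduce to a single smoothing factor. The following is a minimal sketch of that conversion, assuming the usual relations `alpha = 2 / (span + 1)`, `alpha = 1 / (1 + com)` and `alpha = 1 - exp(ln(0.5) / halflife)`. The first two agree with the `c = (s - 1) / 2` relation quoted in the text. The helper name `ew_alpha` is hypothetical and is not a pandas function.

```python
import math


def ew_alpha(span=None, com=None, halflife=None, alpha=None):
    """Return the smoothing factor implied by exactly one decay parameter.

    Hypothetical helper for illustration only; not part of pandas.
    """
    given = [p for p in (span, com, halflife, alpha) if p is not None]
    if len(given) != 1:
        raise ValueError("specify exactly one of span, com, halflife, alpha")

    if span is not None:
        if span < 1:
            raise ValueError("span must be >= 1")
        return 2.0 / (span + 1.0)
    if com is not None:
        if com < 0:
            raise ValueError("com must be >= 0")
        return 1.0 / (1.0 + com)
    if halflife is not None:
        if halflife <= 0:
            raise ValueError("halflife must be > 0")
        # Weight decays to one half after `halflife` periods.
        return 1.0 - math.exp(math.log(0.5) / halflife)
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    return float(alpha)


# A span of 9 and a center of mass of (9 - 1) / 2 = 4 describe the same decay.
print(ew_alpha(span=9), ew_alpha(com=4))
```

Passing more than one parameter raises `ValueError`, mirroring the "precisely one of" requirement stated in the text.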

doc/source/contributing.rst

+35-35
@@ -138,11 +138,11 @@ steps; you only need to install the compiler.
138138

139139
For Windows developers, the following links may be helpful.
140140

141-
- https://blogs.msdn.microsoft.com/pythonengineering/2016/04/11/unable-to-find-vcvarsall-bat/
142-
- https://github.com/conda/conda-recipes/wiki/Building-from-Source-on-Windows-32-bit-and-64-bit
143-
- https://cowboyprogrammer.org/building-python-wheels-for-windows/
144-
- https://blog.ionelmc.ro/2014/12/21/compiling-python-extensions-on-windows/
145-
- https://support.enthought.com/hc/en-us/articles/204469260-Building-Python-extensions-with-Canopy
141+
* https://blogs.msdn.microsoft.com/pythonengineering/2016/04/11/unable-to-find-vcvarsall-bat/
142+
* https://github.com/conda/conda-recipes/wiki/Building-from-Source-on-Windows-32-bit-and-64-bit
143+
* https://cowboyprogrammer.org/building-python-wheels-for-windows/
144+
* https://blog.ionelmc.ro/2014/12/21/compiling-python-extensions-on-windows/
145+
* https://support.enthought.com/hc/en-us/articles/204469260-Building-Python-extensions-with-Canopy
146146

147147
Let us know if you have any difficulties by opening an issue or reaching out on
148148
`Gitter`_.
@@ -155,11 +155,11 @@ Creating a Python Environment
155155
Now that you have a C compiler, create an isolated pandas development
156156
environment:
157157

158-
- Install either `Anaconda <https://www.anaconda.com/download/>`_ or `miniconda
158+
* Install either `Anaconda <https://www.anaconda.com/download/>`_ or `miniconda
159159
<https://conda.io/miniconda.html>`_
160-
- Make sure your conda is up to date (``conda update conda``)
161-
- Make sure that you have :ref:`cloned the repository <contributing.forking>`
162-
- ``cd`` to the *pandas* source directory
160+
* Make sure your conda is up to date (``conda update conda``)
161+
* Make sure that you have :ref:`cloned the repository <contributing.forking>`
162+
* ``cd`` to the *pandas* source directory
163163

164164
We'll now kick off a three-step process:
165165

@@ -286,15 +286,15 @@ complex changes to the documentation as well.
286286

287287
Some other important things to know about the docs:
288288

289-
- The *pandas* documentation consists of two parts: the docstrings in the code
289+
* The *pandas* documentation consists of two parts: the docstrings in the code
290290
itself and the docs in this folder ``pandas/doc/``.
291291

292292
The docstrings provide a clear explanation of the usage of the individual
293293
functions, while the documentation in this folder consists of tutorial-like
294294
overviews per topic together with some other information (what's new,
295295
installation, etc).
296296

297-
- The docstrings follow a pandas convention, based on the **Numpy Docstring
297+
* The docstrings follow a pandas convention, based on the **Numpy Docstring
298298
Standard**. Follow the :ref:`pandas docstring guide <docstring>` for detailed
299299
instructions on how to write a correct docstring.
300300

@@ -303,7 +303,7 @@ Some other important things to know about the docs:
303303

304304
contributing_docstring.rst
305305

306-
- The tutorials make heavy use of the `ipython directive
306+
* The tutorials make heavy use of the `ipython directive
307307
<http://matplotlib.org/sampledoc/ipython_directive.html>`_ sphinx extension.
308308
This directive lets you put code in the documentation which will be run
309309
during the doc build. For example::
@@ -324,7 +324,7 @@ Some other important things to know about the docs:
324324
doc build. This approach means that code examples will always be up to date,
325325
but it does make the doc building a bit more complex.
326326

327-
- Our API documentation in ``doc/source/api.rst`` houses the auto-generated
327+
* Our API documentation in ``doc/source/api.rst`` houses the auto-generated
328328
documentation from the docstrings. For classes, there are a few subtleties
329329
around controlling which methods and attributes have pages auto-generated.
330330

@@ -488,8 +488,8 @@ standard. Google provides an open source style checker called ``cpplint``, but w
488488
use a fork of it that can be found `here <https://github.com/cpplint/cpplint>`__.
489489
Here are *some* of the more common ``cpplint`` issues:
490490

491-
- we restrict line-length to 80 characters to promote readability
492-
- every header file must include a header guard to avoid name collisions if re-included
491+
* we restrict line-length to 80 characters to promote readability
492+
* every header file must include a header guard to avoid name collisions if re-included
493493

494494
:ref:`Continuous Integration <contributing.ci>` will run the
495495
`cpplint <https://pypi.org/project/cpplint>`_ tool
@@ -536,8 +536,8 @@ Python (PEP8)
536536
There are several tools to ensure you abide by this standard. Here are *some* of
537537
the more common ``PEP8`` issues:
538538

539-
- we restrict line-length to 79 characters to promote readability
540-
- passing arguments should have spaces after commas, e.g. ``foo(arg1, arg2, kw1='bar')``
539+
* we restrict line-length to 79 characters to promote readability
540+
* passing arguments should have spaces after commas, e.g. ``foo(arg1, arg2, kw1='bar')``
541541

542542
:ref:`Continuous Integration <contributing.ci>` will run
543543
the `flake8 <https://pypi.org/project/flake8>`_ tool
@@ -715,14 +715,14 @@ Using ``pytest``
715715

716716
Here is an example of a self-contained set of tests that illustrate multiple features that we like to use.
717717

718-
- functional style: tests are like ``test_*`` and *only* take arguments that are either fixtures or parameters
719-
- ``pytest.mark`` can be used to set metadata on test functions, e.g. ``skip`` or ``xfail``.
720-
- using ``parametrize``: allow testing of multiple cases
721-
- to set a mark on a parameter, ``pytest.param(..., marks=...)`` syntax should be used
722-
- ``fixture``, code for object construction, on a per-test basis
723-
- using bare ``assert`` for scalars and truth-testing
724-
- ``tm.assert_series_equal`` (and its counter part ``tm.assert_frame_equal``), for pandas object comparisons.
725-
- the typical pattern of constructing an ``expected`` and comparing versus the ``result``
718+
* functional style: tests are like ``test_*`` and *only* take arguments that are either fixtures or parameters
719+
* ``pytest.mark`` can be used to set metadata on test functions, e.g. ``skip`` or ``xfail``.
720+
* using ``parametrize``: allow testing of multiple cases
721+
* to set a mark on a parameter, ``pytest.param(..., marks=...)`` syntax should be used
722+
* ``fixture``, code for object construction, on a per-test basis
723+
* using bare ``assert`` for scalars and truth-testing
724+
* ``tm.assert_series_equal`` (and its counter part ``tm.assert_frame_equal``), for pandas object comparisons.
725+
* the typical pattern of constructing an ``expected`` and comparing versus the ``result``
726726

727727
We would name this file ``test_cool_feature.py`` and put in an appropriate place in the ``pandas/tests/`` structure.
728728

@@ -969,21 +969,21 @@ Finally, commit your changes to your local repository with an explanatory messag
969969
uses a convention for commit message prefixes and layout. Here are
970970
some common prefixes along with general guidelines for when to use them:
971971
972-
* ENH: Enhancement, new functionality
973-
* BUG: Bug fix
974-
* DOC: Additions/updates to documentation
975-
* TST: Additions/updates to tests
976-
* BLD: Updates to the build process/scripts
977-
* PERF: Performance improvement
978-
* CLN: Code cleanup
972+
* ENH: Enhancement, new functionality
973+
* BUG: Bug fix
974+
* DOC: Additions/updates to documentation
975+
* TST: Additions/updates to tests
976+
* BLD: Updates to the build process/scripts
977+
* PERF: Performance improvement
978+
* CLN: Code cleanup
979979
980980
The following defines how a commit message should be structured. Please reference the
981981
relevant GitHub issues in your commit message using GH1234 or #1234. Either style
982982
is fine, but the former is generally preferred:
983983
984-
* a subject line with `< 80` chars.
985-
* One blank line.
986-
* Optionally, a commit message body.
984+
* a subject line with `< 80` chars.
985+
* One blank line.
986+
* Optionally, a commit message body.
987987
988988
Now you can commit your changes in your local repository::
989989
