Commit c4802db

Merge remote-tracking branch 'upstream/master' into package-size
2 parents 84ccdbf + b36b451 commit c4802db

91 files changed, with 2343 additions and 1618 deletions

asv_bench/benchmarks/categoricals.py

Lines changed: 17 additions & 0 deletions
@@ -193,3 +193,20 @@ def time_categorical_series_is_monotonic_increasing(self):

     def time_categorical_series_is_monotonic_decreasing(self):
         self.s.is_monotonic_decreasing
+
+
+class Contains(object):
+
+    goal_time = 0.2
+
+    def setup(self):
+        N = 10**5
+        self.ci = tm.makeCategoricalIndex(N)
+        self.c = self.ci.values
+        self.key = self.ci.categories[0]
+
+    def time_categorical_index_contains(self):
+        self.key in self.ci
+
+    def time_categorical_contains(self):
+        self.key in self.c
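
The `Contains` benchmark above times membership tests on a `CategoricalIndex` and on its underlying `Categorical`. A minimal standalone sketch of the same measurement follows. It assumes a plain `pd.CategoricalIndex` built from four made-up categories as a stand-in for `tm.makeCategoricalIndex`, and it uses `timeit` rather than the asv harness, so absolute timings are not comparable to asv results.

```python
import timeit

import pandas as pd

N = 10**5

# Hypothetical stand-in for tm.makeCategoricalIndex(N): four labels repeated.
labels = ["a", "b", "c", "d"]
ci = pd.CategoricalIndex(labels * (N // len(labels)))
c = ci.values              # the underlying Categorical
key = ci.categories[0]     # a key that is known to be present

# Time the two membership tests that the benchmark class exercises.
index_seconds = timeit.timeit(lambda: key in ci, number=10)
values_seconds = timeit.timeit(lambda: key in c, number=10)

print("CategoricalIndex:", index_seconds, "Categorical:", values_seconds)
```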

asv_bench/benchmarks/groupby.py

Lines changed: 20 additions & 1 deletion
@@ -5,7 +5,7 @@

 import numpy as np
 from pandas import (DataFrame, Series, MultiIndex, date_range, period_range,
-                    TimeGrouper, Categorical)
+                    TimeGrouper, Categorical, Timestamp)
 import pandas.util.testing as tm

 from .pandas_vb_common import setup  # noqa
@@ -385,6 +385,25 @@ def time_dtype_as_field(self, dtype, method, application):
         self.as_field_method()


+class RankWithTies(object):
+    # GH 21237
+    goal_time = 0.2
+    param_names = ['dtype', 'tie_method']
+    params = [['float64', 'float32', 'int64', 'datetime64'],
+              ['first', 'average', 'dense', 'min', 'max']]
+
+    def setup(self, dtype, tie_method):
+        N = 10**4
+        if dtype == 'datetime64':
+            data = np.array([Timestamp("2011/01/01")] * N, dtype=dtype)
+        else:
+            data = np.array([1] * N, dtype=dtype)
+        self.df = DataFrame({'values': data, 'key': ['foo'] * N})
+
+    def time_rank_ties(self, dtype, tie_method):
+        self.df.groupby('key').rank(method=tie_method)
+
+
 class Float32(object):
     # GH 13335
     goal_time = 0.2
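
`RankWithTies` ranks a single group in which every value is identical, so each tie-breaking method is exercised on the worst case of one large tie. The sketch below illustrates what each `method` returns for such data. It covers only the `float64` case, and the `ranks` dictionary is an illustrative name that does not appear in the benchmark.

```python
import numpy as np
import pandas as pd

N = 10**4

# One group ('foo') whose values are all equal, as in the benchmark setup.
df = pd.DataFrame({"values": np.ones(N, dtype="float64"), "key": ["foo"] * N})

# Rank the tied values once per tie-breaking method.
ranks = {
    method: df.groupby("key")["values"].rank(method=method)
    for method in ["first", "average", "dense", "min", "max"]
}

# 'first' numbers the ties in order of appearance (1..N), 'average' gives
# every row (N + 1) / 2, 'dense' and 'min' give 1, and 'max' gives N.
print({method: (r.min(), r.max()) for method, r in ranks.items()})
```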

doc/source/_static/favicon.ico

3.81 KB
Binary file not shown.

doc/source/api.rst

Lines changed: 3 additions & 3 deletions
@@ -1200,9 +1200,9 @@ Attributes and underlying data
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 **Axes**

-* **items**: axis 0; each item corresponds to a DataFrame contained inside
-* **major_axis**: axis 1; the index (rows) of each of the DataFrames
-* **minor_axis**: axis 2; the columns of each of the DataFrames
+* **items**: axis 0; each item corresponds to a DataFrame contained inside
+* **major_axis**: axis 1; the index (rows) of each of the DataFrames
+* **minor_axis**: axis 2; the columns of each of the DataFrames

 .. autosummary::
    :toctree: generated/

doc/source/basics.rst

Lines changed: 21 additions & 22 deletions
@@ -50,9 +50,8 @@ Attributes and the raw ndarray(s)

 pandas objects have a number of attributes enabling you to access the metadata

-* **shape**: gives the axis dimensions of the object, consistent with ndarray
-* Axis labels
-
+* **shape**: gives the axis dimensions of the object, consistent with ndarray
+* Axis labels
     * **Series**: *index* (only axis)
     * **DataFrame**: *index* (rows) and *columns*
     * **Panel**: *items*, *major_axis*, and *minor_axis*
@@ -131,9 +130,9 @@ Flexible binary operations
 With binary operations between pandas data structures, there are two key points
 of interest:

-* Broadcasting behavior between higher- (e.g. DataFrame) and
-  lower-dimensional (e.g. Series) objects.
-* Missing data in computations.
+* Broadcasting behavior between higher- (e.g. DataFrame) and
+  lower-dimensional (e.g. Series) objects.
+* Missing data in computations.

 We will demonstrate how to manage these issues independently, though they can
 be handled simultaneously.
@@ -462,10 +461,10 @@ produce an object of the same size. Generally speaking, these methods take an
 **axis** argument, just like *ndarray.{sum, std, ...}*, but the axis can be
 specified by name or integer:

-- **Series**: no axis argument needed
-- **DataFrame**: "index" (axis=0, default), "columns" (axis=1)
-- **Panel**: "items" (axis=0), "major" (axis=1, default), "minor"
-  (axis=2)
+* **Series**: no axis argument needed
+* **DataFrame**: "index" (axis=0, default), "columns" (axis=1)
+* **Panel**: "items" (axis=0), "major" (axis=1, default), "minor"
+  (axis=2)

 For example:
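
The equivalence between axis names and axis integers described in this hunk can be sketched for the DataFrame case as follows. The frame and its column names are made up for illustration, and Panel is left out because it no longer exists in current pandas releases.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Reduce down the rows: one result per column.
by_name = df.sum(axis="index")
by_int = df.sum(axis=0)

# Reduce across the columns: one result per row.
per_row_by_name = df.sum(axis="columns")
per_row_by_int = df.sum(axis=1)

print(by_name.equals(by_int), per_row_by_name.equals(per_row_by_int))
```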

@@ -1187,11 +1186,11 @@ It is used to implement nearly all other features relying on label-alignment
 functionality. To *reindex* means to conform the data to match a given set of
 labels along a particular axis. This accomplishes several things:

-* Reorders the existing data to match a new set of labels
-* Inserts missing value (NA) markers in label locations where no data for
-  that label existed
-* If specified, **fill** data for missing labels using logic (highly relevant
-  to working with time series data)
+* Reorders the existing data to match a new set of labels
+* Inserts missing value (NA) markers in label locations where no data for
+  that label existed
+* If specified, **fill** data for missing labels using logic (highly relevant
+  to working with time series data)

 Here is a simple example:
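
The three effects of reindexing listed in this hunk can be sketched on small Series. The data and labels below are illustrative only and are not the example used in the pandas documentation itself.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

# 1. Reorder existing data to match a new set of labels.
reordered = s.reindex(["c", "a", "b"])

# 2. Labels with no data ('d') receive a missing-value marker (NaN).
with_missing = s.reindex(["a", "b", "d"])

# 3a. Fill missing labels with a constant.
filled = s.reindex(["a", "b", "d"], fill_value=0.0)

# 3b. Fill using logic: forward-fill requires a monotonic index.
ts = pd.Series([1.0, 2.0], index=[0, 2])
forward_filled = ts.reindex([0, 1, 2, 3], method="ffill")

print(reordered, with_missing, filled, forward_filled, sep="\n\n")
```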

@@ -1911,10 +1910,10 @@ the axis indexes, since they are immutable) and returns a new object. Note that
 **it is seldom necessary to copy objects**. For example, there are only a
 handful of ways to alter a DataFrame *in-place*:

-* Inserting, deleting, or modifying a column.
-* Assigning to the ``index`` or ``columns`` attributes.
-* For homogeneous data, directly modifying the values via the ``values``
-  attribute or advanced indexing.
+* Inserting, deleting, or modifying a column.
+* Assigning to the ``index`` or ``columns`` attributes.
+* For homogeneous data, directly modifying the values via the ``values``
+  attribute or advanced indexing.

 To be clear, no pandas method has the side effect of modifying your data;
 almost every method returns a new object, leaving the original object
@@ -2112,22 +2111,22 @@ Because the data was transposed the original inference stored all columns as obj
 The following functions are available for one dimensional object arrays or scalars to perform
 hard conversion of objects to a specified type:

-- :meth:`~pandas.to_numeric` (conversion to numeric dtypes)
+* :meth:`~pandas.to_numeric` (conversion to numeric dtypes)

   .. ipython:: python

      m = ['1.1', 2, 3]
      pd.to_numeric(m)

-- :meth:`~pandas.to_datetime` (conversion to datetime objects)
+* :meth:`~pandas.to_datetime` (conversion to datetime objects)

   .. ipython:: python

      import datetime
      m = ['2016-07-09', datetime.datetime(2016, 3, 2)]
      pd.to_datetime(m)

-- :meth:`~pandas.to_timedelta` (conversion to timedelta objects)
+* :meth:`~pandas.to_timedelta` (conversion to timedelta objects)

   .. ipython:: python

doc/source/categorical.rst

Lines changed: 5 additions & 5 deletions
@@ -542,11 +542,11 @@ Comparisons

 Comparing categorical data with other objects is possible in three cases:

-* Comparing equality (``==`` and ``!=``) to a list-like object (list, Series, array,
-  ...) of the same length as the categorical data.
-* All comparisons (``==``, ``!=``, ``>``, ``>=``, ``<``, and ``<=``) of categorical data to
-  another categorical Series, when ``ordered==True`` and the `categories` are the same.
-* All comparisons of a categorical data to a scalar.
+* Comparing equality (``==`` and ``!=``) to a list-like object (list, Series, array,
+  ...) of the same length as the categorical data.
+* All comparisons (``==``, ``!=``, ``>``, ``>=``, ``<``, and ``<=``) of categorical data to
+  another categorical Series, when ``ordered==True`` and the `categories` are the same.
+* All comparisons of a categorical data to a scalar.

 All other comparisons, especially "non-equality" comparisons of two categoricals with different
 categories or a categorical with any list-like object, will raise a ``TypeError``.
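
The three permitted comparison cases, and one of the rejected ones, can be sketched as follows. The category values and variable names are made up for illustration. The scalar `2` is deliberately chosen from the categories, since ordering comparisons against a scalar outside them are not covered here.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

dtype = CategoricalDtype(categories=[1, 2, 3], ordered=True)
cat = pd.Series([1, 2, 3], dtype=dtype)
other = pd.Series([2, 2, 2], dtype=dtype)

# Case 1: equality against a list-like of the same length.
eq_list = cat == [1, 5, 3]

# Case 2: ordering comparison with another ordered categorical Series
# that has the same categories.
gt_cat = cat > other

# Case 3: comparison against a scalar.
gt_scalar = cat > 2

# Ordering comparison between categoricals with different categories
# is expected to raise a TypeError.
different = pd.Series([2, 2, 2], dtype=CategoricalDtype([2, 3, 4], ordered=True))
try:
    cat > different
    raised = False
except TypeError:
    raised = True

print(eq_list.tolist(), gt_cat.tolist(), gt_scalar.tolist(), raised)
```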

doc/source/comparison_with_r.rst

Lines changed: 5 additions & 5 deletions
@@ -18,11 +18,11 @@ was started to provide a more detailed look at the `R language
 party libraries as they relate to ``pandas``. In comparisons with R and CRAN
 libraries, we care about the following things:

-- **Functionality / flexibility**: what can/cannot be done with each tool
-- **Performance**: how fast are operations. Hard numbers/benchmarks are
-  preferable
-- **Ease-of-use**: Is one tool easier/harder to use (you may have to be
-  the judge of this, given side-by-side code comparisons)
+* **Functionality / flexibility**: what can/cannot be done with each tool
+* **Performance**: how fast are operations. Hard numbers/benchmarks are
+  preferable
+* **Ease-of-use**: Is one tool easier/harder to use (you may have to be
+  the judge of this, given side-by-side code comparisons)

 This page is also here to offer a bit of a translation guide for users of these
 R packages.

doc/source/computation.rst

Lines changed: 23 additions & 23 deletions
@@ -344,20 +344,20 @@ The weights used in the window are specified by the ``win_type`` keyword.
 The list of recognized types are the `scipy.signal window functions
 <https://docs.scipy.org/doc/scipy/reference/signal.html#window-functions>`__:

-- ``boxcar``
-- ``triang``
-- ``blackman``
-- ``hamming``
-- ``bartlett``
-- ``parzen``
-- ``bohman``
-- ``blackmanharris``
-- ``nuttall``
-- ``barthann``
-- ``kaiser`` (needs beta)
-- ``gaussian`` (needs std)
-- ``general_gaussian`` (needs power, width)
-- ``slepian`` (needs width).
+* ``boxcar``
+* ``triang``
+* ``blackman``
+* ``hamming``
+* ``bartlett``
+* ``parzen``
+* ``bohman``
+* ``blackmanharris``
+* ``nuttall``
+* ``barthann``
+* ``kaiser`` (needs beta)
+* ``gaussian`` (needs std)
+* ``general_gaussian`` (needs power, width)
+* ``slepian`` (needs width).

 .. ipython:: python

@@ -537,10 +537,10 @@ Binary Window Functions
 two ``Series`` or any combination of ``DataFrame/Series`` or
 ``DataFrame/DataFrame``. Here is the behavior in each case:

-- two ``Series``: compute the statistic for the pairing.
-- ``DataFrame/Series``: compute the statistics for each column of the DataFrame
+* two ``Series``: compute the statistic for the pairing.
+* ``DataFrame/Series``: compute the statistics for each column of the DataFrame
   with the passed Series, thus returning a DataFrame.
-- ``DataFrame/DataFrame``: by default compute the statistic for matching column
+* ``DataFrame/DataFrame``: by default compute the statistic for matching column
   names, returning a DataFrame. If the keyword argument ``pairwise=True`` is
   passed then computes the statistic for each pair of columns, returning a
   ``MultiIndexed DataFrame`` whose ``index`` are the dates in question (see :ref:`the next section
@@ -741,10 +741,10 @@ Aside from not having a ``window`` parameter, these functions have the same
 interfaces as their ``.rolling`` counterparts. Like above, the parameters they
 all accept are:

-- ``min_periods``: threshold of non-null data points to require. Defaults to
+* ``min_periods``: threshold of non-null data points to require. Defaults to
   minimum needed to compute statistic. No ``NaNs`` will be output once
   ``min_periods`` non-null data points have been seen.
-- ``center``: boolean, whether to set the labels at the center (default is False).
+* ``center``: boolean, whether to set the labels at the center (default is False).

 .. _stats.moments.expanding.note:
 .. note::
@@ -903,12 +903,12 @@ of an EW moment:
 One must specify precisely one of **span**, **center of mass**, **half-life**
 and **alpha** to the EW functions:

-- **Span** corresponds to what is commonly called an "N-day EW moving average".
-- **Center of mass** has a more physical interpretation and can be thought of
+* **Span** corresponds to what is commonly called an "N-day EW moving average".
+* **Center of mass** has a more physical interpretation and can be thought of
   in terms of span: :math:`c = (s - 1) / 2`.
-- **Half-life** is the period of time for the exponential weight to reduce to
+* **Half-life** is the period of time for the exponential weight to reduce to
   one half.
-- **Alpha** specifies the smoothing factor directly.
+* **Alpha** specifies the smoothing factor directly.

 Here is an example for a univariate time series:
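
The four parameterisations in this hunk describe the same decay, so equivalent settings should yield the same result. The sketch below checks that on a made-up series. The relation `c = (s - 1) / 2` comes from the text above. The conversions `alpha = 2 / (s + 1)` and `alpha = 1 - exp(log(0.5) / h)` are not part of this excerpt and are assumed here from the pandas exponentially weighted window documentation.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 4.0, 8.0, 16.0])

span = 5
com = (span - 1) / 2                          # c = (s - 1) / 2, as stated above
alpha = 2 / (span + 1)                        # assumed span-to-alpha relation
halflife = np.log(0.5) / np.log(1 - alpha)    # assumed inverse of the half-life relation

# Exactly one decay parameter is passed per call.
by_span = s.ewm(span=span).mean()
by_com = s.ewm(com=com).mean()
by_alpha = s.ewm(alpha=alpha).mean()
by_halflife = s.ewm(halflife=halflife).mean()

print(np.allclose(by_span, by_com), np.allclose(by_span, by_alpha),
      np.allclose(by_span, by_halflife))
```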

doc/source/conf.py

Lines changed: 5 additions & 5 deletions
@@ -213,16 +213,16 @@
 # of the sidebar.
 # html_logo = None

-# The name of an image file (within the static path) to use as favicon of the
-# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
-# pixels large.
-# html_favicon = None
-
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ['_static']

+# The name of an image file (within the static path) to use as favicon of the
+# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
+# pixels large.
+html_favicon = os.path.join(html_static_path[0], 'favicon.ico')
+
 # If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
 # using the given strftime format.
 # html_last_updated_fmt = '%b %d, %Y'
