Commit cfcb002

Merge branch 'master' of https://github.com/pandas-dev/pandas into tslibs-parsing
2 parents 6db3e3c + 328c7e1 commit cfcb002

100 files changed, +2129 -884 lines changed

appveyor.yml

+6
Original file line number | Diff line number | Diff line change
@@ -74,12 +74,18 @@ install:
7474
# create our env
7575
- cmd: conda create -n pandas python=%PYTHON_VERSION% cython pytest>=3.1.0 pytest-xdist
7676
- cmd: activate pandas
77+
- cmd: pip install moto
7778
- SET REQ=ci\requirements-%PYTHON_VERSION%_WIN.run
7879
- cmd: echo "installing requirements from %REQ%"
7980
- cmd: conda install -n pandas --file=%REQ%
8081
- cmd: conda list -n pandas
8182
- cmd: echo "installing requirements from %REQ% - done"
8283

84+
# add some pip only reqs to the env
85+
- SET REQ=ci\requirements-%PYTHON_VERSION%_WIN.pip
86+
- cmd: echo "installing requirements from %REQ%"
87+
- cmd: pip install -Ur %REQ%
88+
8389
# build em using the local source checkout in the correct windows env
8490
- cmd: '%CMD_IN_ENV% python setup.py build_ext --inplace'
8591

asv_bench/benchmarks/categoricals.py

+3
@@ -67,6 +67,9 @@ def time_value_counts_dropna(self):
6767
def time_rendering(self):
6868
str(self.sel)
6969

70+
def time_set_categories(self):
71+
self.ts.cat.set_categories(self.ts.cat.categories[::2])
72+
7073

7174
class Categoricals3(object):
7275
goal_time = 0.2

asv_bench/benchmarks/period.py

+59
@@ -78,6 +78,65 @@ def time_value_counts_pindex(self):
7878
self.i.value_counts()
7979

8080

81+
class Properties(object):
82+
def setup(self):
83+
self.per = Period('2017-09-06 08:28', freq='min')
84+
85+
def time_year(self):
86+
self.per.year
87+
88+
def time_month(self):
89+
self.per.month
90+
91+
def time_day(self):
92+
self.per.day
93+
94+
def time_hour(self):
95+
self.per.hour
96+
97+
def time_minute(self):
98+
self.per.minute
99+
100+
def time_second(self):
101+
self.per.second
102+
103+
def time_is_leap_year(self):
104+
self.per.is_leap_year
105+
106+
def time_quarter(self):
107+
self.per.quarter
108+
109+
def time_qyear(self):
110+
self.per.qyear
111+
112+
def time_week(self):
113+
self.per.week
114+
115+
def time_daysinmonth(self):
116+
self.per.daysinmonth
117+
118+
def time_dayofweek(self):
119+
self.per.dayofweek
120+
121+
def time_dayofyear(self):
122+
self.per.dayofyear
123+
124+
def time_start_time(self):
125+
self.per.start_time
126+
127+
def time_end_time(self):
128+
self.per.end_time
129+
130+
def time_to_timestamp():
131+
self.per.to_timestamp()
132+
133+
def time_now():
134+
self.per.now()
135+
136+
def time_asfreq():
137+
self.per.asfreq('A')
138+
139+
81140
class period_standard_indexing(object):
82141
goal_time = 0.2
83142

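asv instantiates a benchmark class and calls each `time_*` method on that instance, so a method that reads `self.per` has to accept `self`. The last three methods added to `Properties` above (`time_to_timestamp`, `time_now`, `time_asfreq`) are declared without it. The stdlib-only sketch below uses a hypothetical stand-in class (`StubBenchmark` is an illustration, not pandas or asv code) to show what Python does with such a definition when it is called as a bound method:

```python
class StubBenchmark:
    """Hypothetical stand-in for an asv benchmark class."""

    def setup(self):
        # Mirrors Properties.setup: store the object under test on the instance.
        self.value = 42

    def time_with_self(self):
        # Correct: the instance is received as the first argument.
        return self.value

    def time_without_self():
        # Declared like time_to_timestamp() in the diff: no parameters at all.
        return self.value


bench = StubBenchmark()
bench.setup()

ok = bench.time_with_self()

try:
    bench.time_without_self()
    failure = None
except TypeError as exc:
    # Python passes the instance implicitly, so the call has one argument too many.
    failure = exc

print(ok)
print(type(failure).__name__)
```

Assuming asv calls the methods in the way sketched here, the three parameterless benchmarks would fail at call time rather than measure anything until `self` is added to their signatures.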
ci/install_circle.sh

+1
@@ -67,6 +67,7 @@ time conda create -n pandas -q --file=${REQ_BUILD} || exit 1
6767
time conda install -n pandas pytest>=3.1.0 || exit 1
6868

6969
source activate pandas
70+
time pip install moto || exit 1
7071

7172
# build but don't install
7273
echo "[build em]"

ci/install_travis.sh

+1-1
@@ -104,7 +104,7 @@ if [ -e ${REQ} ]; then
104104
fi
105105

106106
time conda install -n pandas pytest>=3.1.0
107-
time pip install pytest-xdist
107+
time pip install pytest-xdist moto
108108

109109
if [ "$LINT" ]; then
110110
conda install flake8

ci/requirements-2.7_WIN.pip

Whitespace-only changes.

ci/requirements-3.6_NUMPY_DEV.pip

Whitespace-only changes.

ci/requirements-3.6_WIN.pip

Whitespace-only changes.

ci/requirements_dev.txt

+1
@@ -5,3 +5,4 @@ cython
55
pytest>=3.1.0
66
pytest-cov
77
flake8
8+
moto

doc/source/advanced.rst

+5-5
@@ -625,7 +625,7 @@ Index Types
625625
We have discussed ``MultiIndex`` in the previous sections pretty extensively. ``DatetimeIndex`` and ``PeriodIndex``
626626
are shown :ref:`here <timeseries.overview>`. ``TimedeltaIndex`` are :ref:`here <timedeltas.timedeltas>`.
627627

628-
In the following sub-sections we will highlite some other index types.
628+
In the following sub-sections we will highlight some other index types.
629629

630630
.. _indexing.categoricalindex:
631631

@@ -645,7 +645,7 @@ and allows efficient indexing and storage of an index with a large number of dup
645645
df.dtypes
646646
df.B.cat.categories
647647
648-
Setting the index, will create create a ``CategoricalIndex``
648+
Setting the index, will create a ``CategoricalIndex``
649649

650650
.. ipython:: python
651651
@@ -681,7 +681,7 @@ Groupby operations on the index will preserve the index nature as well
681681
Reindexing operations, will return a resulting index based on the type of the passed
682682
indexer, meaning that passing a list will return a plain-old-``Index``; indexing with
683683
a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
684-
of the PASSED ``Categorical`` dtype. This allows one to arbitrarly index these even with
684+
of the PASSED ``Categorical`` dtype. This allows one to arbitrarily index these even with
685685
values NOT in the categories, similarly to how you can reindex ANY pandas index.
686686

687687
.. ipython :: python
@@ -722,7 +722,7 @@ Int64Index and RangeIndex
722722
Prior to 0.18.0, the ``Int64Index`` would provide the default index for all ``NDFrame`` objects.
723723
724724
``RangeIndex`` is a sub-class of ``Int64Index`` added in version 0.18.0, now providing the default index for all ``NDFrame`` objects.
725-
``RangeIndex`` is an optimized version of ``Int64Index`` that can represent a monotonic ordered set. These are analagous to python `range types <https://docs.python.org/3/library/stdtypes.html#typesseq-range>`__.
725+
``RangeIndex`` is an optimized version of ``Int64Index`` that can represent a monotonic ordered set. These are analogous to python `range types <https://docs.python.org/3/library/stdtypes.html#typesseq-range>`__.
726726
727727
.. _indexing.float64index:
728728
@@ -963,7 +963,7 @@ index can be somewhat complicated. For example, the following does not work:
963963
s.loc['c':'e'+1]
964964
965965
A very common use case is to limit a time series to start and end at two
966-
specific dates. To enable this, we made the design design to make label-based
966+
specific dates. To enable this, we made the design to make label-based
967967
slicing include both endpoints:
968968
969969
.. ipython:: python

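The last hunk above touches the paragraph stating that label-based slicing includes both endpoints. A minimal sketch of that behaviour, using a made-up six-element Series (the data and variable names are assumptions for illustration):

```python
import pandas as pd

s = pd.Series(range(6), index=list("abcdef"))

# Label-based slice: both 'c' and 'e' are included.
by_label = s.loc["c":"e"]

# Positional slice for comparison: the stop position is excluded,
# as with ordinary Python slicing.
by_position = s.iloc[2:4]

print(by_label.index.tolist())
print(by_position.index.tolist())
```

The contrast with `.iloc` is the reason the documentation spells this out: the same `start:stop` syntax is inclusive with labels and half-open with positions.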
doc/source/api.rst

+11-1
@@ -218,10 +218,19 @@ Top-level dealing with datetimelike
218218
to_timedelta
219219
date_range
220220
bdate_range
221+
cdate_range
221222
period_range
222223
timedelta_range
223224
infer_freq
224225

226+
Top-level dealing with intervals
227+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
228+
229+
.. autosummary::
230+
:toctree: generated/
231+
232+
interval_range
233+
225234
Top-level evaluation
226235
~~~~~~~~~~~~~~~~~~~~
227236

@@ -1282,7 +1291,7 @@ Index
12821291
-----
12831292

12841293
**Many of these methods or variants thereof are available on the objects
1285-
that contain an index (Series/Dataframe) and those should most likely be
1294+
that contain an index (Series/DataFrame) and those should most likely be
12861295
used before calling these methods directly.**
12871296

12881297
.. autosummary::
@@ -2062,6 +2071,7 @@ Style Application
20622071

20632072
Styler.apply
20642073
Styler.applymap
2074+
Styler.where
20652075
Styler.format
20662076
Styler.set_precision
20672077
Styler.set_table_styles

doc/source/basics.rst

+1-1
@@ -923,7 +923,7 @@ Passing a named function will yield that name for the row:
923923
Aggregating with a dict
924924
+++++++++++++++++++++++
925925

926-
Passing a dictionary of column names to a scalar or a list of scalars, to ``DataFame.agg``
926+
Passing a dictionary of column names to a scalar or a list of scalars, to ``DataFrame.agg``
927927
allows you to customize which functions are applied to which columns. Note that the results
928928
are not in any particular order, you can use an ``OrderedDict`` instead to guarantee ordering.
929929

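As a rough illustration of the dict form of `DataFrame.agg` described in the corrected sentence, the sketch below maps column names to either one function name or a list of them. The frame contents are invented for the example:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# One function per column: the result is a Series indexed by column name.
per_column = df.agg({"A": "mean", "B": "sum"})

# Lists of functions: the result is a DataFrame with one row per function,
# and NaN where a function was not requested for a column.
multi = df.agg({"A": ["min", "max"], "B": ["sum"]})

print(per_column)
print(multi)
```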
doc/source/computation.rst

+1-1
@@ -654,7 +654,7 @@ aggregation with, outputting a DataFrame:
654654
655655
r['A'].agg([np.sum, np.mean, np.std])
656656
657-
On a widowed DataFrame, you can pass a list of functions to apply to each
657+
On a windowed DataFrame, you can pass a list of functions to apply to each
658658
column, which produces an aggregated result with a hierarchical index:
659659

660660
.. ipython:: python

doc/source/groupby.rst

+2-2
@@ -561,7 +561,7 @@ must be either implemented on GroupBy or available via :ref:`dispatching
561561
562562
.. note::
563563

564-
If you pass a dict to ``aggregate``, the ordering of the output colums is
564+
If you pass a dict to ``aggregate``, the ordering of the output columns is
565565
non-deterministic. If you want to be sure the output columns will be in a specific
566566
order, you can use an ``OrderedDict``. Compare the output of the following two commands:
567567

@@ -1211,7 +1211,7 @@ Groupby by Indexer to 'resample' data
12111211

12121212
Resampling produces new hypothetical samples (resamples) from already existing observed data or from a model that generates data. These new samples are similar to the pre-existing samples.
12131213

1214-
In order to resample to work on indices that are non-datetimelike , the following procedure can be utilized.
1214+
In order to resample to work on indices that are non-datetimelike, the following procedure can be utilized.
12151215

12161216
In the following examples, **df.index // 5** returns a binary array which is used to determine what gets selected for the groupby operation.
12171217

doc/source/indexing.rst

+1-1
@@ -714,7 +714,7 @@ Finally, one can also set a seed for ``sample``'s random number generator using
714714
Setting With Enlargement
715715
------------------------
716716

717-
The ``.loc/[]`` operations can perform enlargement when setting a non-existant key for that axis.
717+
The ``.loc/[]`` operations can perform enlargement when setting a non-existent key for that axis.
718718

719719
In the ``Series`` case this is effectively an appending operation
720720

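A small sketch of setting with enlargement on a Series, assuming a default integer index (the values are arbitrary):

```python
import pandas as pd

se = pd.Series([1, 2, 3])

# Label 5 does not exist yet, so assigning through .loc appends a new entry
# instead of raising a KeyError.
se.loc[5] = 5

print(se.index.tolist())
```

Reading a missing label with `se.loc[5]` before the assignment would raise; only the setting path enlarges the object.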
doc/source/io.rst

+2-3
@@ -3077,7 +3077,7 @@ Compressed pickle files
30773077

30783078
.. versionadded:: 0.20.0
30793079

3080-
:func:`read_pickle`, :meth:`DataFame.to_pickle` and :meth:`Series.to_pickle` can read
3080+
:func:`read_pickle`, :meth:`DataFrame.to_pickle` and :meth:`Series.to_pickle` can read
30813081
and write compressed pickle files. The compression types of ``gzip``, ``bz2``, ``xz`` are supported for reading and writing.
30823082
`zip`` file supports read only and must contain only one data file
30833083
to be read in.
@@ -4515,8 +4515,7 @@ See the documentation for `pyarrow <http://arrow.apache.org/docs/python/>`__ and
45154515
'd': np.arange(4.0, 7.0, dtype='float64'),
45164516
'e': [True, False, True],
45174517
'f': pd.date_range('20130101', periods=3),
4518-
'g': pd.date_range('20130101', periods=3, tz='US/Eastern'),
4519-
'h': pd.date_range('20130101', periods=3, freq='ns')})
4518+
'g': pd.date_range('20130101', periods=3, tz='US/Eastern')})
45204519
45214520
df
45224521
df.dtypes

doc/source/merging.rst

+3-3
@@ -1329,7 +1329,7 @@ By default we are taking the asof of the quotes.
13291329
on='time',
13301330
by='ticker')
13311331
1332-
We only asof within ``2ms`` betwen the quote time and the trade time.
1332+
We only asof within ``2ms`` between the quote time and the trade time.
13331333

13341334
.. ipython:: python
13351335
@@ -1338,8 +1338,8 @@ We only asof within ``2ms`` betwen the quote time and the trade time.
13381338
by='ticker',
13391339
tolerance=pd.Timedelta('2ms'))
13401340
1341-
We only asof within ``10ms`` betwen the quote time and the trade time and we exclude exact matches on time.
1342-
Note that though we exclude the exact matches (of the quotes), prior quotes DO propogate to that point
1341+
We only asof within ``10ms`` between the quote time and the trade time and we exclude exact matches on time.
1342+
Note that though we exclude the exact matches (of the quotes), prior quotes DO propagate to that point
13431343
in time.
13441344

13451345
.. ipython:: python

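The two corrected sentences describe `tolerance` and the exclusion of exact matches in `merge_asof`. The sketch below uses a tiny, made-up quotes/trades pair (not the data from the pandas documentation) to show both effects together:

```python
import pandas as pd

base = pd.Timestamp("2016-05-25 13:30:00")

# Hypothetical quotes at +20, +23, +30 and +41 ms.
quotes = pd.DataFrame({
    "time": base + pd.to_timedelta([20, 23, 30, 41], unit="ms"),
    "ticker": "MSFT",
    "bid": [51.90, 51.95, 51.97, 51.99],
})

# Hypothetical trades at +23, +38 and +60 ms.
trades = pd.DataFrame({
    "time": base + pd.to_timedelta([23, 38, 60], unit="ms"),
    "ticker": "MSFT",
    "price": [51.95, 51.95, 52.00],
})

matched = pd.merge_asof(
    trades, quotes,
    on="time", by="ticker",
    tolerance=pd.Timedelta("10ms"),
    allow_exact_matches=False,
)

print(matched[["time", "price", "bid"]])
```

In this sketch the +23 ms trade skips the quote with the identical timestamp and picks up the earlier +20 ms quote, which is the propagation the note refers to. The +60 ms trade gets no bid because the most recent quote is 19 ms old, outside the 10 ms tolerance.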
doc/source/missing_data.rst

+1-1
@@ -320,7 +320,7 @@ Interpolation
320320

321321
The ``limit_direction`` keyword argument was added.
322322

323-
Both Series and Dataframe objects have an ``interpolate`` method that, by default,
323+
Both Series and DataFrame objects have an ``interpolate`` method that, by default,
324324
performs linear interpolation at missing datapoints.
325325

326326
.. ipython:: python

doc/source/options.rst

+2-2
@@ -313,9 +313,9 @@ display.large_repr truncate For DataFrames exceeding max_ro
313313
display.latex.repr False Whether to produce a latex DataFrame
314314
representation for jupyter frontends
315315
that support it.
316-
display.latex.escape True Escapes special caracters in Dataframes, when
316+
display.latex.escape True Escapes special characters in DataFrames, when
317317
using the to_latex method.
318-
display.latex.longtable False Specifies if the to_latex method of a Dataframe
318+
display.latex.longtable False Specifies if the to_latex method of a DataFrame
319319
uses the longtable format.
320320
display.latex.multicolumn True Combines columns when using a MultiIndex
321321
display.latex.multicolumn_format 'l' Alignment of multicolumn labels

doc/source/reshaping.rst

+1-1
@@ -156,7 +156,7 @@ the level numbers:
156156
stacked.unstack('second')
157157
158158
Notice that the ``stack`` and ``unstack`` methods implicitly sort the index
159-
levels involved. Hence a call to ``stack`` and then ``unstack``, or viceversa,
159+
levels involved. Hence a call to ``stack`` and then ``unstack``, or vice versa,
160160
will result in a **sorted** copy of the original DataFrame or Series:
161161

162162
.. ipython:: python

doc/source/sparse.rst

+1-1
@@ -132,7 +132,7 @@ dtype, ``fill_value`` default changes:
132132
s.to_sparse()
133133
134134
You can change the dtype using ``.astype()``, the result is also sparse. Note that
135-
``.astype()`` also affects to the ``fill_value`` to keep its dense represantation.
135+
``.astype()`` also affects to the ``fill_value`` to keep its dense representation.
136136

137137

138138
.. ipython:: python

doc/source/style.ipynb

+1-1
@@ -169,7 +169,7 @@
169169
"cell_type": "markdown",
170170
"metadata": {},
171171
"source": [
172-
"Notice the similarity with the standard `df.applymap`, which operates on DataFrames elementwise. We want you to be able to resuse your existing knowledge of how to interact with DataFrames.\n",
172+
"Notice the similarity with the standard `df.applymap`, which operates on DataFrames elementwise. We want you to be able to reuse your existing knowledge of how to interact with DataFrames.\n",
173173
"\n",
174174
"Notice also that our function returned a string containing the CSS attribute and value, separated by a colon just like in a `<style>` tag. This will be a common theme.\n",
175175
"\n",

doc/source/timeseries.rst

+18-9
@@ -1054,7 +1054,7 @@ as ``BusinessHour`` except that it skips specified custom holidays.
10541054
# Tuesday after MLK Day (Monday is skipped because it's a holiday)
10551055
dt + bhour_us * 2
10561056
1057-
You can use keyword arguments suported by either ``BusinessHour`` and ``CustomBusinessDay``.
1057+
You can use keyword arguments supported by either ``BusinessHour`` and ``CustomBusinessDay``.
10581058

10591059
.. ipython:: python
10601060
@@ -1088,7 +1088,7 @@ frequencies. We will refer to these aliases as *offset aliases*.
10881088
"BMS", "business month start frequency"
10891089
"CBMS", "custom business month start frequency"
10901090
"Q", "quarter end frequency"
1091-
"BQ", "business quarter endfrequency"
1091+
"BQ", "business quarter end frequency"
10921092
"QS", "quarter start frequency"
10931093
"BQS", "business quarter start frequency"
10941094
"A, Y", "year end frequency"
@@ -1132,13 +1132,13 @@ For some frequencies you can specify an anchoring suffix:
11321132
:header: "Alias", "Description"
11331133
:widths: 15, 100
11341134

1135-
"W\-SUN", "weekly frequency (sundays). Same as 'W'"
1136-
"W\-MON", "weekly frequency (mondays)"
1137-
"W\-TUE", "weekly frequency (tuesdays)"
1138-
"W\-WED", "weekly frequency (wednesdays)"
1139-
"W\-THU", "weekly frequency (thursdays)"
1140-
"W\-FRI", "weekly frequency (fridays)"
1141-
"W\-SAT", "weekly frequency (saturdays)"
1135+
"W\-SUN", "weekly frequency (Sundays). Same as 'W'"
1136+
"W\-MON", "weekly frequency (Mondays)"
1137+
"W\-TUE", "weekly frequency (Tuesdays)"
1138+
"W\-WED", "weekly frequency (Wednesdays)"
1139+
"W\-THU", "weekly frequency (Thursdays)"
1140+
"W\-FRI", "weekly frequency (Fridays)"
1141+
"W\-SAT", "weekly frequency (Saturdays)"
11421142
"(B)Q(S)\-DEC", "quarterly frequency, year ends in December. Same as 'Q'"
11431143
"(B)Q(S)\-JAN", "quarterly frequency, year ends in January"
11441144
"(B)Q(S)\-FEB", "quarterly frequency, year ends in February"
@@ -1705,6 +1705,15 @@ has multiplied span.
17051705
17061706
pd.PeriodIndex(start='2014-01', freq='3M', periods=4)
17071707
1708+
If ``start`` or ``end`` are ``Period`` objects, they will be used as anchor
1709+
endpoints for a ``PeriodIndex`` with frequency matching that of the
1710+
``PeriodIndex`` constructor.
1711+
1712+
.. ipython:: python
1713+
1714+
pd.PeriodIndex(start=pd.Period('2017Q1', freq='Q'),
1715+
end=pd.Period('2017Q2', freq='Q'), freq='M')
1716+
17081717
Just like ``DatetimeIndex``, a ``PeriodIndex`` can also be used to index pandas
17091718
objects:
17101719

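The paragraph added in this hunk says that `Period` endpoints act as anchors while the resulting index takes the frequency passed to the constructor. The sketch below expresses the same idea with `pd.period_range`, on the assumption that the reader is on a later pandas release where range-style construction goes through that function rather than `PeriodIndex(start=..., end=...)`:

```python
import pandas as pd

# Quarterly anchors, monthly result: the index runs from the anchor month of
# 2017Q1 to the anchor month of 2017Q2 at the requested 'M' frequency.
idx = pd.period_range(
    start=pd.Period("2017Q1", freq="Q"),
    end=pd.Period("2017Q2", freq="Q"),
    freq="M",
)

months = [str(p) for p in idx]
print(months)
```

Each quarterly `Period` is anchored at its end month, so the monthly index spans 2017-03 through 2017-06 rather than starting in January.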
doc/source/visualization.rst

+1-1
@@ -261,7 +261,7 @@ Histogram can be stacked by ``stacked=True``. Bin size can be changed by ``bins`
261261
262262
plt.close('all')
263263
264-
You can pass other keywords supported by matplotlib ``hist``. For example, horizontal and cumulative histgram can be drawn by ``orientation='horizontal'`` and ``cumulative='True'``.
264+
You can pass other keywords supported by matplotlib ``hist``. For example, horizontal and cumulative histogram can be drawn by ``orientation='horizontal'`` and ``cumulative=True``.
265265

266266
.. ipython:: python
267267
