Commit a1127f6

Merge branch 'master' into excel_style
2 parents 306eebe + d60f490 commit a1127f6

63 files changed (+5486 -3550 lines changed)

.travis.yml

+1-1
@@ -7,7 +7,7 @@ python: 3.5
 # set NOCACHE-true
 # To delete caches go to https://travis-ci.org/OWNER/REPOSITORY/caches or run
 # travis cache --delete inside the project directory from the travis command line client
-# The cash directories will be deleted if anything in ci/ changes in a commit
+# The cache directories will be deleted if anything in ci/ changes in a commit
 cache:
   ccache: true
   directories:

asv_bench/benchmarks/timeseries.py

+4-1
@@ -292,7 +292,10 @@ def setup(self):
         self.rng3 = date_range(start='1/1/2000', periods=1500000, freq='S')
         self.ts3 = Series(1, index=self.rng3)
 
-    def time_sort_index(self):
+    def time_sort_index_monotonic(self):
+        self.ts2.sort_index()
+
+    def time_sort_index_non_monotonic(self):
         self.ts.sort_index()
 
     def time_timeseries_slice_minutely(self):

ci/requirements-3.5_DOC.run

+1
@@ -18,4 +18,5 @@ sqlalchemy
 numexpr
 bottleneck
 statsmodels
+xarray
 pyqt=4.11.4

doc/source/advanced.rst

+35-30
@@ -46,7 +46,7 @@ data with an arbitrary number of dimensions in lower dimensional data
 structures like Series (1d) and DataFrame (2d).
 
 In this section, we will show what exactly we mean by "hierarchical" indexing
-and how it integrates with the all of the pandas indexing functionality
+and how it integrates with all of the pandas indexing functionality
 described above and in prior sections. Later, when discussing :ref:`group by
 <groupby>` and :ref:`pivoting and reshaping data <reshaping>`, we'll show
 non-trivial applications to illustrate how it aids in structuring data for
@@ -136,7 +136,7 @@ can find yourself working with hierarchically-indexed data without creating a
 may wish to generate your own ``MultiIndex`` when preparing the data set.
 
 Note that how the index is displayed by be controlled using the
-``multi_sparse`` option in ``pandas.set_printoptions``:
+``multi_sparse`` option in ``pandas.set_options()``:
 
 .. ipython:: python
 
@@ -175,35 +175,40 @@ completely analogous way to selecting a column in a regular DataFrame:
 See :ref:`Cross-section with hierarchical index <advanced.xs>` for how to select
 on a deeper level.
 
-.. note::
+.. _advanced.shown_levels:
+
+Defined Levels
+~~~~~~~~~~~~~~
+
+The repr of a ``MultiIndex`` shows ALL the defined levels of an index, even
+if the they are not actually used. When slicing an index, you may notice this.
+For example:
 
-   The repr of a ``MultiIndex`` shows ALL the defined levels of an index, even
-   if the they are not actually used. When slicing an index, you may notice this.
-   For example:
+.. ipython:: python
 
-   .. ipython:: python
+   # original multi-index
+   df.columns
 
-      # original multi-index
-      df.columns
+   # sliced
+   df[['foo','qux']].columns
 
-      # sliced
-      df[['foo','qux']].columns
+This is done to avoid a recomputation of the levels in order to make slicing
+highly performant. If you want to see the actual used levels.
 
-   This is done to avoid a recomputation of the levels in order to make slicing
-   highly performant. If you want to see the actual used levels.
+.. ipython:: python
 
-   .. ipython:: python
+   df[['foo','qux']].columns.values
 
-      df[['foo','qux']].columns.values
+   # for a specific level
+   df[['foo','qux']].columns.get_level_values(0)
 
-      # for a specific level
-      df[['foo','qux']].columns.get_level_values(0)
+To reconstruct the multiindex with only the used levels
 
-   To reconstruct the multiindex with only the used levels
+.. versionadded:: 0.20.0
 
-   .. ipython:: python
+.. ipython:: python
 
-      pd.MultiIndex.from_tuples(df[['foo','qux']].columns.values)
+   df[['foo','qux']].columns.remove_unused_levels()
 
 Data alignment and using ``reindex``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -288,7 +293,7 @@ As usual, **both sides** of the slicers are included as this is label indexing.
 
 .. code-block:: python
 
-   df.loc[(slice('A1','A3'),.....),:]
+   df.loc[(slice('A1','A3'),.....), :]
 
 rather than this:
 
@@ -317,51 +322,51 @@ Basic multi-index slicing using slices, lists, and labels.
 
 .. ipython:: python
 
-   dfmi.loc[(slice('A1','A3'),slice(None), ['C1','C3']),:]
+   dfmi.loc[(slice('A1','A3'), slice(None), ['C1', 'C3']), :]
 
 You can use a ``pd.IndexSlice`` to have a more natural syntax using ``:`` rather than using ``slice(None)``
 
 .. ipython:: python
 
    idx = pd.IndexSlice
-   dfmi.loc[idx[:,:,['C1','C3']],idx[:,'foo']]
+   dfmi.loc[idx[:, :, ['C1', 'C3']], idx[:, 'foo']]
 
 It is possible to perform quite complicated selections using this method on multiple
 axes at the same time.
 
 .. ipython:: python
 
-   dfmi.loc['A1',(slice(None),'foo')]
-   dfmi.loc[idx[:,:,['C1','C3']],idx[:,'foo']]
+   dfmi.loc['A1', (slice(None), 'foo')]
+   dfmi.loc[idx[:, :, ['C1', 'C3']], idx[:, 'foo']]
 
 Using a boolean indexer you can provide selection related to the *values*.
 
 .. ipython:: python
 
-   mask = dfmi[('a','foo')]>200
-   dfmi.loc[idx[mask,:,['C1','C3']],idx[:,'foo']]
+   mask = dfmi[('a', 'foo')] > 200
+   dfmi.loc[idx[mask, :, ['C1', 'C3']], idx[:, 'foo']]
 
 You can also specify the ``axis`` argument to ``.loc`` to interpret the passed
 slicers on a single axis.
 
 .. ipython:: python
 
-   dfmi.loc(axis=0)[:,:,['C1','C3']]
+   dfmi.loc(axis=0)[:, :, ['C1', 'C3']]
 
 Furthermore you can *set* the values using these methods
 
 .. ipython:: python
 
    df2 = dfmi.copy()
-   df2.loc(axis=0)[:,:,['C1','C3']] = -10
+   df2.loc(axis=0)[:, :, ['C1', 'C3']] = -10
    df2
 
 You can use a right-hand-side of an alignable object as well.
 
 .. ipython:: python
 
    df2 = dfmi.copy()
-   df2.loc[idx[:,:,['C1','C3']],:] = df2*1000
+   df2.loc[idx[:, :, ['C1', 'C3']], :] = df2 * 1000
    df2
 
 .. _advanced.xs:
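
Reviewer's sketch (not part of the commit): the "Defined Levels" text in this file says a sliced ``MultiIndex`` keeps every originally defined level, and that ``remove_unused_levels()`` rebuilds it with only the levels in use. The snippet below illustrates that under assumptions: the frame ``df`` and its column labels are made up for illustration (they only mimic the docs' ``foo``/``qux`` example), and a pandas version that provides ``MultiIndex.remove_unused_levels`` (0.20.0 or later, per the ``versionadded`` note) is assumed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the docs' `df`: 4 outer labels x 2 inner labels.
columns = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]]
)
df = pd.DataFrame(np.arange(24).reshape(3, 8), columns=columns)

# Slicing keeps the full set of defined levels on the resulting index.
sliced = df[["foo", "qux"]].columns
full_levels = list(sliced.levels[0])

# remove_unused_levels() returns an equal index whose levels are pruned.
pruned = sliced.remove_unused_levels()
pruned_levels = list(pruned.levels[0])

print(full_levels)
print(pruned_levels)
```

The pruned index compares equal to the sliced one; only the stored ``levels`` differ, which is why the docs describe this as a reconstruction rather than a change to the data.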

doc/source/api.rst

+1
@@ -1432,6 +1432,7 @@ MultiIndex Components
    MultiIndex.droplevel
    MultiIndex.swaplevel
    MultiIndex.reorder_levels
+   MultiIndex.remove_unused_levels
 
 .. _api.datetimeindex:
 

doc/source/computation.rst

+16-6
@@ -505,13 +505,18 @@ two ``Series`` or any combination of ``DataFrame/Series`` or
 - ``DataFrame/DataFrame``: by default compute the statistic for matching column
   names, returning a DataFrame. If the keyword argument ``pairwise=True`` is
   passed then computes the statistic for each pair of columns, returning a
-  ``Panel`` whose ``items`` are the dates in question (see :ref:`the next section
+  ``MultiIndexed DataFrame`` whose ``index`` are the dates in question (see :ref:`the next section
   <stats.moments.corr_pairwise>`).
 
 For example:
 
 .. ipython:: python
 
+   df = pd.DataFrame(np.random.randn(1000, 4),
+                     index=pd.date_range('1/1/2000', periods=1000),
+                     columns=['A', 'B', 'C', 'D'])
+   df = df.cumsum()
+
    df2 = df[:20]
    df2.rolling(window=5).corr(df2['B'])
 
@@ -520,11 +525,16 @@ For example:
 Computing rolling pairwise covariances and correlations
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+.. warning::
+
+   Prior to version 0.20.0 if ``pairwise=True`` was passed, a ``Panel`` would be returned.
+   This will now return a 2-level MultiIndexed DataFrame, see the whatsnew :ref:`here <whatsnew_0200.api_breaking.rolling_pairwise>`
+
 In financial data analysis and other fields it's common to compute covariance
 and correlation matrices for a collection of time series. Often one is also
 interested in moving-window covariance and correlation matrices. This can be
 done by passing the ``pairwise`` keyword argument, which in the case of
-``DataFrame`` inputs will yield a ``Panel`` whose ``items`` are the dates in
+``DataFrame`` inputs will yield a MultiIndexed ``DataFrame`` whose ``index`` are the dates in
 question. In the case of a single DataFrame argument the ``pairwise`` argument
 can even be omitted:
 
@@ -539,15 +549,15 @@ can even be omitted:
 .. ipython:: python
 
    covs = df[['B','C','D']].rolling(window=50).cov(df[['A','B','C']], pairwise=True)
-   covs[df.index[-50]]
+   covs.loc['2002-09-22':]
 
 .. ipython:: python
 
    correls = df.rolling(window=50).corr()
-   correls[df.index[-50]]
+   correls.loc['2002-09-22':]
 
 You can efficiently retrieve the time series of correlations between two
-columns using ``.loc`` indexing:
+columns by reshaping and indexing:
 
 .. ipython:: python
    :suppress:
@@ -557,7 +567,7 @@ columns using ``.loc`` indexing:
 .. ipython:: python
 
    @savefig rolling_corr_pairwise_ex.png
-   correls.loc[:, 'A', 'C'].plot()
+   correls.unstack(1)[('A', 'C')].plot()
 
 .. _stats.aggregate:
 
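
Reviewer's sketch (not part of the commit): the hunks above replace the ``Panel`` result of pairwise rolling statistics with a 2-level MultiIndexed ``DataFrame`` and switch the lookup to ``unstack(1)``. The snippet below shows that shape on a small made-up frame; the seed, the window of 20, and the three column names are assumptions chosen for illustration, and pandas 0.20.0 or later is assumed.

```python
import numpy as np
import pandas as pd

# Small random-walk frame; sizes and names are illustrative only.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((100, 3)),
    index=pd.date_range("2000-01-01", periods=100),
    columns=["A", "B", "C"],
).cumsum()

# Pairwise rolling correlation: rows are indexed by (date, column name).
correls = df.rolling(window=20).corr()

# Move the inner row level into the columns, then pick one column pair
# to get a plain time series of the A/C correlation.
a_vs_c = correls.unstack(1)[("A", "C")]

print(correls.index.nlevels)
print(a_vs_c.tail())
```

Each date contributes one row per column, so the result has ``len(df) * 3`` rows here; the first ``window - 1`` dates hold ``NaN`` because the window is not yet full.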

doc/source/dsintro.rst

+55
@@ -763,6 +763,11 @@ completion mechanism so they can be tab-completed:
 Panel
 -----
 
+.. warning::
+
+   In 0.20.0, ``Panel`` is deprecated and will be removed in
+   a future version. See the section :ref:`Deprecate Panel <dsintro.deprecate_panel>`.
+
 Panel is a somewhat less-used, but still important container for 3-dimensional
 data. The term `panel data <http://en.wikipedia.org/wiki/Panel_data>`__ is
 derived from econometrics and is partially responsible for the name pandas:
@@ -783,6 +788,7 @@ From 3D ndarray with optional axis labels
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. ipython:: python
+   :okwarning:
 
    wp = pd.Panel(np.random.randn(2, 5, 4), items=['Item1', 'Item2'],
                  major_axis=pd.date_range('1/1/2000', periods=5),
@@ -794,6 +800,7 @@ From dict of DataFrame objects
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. ipython:: python
+   :okwarning:
 
    data = {'Item1' : pd.DataFrame(np.random.randn(4, 3)),
            'Item2' : pd.DataFrame(np.random.randn(4, 2))}
@@ -816,6 +823,7 @@ dictionary of DataFrames as above, and the following named parameters:
 For example, compare to the construction above:
 
 .. ipython:: python
+   :okwarning:
 
    pd.Panel.from_dict(data, orient='minor')
 
@@ -824,6 +832,7 @@ DataFrame objects with mixed-type columns, all of the data will get upcasted to
 ``dtype=object`` unless you pass ``orient='minor'``:
 
 .. ipython:: python
+   :okwarning:
 
    df = pd.DataFrame({'a': ['foo', 'bar', 'baz'],
                       'b': np.random.randn(3)})
@@ -851,6 +860,7 @@ This method was introduced in v0.7 to replace ``LongPanel.to_long``, and convert
 a DataFrame with a two-level index to a Panel.
 
 .. ipython:: python
+   :okwarning:
 
    midx = pd.MultiIndex(levels=[['one', 'two'], ['x','y']], labels=[[1,1,0,0],[1,0,1,0]])
    df = pd.DataFrame({'A' : [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=midx)
@@ -880,6 +890,7 @@ A Panel can be rearranged using its ``transpose`` method (which does not make a
 copy by default unless the data are heterogeneous):
 
 .. ipython:: python
+   :okwarning:
 
    wp.transpose(2, 0, 1)
 
@@ -909,6 +920,7 @@ Squeezing
 Another way to change the dimensionality of an object is to ``squeeze`` a 1-len object, similar to ``wp['Item1']``
 
 .. ipython:: python
+   :okwarning:
 
    wp.reindex(items=['Item1']).squeeze()
    wp.reindex(items=['Item1'], minor=['B']).squeeze()
@@ -923,12 +935,55 @@ for more on this. To convert a Panel to a DataFrame, use the ``to_frame``
 method:
 
 .. ipython:: python
+   :okwarning:
 
    panel = pd.Panel(np.random.randn(3, 5, 4), items=['one', 'two', 'three'],
                     major_axis=pd.date_range('1/1/2000', periods=5),
                     minor_axis=['a', 'b', 'c', 'd'])
    panel.to_frame()
 
+
+.. _dsintro.deprecate_panel:
+
+Deprecate Panel
+---------------
+
+Over the last few years, pandas has increased in both breadth and depth, with new features,
+datatype support, and manipulation routines. As a result, supporting efficient indexing and functional
+routines for ``Series``, ``DataFrame`` and ``Panel`` has contributed to an increasingly fragmented and
+difficult-to-understand codebase.
+
+The 3-D structure of a ``Panel`` is much less common for many types of data analysis,
+than the 1-D of the ``Series`` or the 2-D of the ``DataFrame``. Going forward it makes sense for
+pandas to focus on these areas exclusively.
+
+Oftentimes, one can simply use a MultiIndex ``DataFrame`` for easily working with higher dimensional data.
+
+In additon, the ``xarray`` package was built from the ground up, specifically in order to
+support the multi-dimensional analysis that is one of ``Panel`` s main usecases.
+`Here is a link to the xarray panel-transition documentation <http://xarray.pydata.org/en/stable/pandas.html#panel-transition>`__.
+
+.. ipython:: python
+   :okwarning:
+
+   p = tm.makePanel()
+   p
+
+Convert to a MultiIndex DataFrame
+
+.. ipython:: python
+   :okwarning:
+
+   p.to_frame()
+
+Alternatively, one can convert to an xarray ``DataArray``.
+
+.. ipython:: python
+
+   p.to_xarray()
+
+You can see the full-documentation for the `xarray package <http://xarray.pydata.org/en/stable/>`__.
+
 .. _dsintro.panelnd:
 .. _dsintro.panel4d:
 

doc/source/indexing.rst

+2-2
@@ -69,7 +69,7 @@ Different Choices for Indexing
 .. versionadded:: 0.11.0
 
 Object selection has had a number of user-requested additions in order to
-support more explicit location based indexing. pandas now supports three types
+support more explicit location based indexing. Pandas now supports three types
 of multi-axis indexing.
 
 - ``.loc`` is primarily label based, but may also be used with a boolean array. ``.loc`` will raise ``KeyError`` when the items are not found. Allowed inputs are:
@@ -401,7 +401,7 @@ Selection By Position
    This is sometimes called ``chained assignment`` and should be avoided.
    See :ref:`Returning a View versus Copy <indexing.view_versus_copy>`
 
-pandas provides a suite of methods in order to get **purely integer based indexing**. The semantics follow closely python and numpy slicing. These are ``0-based`` indexing. When slicing, the start bounds is *included*, while the upper bound is *excluded*. Trying to use a non-integer, even a **valid** label will raise a ``IndexError``.
+Pandas provides a suite of methods in order to get **purely integer based indexing**. The semantics follow closely python and numpy slicing. These are ``0-based`` indexing. When slicing, the start bounds is *included*, while the upper bound is *excluded*. Trying to use a non-integer, even a **valid** label will raise an ``IndexError``.
 
 The ``.iloc`` attribute is the primary access method. The following are valid inputs:
