Commit be8c77a

Merge commit 'v0.16.2-42-g383865f' into debian
* commit 'v0.16.2-42-g383865f': (72 commits)
  BUG: provide categorical concat always on axis 0, pandas-dev#10430 numpy 1.10 makes this an error for 1-d on axis != 0
  DOC: update missing.rst with ref to groupby.rst
  BUG: Timedeltas with no specified units (and frac) should raise, pandas-dev#10426
  BUG: using .loc[:,column] fails when the object is a multi-index, pandas-dev#10408
  Removed scikit-timeseries migration docs from FAQ
  BUG: GH10395 bug in DataFrame.interpolate with axis=1 and inplace=True
  BUG: GH10392 bug where Table.select_column does not preserve column name
  TST: Use unicode literals in string test
  PERF: fix _get_level_indexer to accept an intermediate indexer result
  PERF: bench for pandas-dev#10287
  BUG: drop_duplicates drops name(s).
  ENH: Enable ExcelWriter to construct in-memory sheets
  BLD: remove support for 3.2, pandas-dev#9118
  PERF: timedelta and datetime64 ops improvements
  PERF: parse timedelta strings in cython pandas-dev#6755
  closes bug in reset_index when index contains NaT
  Check for size=0 before setting item Fixes pandas-dev#10193
  closes bug in apply when function returns categorical
  BUG: frequencies.get_freq_code raises an error against offset with n != 1
  CI: run doc-tests always
  ...
2 parents 2b157b7 + 383865f commit be8c77a

116 files changed, +3143 -1370 lines changed

.gitignore

+2
@@ -41,6 +41,8 @@ doc/_build
 dist
 # Egg metadata
 *.egg-info
+.eggs
+
 # tox testing tool
 .tox
 # rope

.travis.yml

+1 -14
@@ -86,13 +86,6 @@ matrix:
       - CLIPBOARD=xsel
       - BUILD_TYPE=conda
       - JOB_NAME: "34_slow"
-    - python: 3.2
-      env:
-      - NOSE_ARGS="not slow and not network and not disabled"
-      - FULL_DEPS=true
-      - CLIPBOARD_GUI=qt4
-      - BUILD_TYPE=pydata
-      - JOB_NAME: "32_nslow"
     - python: 2.7
       env:
       - EXPERIMENTAL=true
@@ -103,13 +96,6 @@ matrix:
       - BUILD_TYPE=pydata
       - PANDAS_TESTING_MODE="deprecate"
     allow_failures:
-    - python: 3.2
-      env:
-      - NOSE_ARGS="not slow and not network and not disabled"
-      - FULL_DEPS=true
-      - CLIPBOARD_GUI=qt4
-      - BUILD_TYPE=pydata
-      - JOB_NAME: "32_nslow"
     - python: 2.7
       env:
       - NOSE_ARGS="slow and not network and not disabled"
@@ -180,6 +166,7 @@ before_script:

 script:
   - echo "script"
+  - ci/run_build_docs.sh &
   - ci/script.sh
 # nothing here, or failed tests won't fail travis

README.md

+2
@@ -1,6 +1,8 @@
 # pandas: powerful Python data analysis toolkit

 [![Build Status](https://travis-ci.org/pydata/pandas.svg?branch=master)](https://travis-ci.org/pydata/pandas)
+[![Join the chat at
+https://gitter.im/pydata/pandas](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/pydata/pandas?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

 ## What is it

ci/build_docs.sh

+8 -6
@@ -1,7 +1,7 @@
 #!/bin/bash

-
 cd "$TRAVIS_BUILD_DIR"
+echo "inside $0"

 git show --pretty="format:" --name-only HEAD~5.. --first-parent | grep -P "rst|txt|doc"

@@ -16,18 +16,20 @@ if [ x"$DOC_BUILD" != x"" ]; then

     # we're running network tests, let's build the docs in the meantime
     echo "Will build docs"
-    conda install sphinx==1.1.3 ipython
+    conda install -n pandas sphinx=1.1.3 pygments ipython=2.4 --yes
+
+    source activate pandas

     mv "$TRAVIS_BUILD_DIR"/doc /tmp
     cd /tmp/doc

     rm /tmp/doc/source/api.rst # no R
     rm /tmp/doc/source/r_interface.rst # no R

-    echo ############################### > /tmp/doc.log
-    echo # Log file for the doc build # > /tmp/doc.log
-    echo ############################### > /tmp/doc.log
-    echo "" > /tmp/doc.log
+    echo ###############################
+    echo # Log file for the doc build #
+    echo ###############################
+
     echo -e "y\n" | ./make.py --no-api 2>&1

     cd /tmp/doc/build/html

ci/requirements-2.7_32.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz
 xlwt
 numpy

ci/requirements-2.7_64.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz
 xlwt
 numpy

ci/requirements-2.7_LOCALE.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz=2013b
 xlwt=0.7.5
 openpyxl=1.6.2

ci/requirements-2.7_SLOW.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz
 numpy
 cython

ci/requirements-3.2.txt

-4
This file was deleted.

ci/requirements-3.3.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz=2013b
 openpyxl=1.6.2
 xlsxwriter=0.4.6

ci/requirements-3.4.txt

+2 -1
@@ -1,8 +1,9 @@
-dateutil
+python-dateutil
 pytz
 openpyxl
 xlsxwriter
 xlrd
+xlwt
 html5lib
 patsy
 beautiful-soup

ci/requirements-3.4_32.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz
 openpyxl
 xlrd

ci/requirements-3.4_64.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz
 openpyxl
 xlrd

ci/requirements-3.4_SLOW.txt

+2 -1
@@ -1,8 +1,9 @@
-dateutil
+python-dateutil
 pytz
 openpyxl
 xlsxwriter
 xlrd
+xlwt
 html5lib
 patsy
 beautiful-soup

ci/requirements_all.txt

+1 -1
@@ -1,7 +1,7 @@
 nose
 sphinx
 ipython
-dateutil
+python-dateutil
 pytz
 openpyxl
 xlsxwriter

ci/requirements_dev.txt

+1 -1
@@ -1,4 +1,4 @@
-dateutil
+python-dateutil
 pytz
 numpy
 cython

ci/run_build_docs.sh

+10
@@ -0,0 +1,10 @@
+#!/bin/bash
+
+echo "inside $0"
+
+"$TRAVIS_BUILD_DIR"/ci/build_docs.sh 2>&1 > /tmp/doc.log &
+
+# wait until subprocesses finish (build_docs.sh)
+wait
+
+exit 0
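
Note on the redirection in run_build_docs.sh: a POSIX shell applies redirections left to right, so `2>&1 > /tmp/doc.log` first points stderr at whatever stdout currently is and only then sends stdout to the file. Only stdout ends up in the log. The sketch below is illustrative and not part of the commit; it assumes a POSIX `sh` is on the PATH, and the temporary file and the `LOG` variable are hypothetical stand-ins for /tmp/doc.log.

```python
import os
import subprocess
import tempfile

# Hypothetical stand-in for /tmp/doc.log.
fd, log_path = tempfile.mkstemp(suffix=".log")
os.close(fd)

# Same redirection order as in run_build_docs.sh: "2>&1" comes before "> file".
cmd = '{ echo out; echo err >&2; } 2>&1 > "$LOG"'
result = subprocess.run(
    ["sh", "-c", cmd],
    env={**os.environ, "LOG": log_path},
    capture_output=True,
    text=True,
)

with open(log_path) as fh:
    logged = fh.read()
os.remove(log_path)

print("log file:", repr(logged))              # only stdout reached the file
print("caller stdout:", repr(result.stdout))  # stderr followed the original stdout
```

Writing the redirections as `> file 2>&1` would send both streams to the file; the commit does not say which behaviour was intended.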

ci/script.sh

+2 -9
@@ -12,20 +12,13 @@ if [ -n "$LOCALE_OVERRIDE" ]; then
     python -c "$pycmd"
 fi

-# conditionally build and upload docs to GH/pandas-docs/pandas-docs/travis
-"$TRAVIS_BUILD_DIR"/ci/build_docs.sh 2>&1 > /tmp/doc.log &
-# doc build log will be shown after tests
-
 if [ "$BUILD_TEST" ]; then
     echo "We are not running nosetests as this is simply a build test."
 else
-    echo nosetests --exe -A "$NOSE_ARGS" pandas --with-xunit --xunit-file=/tmp/nosetests.xml
-    nosetests --exe -A "$NOSE_ARGS" pandas --with-xunit --xunit-file=/tmp/nosetests.xml
+    echo nosetests --exe -A "$NOSE_ARGS" pandas --doctest-tests --with-xunit --xunit-file=/tmp/nosetests.xml
+    nosetests --exe -A "$NOSE_ARGS" pandas --doctest-tests --with-xunit --xunit-file=/tmp/nosetests.xml
 fi

 RET="$?"

-# wait until subprocesses finish (build_docs.sh)
-wait
-
 exit "$RET"

ci/submit_ccache.sh

+5
@@ -13,6 +13,11 @@ fi

 if [ "$IRON_TOKEN" ]; then

+    # install the compiler cache
+    sudo apt-get $APT_ARGS install ccache p7zip-full
+    # iron_cache, pending py3 fixes upstream
+    pip install -I --allow-external --allow-insecure git+https://github.com/iron-io/iron_cache_python.git@8a451c7d7e4d16e0c3bedffd0f280d5d9bd4fe59#egg=iron_cache
+
     rm -rf $HOME/ccache.7z

     tar cf - $HOME/.ccache \

doc/source/api.rst

+10
@@ -53,6 +53,15 @@ JSON

    read_json

+.. currentmodule:: pandas.io.json
+
+.. autosummary::
+   :toctree: generated/
+
+   json_normalize
+
+.. currentmodule:: pandas
+
 HTML
 ~~~~

@@ -563,6 +572,7 @@ strings and apply several methods to it. These can be acccessed like
    Series.str.slice
    Series.str.slice_replace
    Series.str.split
+   Series.str.rsplit
    Series.str.startswith
    Series.str.strip
    Series.str.swapcase

doc/source/basics.rst

+74
@@ -624,6 +624,77 @@ We can also pass infinite values to define the bins:
 Function application
 --------------------

+To apply your own or another library's functions to pandas objects,
+you should be aware of the three methods below. The appropriate
+method to use depends on whether your function expects to operate
+on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
+
+1. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
+2. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
+3. Elementwise_ function application: :meth:`~DataFrame.applymap`
+
+.. _basics.pipe:
+
+Tablewise Function Application
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. versionadded:: 0.16.2
+
+``DataFrames`` and ``Series`` can of course just be passed into functions.
+However, if the function needs to be called in a chain, consider using the :meth:`~DataFrame.pipe` method.
+Compare the following
+
+.. code-block:: python
+
+   # f, g, and h are functions taking and returning ``DataFrames``
+   >>> f(g(h(df), arg1=1), arg2=2, arg3=3)
+
+with the equivalent
+
+.. code-block:: python
+
+   >>> (df.pipe(h)
+          .pipe(g, arg1=1)
+          .pipe(f, arg2=2, arg3=3)
+       )
+
+Pandas encourages the second style, which is known as method chaining.
+``pipe`` makes it easy to use your own or another library's functions
+in method chains, alongside pandas' methods.
+
+In the example above, the functions ``f``, ``g``, and ``h`` each expected the ``DataFrame`` as the first positional argument.
+What if the function you wish to apply takes its data as, say, the second argument?
+In this case, provide ``pipe`` with a tuple of ``(callable, data_keyword)``.
+``.pipe`` will route the ``DataFrame`` to the argument specified in the tuple.
+
+For example, we can fit a regression using statsmodels. Their API expects a formula first and a ``DataFrame`` as the second argument, ``data``. We pass in the function, keyword pair ``(sm.poisson, 'data')`` to ``pipe``:
+
+.. ipython:: python
+
+   import statsmodels.formula.api as sm
+
+   bb = pd.read_csv('data/baseball.csv', index_col='id')
+
+   (bb.query('h > 0')
+      .assign(ln_h = lambda df: np.log(df.h))
+      .pipe((sm.poisson, 'data'), 'hr ~ ln_h + year + g + C(lg)')
+      .fit()
+      .summary()
+   )
+
+The pipe method is inspired by unix pipes and more recently dplyr_ and magrittr_, which
+have introduced the popular ``(%>%)`` (read pipe) operator for R_.
+The implementation of ``pipe`` here is quite clean and feels right at home in python.
+We encourage you to view the source code (``pd.DataFrame.pipe??`` in IPython).
+
+.. _dplyr: https://github.com/hadley/dplyr
+.. _magrittr: https://github.com/smbache/magrittr
+.. _R: http://www.r-project.org
+
+
+Row or Column-wise Function Application
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 Arbitrary functions can be applied along the axes of a DataFrame or Panel
 using the :meth:`~DataFrame.apply` method, which, like the descriptive
 statistics methods, take an optional ``axis`` argument:
@@ -678,6 +749,7 @@ Series operation on each column or row:
    tsdf
    tsdf.apply(pd.Series.interpolate)

+
 Finally, :meth:`~DataFrame.apply` takes an argument ``raw`` which is False by default, which
 converts each row or column into a Series before applying the function. When
 set to True, the passed function will instead receive an ndarray object, which
@@ -690,6 +762,8 @@ functionality.
    functionality for grouping by some criterion, applying, and combining the
    results into a Series, DataFrame, etc.

+.. _Elementwise:
+
 Applying elementwise Python functions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

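
The tuple form of ``pipe`` described in the basics.rst hunk can be illustrated without pandas. The following is a minimal sketch of the dispatch rule as the new documentation describes it, not the pandas source. The names `pipe`, `scale`, and `fit` are hypothetical, and plain lists stand in for DataFrames.

```python
def pipe(obj, func, *args, **kwargs):
    """Call func with obj, either as the first positional argument or,
    when func is a (callable, data_keyword) tuple, as that keyword."""
    if isinstance(func, tuple):
        func, data_keyword = func
        if data_keyword in kwargs:
            raise ValueError(
                f"{data_keyword!r} is both the pipe target and a keyword argument"
            )
        kwargs[data_keyword] = obj
        return func(*args, **kwargs)
    return func(obj, *args, **kwargs)


def scale(values, factor=1):
    # Takes its data as the first positional argument.
    return [v * factor for v in values]


def fit(formula, data):
    # Takes its data as the second argument, named "data".
    return f"{formula} fitted on {len(data)} rows"


data = [1, 2, 3]
scaled = pipe(data, scale, factor=10)
summary = pipe(scaled, (fit, "data"), "y ~ x")
print(scaled)
print(summary)
```

Here `fit` mirrors the shape of the statsmodels call in the diff (formula first, data second), which is why it is reached through the tuple form rather than the plain callable form.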
