Commit b56fefe

author
Artemy Kolchinsky
committed
Merge remote-tracking branch 'upstream/master'
2 parents a94b8ad + c03e92f commit b56fefe

93 files changed

+15971
-2127
lines changed

.travis.yml

+2-2
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ matrix:
5454
- JOB_NAME: "34_nslow"
5555
- python: 3.2
5656
env:
57-
- NOSE_ARGS="not slow and not disabled"
57+
- NOSE_ARGS="not slow and not network and not disabled"
5858
- FULL_DEPS=true
5959
- CLIPBOARD_GUI=qt4
6060
- BUILD_TYPE=pydata
@@ -71,7 +71,7 @@ matrix:
7171
allow_failures:
7272
- python: 3.2
7373
env:
74-
- NOSE_ARGS="not slow and not disabled"
74+
- NOSE_ARGS="not slow and not network and not disabled"
7575
- FULL_DEPS=true
7676
- CLIPBOARD_GUI=qt4
7777
- BUILD_TYPE=pydata

ci/requirements-3.2.txt

+1
@@ -9,6 +9,7 @@ tables==3.0.0
99
matplotlib==1.2.1
1010
patsy==0.1.0
1111
lxml==3.2.1
12+
html5lib
1213
scipy==0.12.0
1314
beautifulsoup4==4.2.1
1415
statsmodels==0.5.0

doc/README.rst

+2-3
@@ -88,13 +88,12 @@ Furthermore, it is recommended to have all `optional dependencies
8888
installed. This is not needed, but be aware that you will see some error
8989
messages. Because all the code in the documentation is executed during the doc
9090
build, the examples using this optional dependencies will generate errors.
91-
Run ``pd.show_version()`` to get an overview of the installed version of all
91+
Run ``pd.show_versions()`` to get an overview of the installed version of all
9292
dependencies.
9393

9494
.. warning::
9595

96-
Building the docs with Sphinx version 1.2 is broken. Use the
97-
latest stable version (1.2.1) or the older 1.1.3.
96+
Sphinx version >= 1.2.2 or the older 1.1.3 is required.
9897

9998
Building pandas
10099
^^^^^^^^^^^^^^^

doc/source/api.rst

+4
@@ -190,6 +190,8 @@ Standard moving window functions
190190
rolling_quantile
191191
rolling_window
192192

193+
.. _api.functions_expanding:
194+
193195
Standard expanding window functions
194196
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
195197

@@ -1164,6 +1166,8 @@ Attributes
11641166

11651167
Index.values
11661168
Index.is_monotonic
1169+
Index.is_monotonic_increasing
1170+
Index.is_monotonic_decreasing
11671171
Index.is_unique
11681172
Index.dtype
11691173
Index.inferred_type
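
As an illustration of the two attributes added to the API listing above, here is a minimal sketch. It assumes a pandas version that provides `Index.is_monotonic_increasing` and `Index.is_monotonic_decreasing`; the sample values are invented for the example.

```python
import pandas as pd

# Hypothetical sample indexes, chosen only to illustrate the attributes.
rising = pd.Index([1, 2, 2, 5])
falling = pd.Index([9, 4, 4, 1])
mixed = pd.Index([3, 1, 2])

# Both checks are non-strict: repeated neighbouring values are allowed.
rising_flags = (rising.is_monotonic_increasing, rising.is_monotonic_decreasing)
falling_flags = (falling.is_monotonic_increasing, falling.is_monotonic_decreasing)
mixed_flags = (mixed.is_monotonic_increasing, mixed.is_monotonic_decreasing)

print(rising_flags, falling_flags, mixed_flags)
```

An index that neither rises nor falls throughout reports `False` for both attributes.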

doc/source/categorical.rst

+2-1
@@ -236,7 +236,7 @@ which are removed are replaced by ``np.nan``.:
236236
s = s.cat.remove_categories([4])
237237
s
238238
239-
Renaming unused categories
239+
Removing unused categories
240240
~~~~~~~~~~~~~~~~~~~~~~~~~~
241241

242242
Removing unused categories can also be done:
@@ -573,6 +573,7 @@ relevant columns back to `category` and assign the right categories and categori
573573
df2.dtypes
574574
df2["cats"]
575575
576+
The same holds for writing to a SQL database with ``to_sql``.
576577

577578
Missing Data
578579
------------
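
The sentence added to categorical.rst above says that the categorical dtype is lost when writing to a SQL database with `to_sql`. A minimal sketch of that round trip follows, assuming an in-memory SQLite database and a reasonably recent pandas; the table name `demo` and the sample frame are hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical frame with one categorical column.
df = pd.DataFrame({"id": [1, 2, 3], "cats": ["a", "b", "a"]})
df["cats"] = df["cats"].astype("category")

# Write to an in-memory SQLite database and read the table back.
conn = sqlite3.connect(":memory:")
df.to_sql("demo", conn, index=False)
roundtrip = pd.read_sql("SELECT * FROM demo", conn)
conn.close()

# The values survive the round trip, but the categorical dtype does not.
was_categorical = isinstance(df["cats"].dtype, pd.CategoricalDtype)
is_categorical = isinstance(roundtrip["cats"].dtype, pd.CategoricalDtype)
print(was_categorical, is_categorical)

# Restore the dtype explicitly if it is needed again.
roundtrip["cats"] = roundtrip["cats"].astype("category")
```

Converting back with `astype("category")` rebuilds the categories from the values present in the column, so any unused categories or custom ordering would have to be reassigned by hand.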

doc/source/conf.py

-3
@@ -44,12 +44,9 @@
4444
'ipython_sphinxext.ipython_directive',
4545
'ipython_sphinxext.ipython_console_highlighting',
4646
'sphinx.ext.intersphinx',
47-
'sphinx.ext.todo',
4847
'sphinx.ext.coverage',
4948
'sphinx.ext.pngmath',
5049
'sphinx.ext.ifconfig',
51-
'matplotlib.sphinxext.only_directives',
52-
'matplotlib.sphinxext.plot_directive',
5350
]
5451

5552

doc/source/cookbook.rst

+14-1
@@ -588,6 +588,19 @@ Unlike agg, apply's callable is passed a sub-DataFrame which gives you access to
588588
df['beyer_shifted'] = df.groupby(level=0)['beyer'].shift(1)
589589
df
590590
591+
`Select row with maximum value from each group
592+
<http://stackoverflow.com/q/26701849/190597>`__
593+
594+
.. ipython:: python
595+
596+
df = pd.DataFrame({'host':['other','other','that','this','this'],
597+
'service':['mail','web','mail','mail','web'],
598+
'no':[1, 2, 1, 2, 1]}).set_index(['host', 'service'])
599+
mask = df.groupby(level=0).agg('idxmax')
600+
df_count = df.loc[mask['no']].reset_index()
601+
df_count
602+
603+
591604
Expanding Data
592605
**************
593606

@@ -1202,4 +1215,4 @@ of the data values:
12021215
{'height': [60, 70],
12031216
'weight': [100, 140, 180],
12041217
'sex': ['Male', 'Female']})
1205-
df
1218+
df
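
The cookbook entry added in the diff above selects the row with the maximum value from each group. The following standalone sketch shows the same idea with the same sample data; it is one possible spelling that calls `idxmax` on the grouped column and passes the labels as a plain list, not necessarily the form the cookbook uses.

```python
import pandas as pd

# Same sample data as the cookbook entry in the diff above.
df = pd.DataFrame({
    "host": ["other", "other", "that", "this", "this"],
    "service": ["mail", "web", "mail", "mail", "web"],
    "no": [1, 2, 1, 2, 1],
}).set_index(["host", "service"])

# For each host, find the index label of the row with the largest "no".
idx = df.groupby(level="host")["no"].idxmax()

# Select those rows and flatten the MultiIndex back into columns.
df_count = df.loc[idx.tolist()].reset_index()
print(df_count)
```

Each entry of `idx` is a `(host, service)` tuple, which is why it can be used directly as a row label on the MultiIndexed frame.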

doc/source/io.rst

+14
@@ -3337,6 +3337,14 @@ With some databases, writing large DataFrames can result in errors due to packet
33373337
flavors, columns with type ``timedelta64`` will be written as integer
33383338
values as nanoseconds to the database and a warning will be raised.
33393339

3340+
.. note::
3341+
3342+
Columns of ``category`` dtype will be converted to the dense representation
3343+
as you would get with ``np.asarray(categorical)`` (e.g. for string categories
3344+
this gives an array of strings).
3345+
Because of this, reading the database table back in does **not** generate
3346+
a categorical.
3347+
33403348

33413349
Reading Tables
33423350
~~~~~~~~~~~~~~
@@ -3618,12 +3626,18 @@ outside of this range, the data is cast to ``int16``.
36183626
if ``int64`` values are larger than 2**53.
36193627

36203628
.. warning::
3629+
36213630
:class:`~pandas.io.stata.StataWriter`` and
36223631
:func:`~pandas.core.frame.DataFrame.to_stata` only support fixed width
36233632
strings containing up to 244 characters, a limitation imposed by the version
36243633
115 dta file format. Attempting to write *Stata* dta files with strings
36253634
longer than 244 characters raises a ``ValueError``.
36263635

3636+
.. warning::
3637+
3638+
*Stata* data files only support text labels for categorical data. Exporting
3639+
data frames containing categorical data will convert non-string categorical values
3640+
to strings.
36273641

36283642
.. _io.stata_reader:
36293643

doc/source/options.rst

+3-2
@@ -86,7 +86,7 @@ pandas namespace. To change an option, call ``set_option('option regex', new_va
8686
pd.set_option('mode.sim_interactive', True)
8787
pd.get_option('mode.sim_interactive')
8888
89-
**Note:** that the option 'mode.sim_interactive' is mostly used for debugging purposes.
89+
**Note:** that the option 'mode.sim_interactive' is mostly used for debugging purposes.
9090

9191
All options also have a default value, and you can use ``reset_option`` to do just that:
9292

@@ -213,7 +213,8 @@ will be given.
213213
214214
``display.max_info_rows``: ``df.info()`` will usually show null-counts for each column.
215215
For large frames this can be quite slow. ``max_info_rows`` and ``max_info_cols``
216-
limit this null check only to frames with smaller dimensions then specified.
216+
limit this null check only to frames with smaller dimensions then specified. Note that you
217+
can specify the option ``df.info(null_counts=True)`` to override on showing a particular frame.
217218

218219
.. ipython:: python
219220

doc/source/release.rst

+39-4
@@ -45,22 +45,57 @@ analysis / manipulation tool available in any language.
4545
* Binary installers on PyPI: http://pypi.python.org/pypi/pandas
4646
* Documentation: http://pandas.pydata.org
4747

48+
pandas 0.15.2
49+
-------------
50+
51+
**Release date:** (December ??, 2014)
52+
53+
This is a minor release from 0.15.1 and includes a small number of API changes, several new features, enhancements, and
54+
performance improvements along with a large number of bug fixes.
55+
56+
See the :ref:`v0.15.2 Whatsnew <whatsnew_0152>` overview for an extensive list
57+
of all API changes, enhancements and bugs that have been fixed in 0.15.2.
58+
59+
Thanks
60+
~~~~~~
61+
4862
pandas 0.15.1
4963
-------------
5064

51-
**Release date:** (November ??, 2014)
65+
**Release date:** (November 9, 2014)
5266

5367
This is a minor release from 0.15.0 and includes a small number of API changes, several new features, enhancements, and
5468
performance improvements along with a large number of bug fixes.
5569

56-
Highlights include:
57-
58-
See the :ref:`v0.15.1 Whatsnew <whatsnew_0151>` overview or the issue tracker on GitHub for an extensive list
70+
See the :ref:`v0.15.1 Whatsnew <whatsnew_0151>` overview for an extensive list
5971
of all API changes, enhancements and bugs that have been fixed in 0.15.1.
6072

6173
Thanks
6274
~~~~~~
6375

76+
- Aaron Staple
77+
- Andrew Rosenfeld
78+
- Anton I. Sipos
79+
- Artemy Kolchinsky
80+
- Bill Letson
81+
- Dave Hughes
82+
- David Stephens
83+
- Guillaume Horel
84+
- Jeff Reback
85+
- Joris Van den Bossche
86+
- Kevin Sheppard
87+
- Nick Stahl
88+
- Sanghee Kim
89+
- Stephan Hoyer
90+
- TomAugspurger
91+
- WANG Aiyong
92+
- behzad nouri
93+
- immerrr
94+
- jnmclarty
95+
- jreback
96+
- pallav-fdsi
97+
- unutbu
98+
6499
pandas 0.15.0
65100
-------------
66101

doc/source/remote_data.rst

+81-1
@@ -86,8 +86,29 @@ If you don't want to download all the data, more specific requests can be made.
8686
data = aapl.get_call_data(expiry=expiry)
8787
data.iloc[0:5:, 0:5]
8888
89-
Note that if you call ``get_all_data`` first, this second call will happen much faster, as the data is cached.
89+
Note that if you call ``get_all_data`` first, this second call will happen much faster,
90+
as the data is cached.
9091

92+
If a given expiry date is not available, data for the next available expiry will be
93+
returned (January 15, 2015 in the above example).
94+
95+
Available expiry dates can be accessed from the ``expiry_dates`` property.
96+
97+
.. ipython:: python
98+
99+
aapl.expiry_dates
100+
data = aapl.get_call_data(expiry=aapl.expiry_dates[0])
101+
data.iloc[0:5:, 0:5]
102+
103+
A list-like object containing dates can also be passed to the expiry parameter,
104+
returning options data for all expiry dates in the list.
105+
106+
.. ipython:: python
107+
108+
data = aapl.get_near_stock_price(expiry=aapl.expiry_dates[0:3])
109+
data.iloc[0:5:, 0:5]
110+
111+
The ``month`` and ``year`` parameters can be used to get all options data for a given month.
91112

92113
.. _remote_data.google:
93114

@@ -143,6 +164,12 @@ World Bank
143164
`World Bank's World Development Indicators <http://data.worldbank.org>`__
144165
by using the ``wb`` I/O functions.
145166

167+
Indicators
168+
~~~~~~~~~~
169+
170+
Either from exploring the World Bank site, or using the search function included,
171+
every world bank indicator is accessible.
172+
146173
For example, if you wanted to compare the Gross Domestic Products per capita in
147174
constant dollars in North America, you would use the ``search`` function:
148175

@@ -254,3 +281,56 @@ populations in rich countries tend to use cellphones at a higher rate:
254281
Skew: -2.314 Prob(JB): 1.35e-26
255282
Kurtosis: 11.077 Cond. No. 45.8
256283
==============================================================================
284+
285+
Country Codes
286+
~~~~~~~~~~~~~
287+
288+
.. versionadded:: 0.15.1
289+
290+
The ``country`` argument accepts a string or list of mixed
291+
`two <http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2>`__ or `three <http://en.wikipedia.org/wiki/ISO_3166-1_alpha-3>`__ character
292+
ISO country codes, as well as dynamic `World Bank exceptions <http://data.worldbank.org/node/18>`__ to the ISO standards.
293+
294+
For a list of the the hard-coded country codes (used solely for error handling logic) see ``pandas.io.wb.country_codes``.
295+
296+
Problematic Country Codes & Indicators
297+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
298+
299+
.. note::
300+
301+
The World Bank's country list and indicators are dynamic. As of 0.15.1,
302+
:func:`wb.download()` is more flexible. To achieve this, the warning
303+
and exception logic changed.
304+
305+
The world bank converts some country codes,
306+
in their response, which makes error checking by pandas difficult.
307+
Retired indicators still persist in the search.
308+
309+
Given the new flexibility of 0.15.1, improved error handling by the user
310+
may be necessary for fringe cases.
311+
312+
To help identify issues:
313+
314+
There are at least 4 kinds of country codes:
315+
316+
1. Standard (2/3 digit ISO) - returns data, will warn and error properly.
317+
2. Non-standard (WB Exceptions) - returns data, but will falsely warn.
318+
3. Blank - silently missing from the response.
319+
4. Bad - causes the entire response from WB to fail, always exception inducing.
320+
321+
There are at least 3 kinds of indicators:
322+
323+
1. Current - Returns data.
324+
2. Retired - Appears in search results, yet won't return data.
325+
3. Bad - Will not return data.
326+
327+
Use the ``errors`` argument to control warnings and exceptions. Setting
328+
errors to ignore or warn, won't stop failed responses. (ie, 100% bad
329+
indicators, or a single "bad" (#4 above) country code).
330+
331+
See docstrings for more info.
332+
333+
334+
335+
336+

doc/source/tutorials.rst

+15
@@ -109,6 +109,21 @@ For more resources, please visit the main `repository <https://bitbucket.org/hro
109109
- Combining data from various sources
110110

111111

112+
Practical data analysis with Python
113+
-----------------------------------
114+
115+
This `guide <http://wavedatalab.github.io/datawithpython>`_ is a comprehensive introduction to the data analysis process using the Python data ecosystem and an interesting open dataset.
116+
There are four sections covering selected topics as follows:
117+
118+
- `Munging Data <http://wavedatalab.github.io/datawithpython/munge.html>`_
119+
120+
- `Aggregating Data <http://wavedatalab.github.io/datawithpython/aggregate.html>`_
121+
122+
- `Visualizing Data <http://wavedatalab.github.io/datawithpython/visualize.html>`_
123+
124+
- `Time Series <http://wavedatalab.github.io/datawithpython/timeseries.html>`_
125+
126+
112127
Excel charts with pandas, vincent and xlsxwriter
113128
------------------------------------------------
114129

doc/source/whatsnew.rst

+2
@@ -18,6 +18,8 @@ What's New
1818

1919
These are new features and improvements of note in each release.
2020

21+
.. include:: whatsnew/v0.15.2.txt
22+
2123
.. include:: whatsnew/v0.15.1.txt
2224

2325
.. include:: whatsnew/v0.15.0.txt

doc/source/whatsnew/v0.10.0.txt

+1
@@ -48,6 +48,7 @@ want to broadcast, we are phasing out this special case (Zen of Python:
4848
talking about:
4949

5050
.. ipython:: python
51+
:okwarning:
5152

5253
import pandas as pd
5354
df = pd.DataFrame(np.random.randn(6, 4),
