Commit e7cff7d

Merge commit 'v0.10.0b1-65-g6cadd6c' into debian
* commit 'v0.10.0b1-65-g6cadd6c':
  BUG: workaround numpy issue in 1.6.2 on certain distros
  BUG: use "for" as keyword to test against
  ENH: option names must valid, non-keyword python identifiers
  TST: option names must valid, non-keyword python identifiers
  BUG: make sure pd.options style access triggers warnings and callbacks as needed
  ENH: provide attribute style access to options via pd.options
  REF: rename option key "print" to "display"
  BUG: use Series name attributes for colnames in concat with axis=1. close pandas-dev#2489
  ENH: add max_cols as keyword in DataFrame.info pandas-dev#2524
  BUG: allow users to set the max number of columns before per column info is hidden away pandas-dev#2524
  BUG: DataFrame.from_dict does not work with dict of sequence and orient=index pandas-dev#2496
  ENH: df.select uses bool(crit(x)) rather then crit(x)
  DOC: add FAQ section on monkey-patching
  BUG: diff should cast n to int pandas-dev#2523
2 parents 4eb565b + 6cadd6c commit e7cff7d

21 files changed

+325
-119
lines changed

RELEASE.rst

+6-1
Original file line numberDiff line numberDiff line change
@@ -73,7 +73,8 @@ pandas 0.10.0
7373
backfilling time series data (GH2284_)
7474
- New option configuration system and functions `set_option`, `get_option`,
7575
`describe_option`, and `reset_option`. Deprecate `set_printoptions` and
76-
`reset_printoptions` (GH2393_)
76+
`reset_printoptions` (GH2393_).
77+
You can also access options as attributes via ``pandas.options.X``
7778
- Wide DataFrames can be viewed more easily in the console with new
7879
`expand_frame_repr` and `line_width` configuration options. This is on by
7980
default now (GH2436_)
@@ -170,6 +171,7 @@ pandas 0.10.0
170171
- Optimize ``unstack`` memory usage by compressing indices (GH2278_)
171172
- Fix HTML repr in IPython qtconsole if opening window is small (GH2275_)
172173
- Escape more special characters in console output (GH2492_)
174+
- df.select now invokes bool on the result of crit(x) (GH2487_)
173175

174176
**Bug fixes**
175177

@@ -234,6 +236,7 @@ pandas 0.10.0
234236
- DataFrame.combine_first will always result in the union of the index and
235237
columns, even if one DataFrame is length-zero (GH2525_)
236238
- Fix several DataFrame.icol/irow with duplicate indices issues (GH2228_, GH2259_)
239+
- Use Series names for column names when using concat with axis=1 (GH2489_)
237240

238241
.. _GH407: https://github.com/pydata/pandas/issues/407
239242
.. _GH821: https://github.com/pydata/pandas/issues/821
@@ -297,6 +300,7 @@ pandas 0.10.0
297300
.. _GH2447: https://github.com/pydata/pandas/issues/2447
298301
.. _GH2275: https://github.com/pydata/pandas/issues/2275
299302
.. _GH2492: https://github.com/pydata/pandas/issues/2492
303+
.. _GH2487: https://github.com/pydata/pandas/issues/2487
300304
.. _GH2273: https://github.com/pydata/pandas/issues/2273
301305
.. _GH2266: https://github.com/pydata/pandas/issues/2266
302306
.. _GH2038: https://github.com/pydata/pandas/issues/2038
@@ -351,6 +355,7 @@ pandas 0.10.0
351355
.. _GH2525: https://github.com/pydata/pandas/issues/2525
352356
.. _GH2228: https://github.com/pydata/pandas/issues/2228
353357
.. _GH2259: https://github.com/pydata/pandas/issues/2259
358+
.. _GH2489: https://github.com/pydata/pandas/issues/2489
354359

355360

356361
pandas 0.9.1

doc/source/basics.rst

+31-16
@@ -1049,34 +1049,49 @@ Working with package options
10491049
.. _basics.working_with_options:
10501050

10511051
Introduced in 0.10.0, pandas supports a new system for working with options.
1052-
The 4 relavent functions are available directly from the ``pandas`` namespace,
1053-
and they are:
1052+
Options have a full "dotted-style", case-insensitive name (e.g. ``display.max_rows``).
1053+
1054+
You can get/set options directly as attributes of the top-level ``options`` attribute:
1055+
1056+
.. ipython:: python
1057+
1058+
import pandas as pd
1059+
pd.options.display.max_rows
1060+
pd.options.display.max_rows = 999
1061+
pd.options.display.max_rows
1062+
1063+
1064+
There is also an API composed of 4 relevant functions, available directly from the ``pandas``
1065+
namespace, and they are:
10541066

10551067
- ``get_option`` / ``set_option`` - get/set the value of a single option.
10561068
- ``reset_option`` - reset one or more options to their default value.
10571069
- ``describe_option`` - print the descriptions of one or more options.
10581070

10591071
**Note:** developers can check out pandas/core/config.py for more info.
10601072

1061-
Options have a full "dotted-style", case-insensitive name (e.g. ``print.max_rows``),
1073+
10621074
All of the functions above accept a regexp pattern (``re.search`` style) as argument,
10631075
so passing in a substring will work, as long as it is unambiguous:
10641076

10651077
.. ipython:: python
10661078
1067-
get_option("print.max_rows")
1068-
set_option("print.max_rows",101)
1069-
get_option("print.max_rows")
1079+
get_option("display.max_rows")
1080+
set_option("display.max_rows",101)
1081+
get_option("display.max_rows")
10701082
set_option("max_r",102)
1071-
get_option("print.max_rows")
1083+
get_option("display.max_rows")
10721084
10731085
1074-
However, the following will **not work** because it matches multiple option names, e.g.``print.max_colwidth``, ``print.max_rows``, ``print.max_columns``:
1086+
However, the following will **not work** because it matches multiple option names, e.g. ``display.max_colwidth``, ``display.max_rows``, ``display.max_columns``:
10751087

10761088
.. ipython:: python
10771089
:okexcept:
10781090
1079-
get_option("print.max_")
1091+
try:
1092+
    get_option("display.max_")
1093+
except KeyError as e:
1094+
    print(e)
10801095
10811096
10821097
**Note:** Using this form of convenient shorthand may make your code break if new options with similar names are added in future versions.
@@ -1103,23 +1118,23 @@ All options also have a default value, and you can use the ``reset_option`` to d
11031118
.. ipython:: python
11041119
:suppress:
11051120
1106-
reset_option("print.max_rows")
1121+
reset_option("display.max_rows")
11071122
11081123
11091124
.. ipython:: python
11101125
1111-
get_option("print.max_rows")
1112-
set_option("print.max_rows",999)
1113-
get_option("print.max_rows")
1114-
reset_option("print.max_rows")
1115-
get_option("print.max_rows")
1126+
get_option("display.max_rows")
1127+
set_option("display.max_rows",999)
1128+
get_option("display.max_rows")
1129+
reset_option("display.max_rows")
1130+
get_option("display.max_rows")
11161131
11171132
11181133
and you can also reset multiple options at once:
11191134

11201135
.. ipython:: python
11211136
1122-
reset_option("^print\.")
1137+
reset_option("^display\.")
11231138
11241139
11251140
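The substring lookup described in the `basics.rst` changes above can be illustrated with a small standalone sketch. This is not the code in `pandas/core/config.py`; the helper name `select_one_key` and the sample key list are assumptions made purely for illustration. It matches a pattern with `re.search`, ignoring case, and raises `KeyError` when the pattern matches no key or more than one key.

```python
import re

# Hypothetical registry of option names, for illustration only.
OPTION_KEYS = [
    "display.max_rows",
    "display.max_columns",
    "display.max_colwidth",
    "display.precision",
]


def select_one_key(pattern, keys=OPTION_KEYS):
    """Return the single key matching `pattern`, or raise KeyError."""
    # An exact name always wins, even if it is a substring of other names.
    if pattern in keys:
        return pattern
    # re.search finds the pattern anywhere in the key; re.I ignores case.
    matches = [k for k in keys if re.search(pattern, k, re.I)]
    if not matches:
        raise KeyError("No such key(s): %r" % pattern)
    if len(matches) > 1:
        raise KeyError("Pattern %r matched multiple keys: %s"
                       % (pattern, ", ".join(matches)))
    return matches[0]


print(select_one_key("max_r"))      # unambiguous substring
print(select_one_key("PRECISION"))  # case-insensitive match
try:
    select_one_key("display.max_")  # ambiguous: three keys match
except KeyError as e:
    print(e)
```

Because the match is pattern-based, a shorthand that is unambiguous today can become ambiguous once a similarly named option is registered, which is why spelling out the full dotted name is the safer choice in library code.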

doc/source/faq.rst

+41-2
@@ -21,6 +21,47 @@ Frequently Asked Questions (FAQ)
2121
import matplotlib.pyplot as plt
2222
plt.close('all')
2323
24+
.. _ref-monkey-patching:
25+
26+
27+
----------------------------------------------------
28+
29+
Pandas is a powerful tool and already has a plethora of data manipulation
30+
operations implemented, most of which are very fast as well.
31+
It's very possible however that certain functionality that would make your
32+
life easier is missing. In that case you have several options:
33+
34+
1) Open an issue on `GitHub <https://github.com/pydata/pandas/issues/>`_, explain your need and the sort of functionality you would like to see implemented.
35+
2) Fork the repo, implement the functionality yourself and open a PR
36+
on Github.
37+
3) Write a method that performs the operation you are interested in and
38+
Monkey-patch the pandas class as part of your IPython profile startup
39+
or PYTHONSTARTUP file.
40+
41+
For example, here is how to add a ``just_foo_cols()``
42+
method to the dataframe class:
43+
44+
.. ipython:: python
45+
46+
import pandas as pd
47+
def just_foo_cols(self):
48+
    """Get a list of column names containing the string 'foo'
49+
50+
    """
51+
    return [x for x in self.columns if 'foo' in x]
52+
53+
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
54+
df = pd.DataFrame([range(4)],columns= ["A","foo","foozball","bar"])
55+
df.just_foo_cols()
56+
del pd.DataFrame.just_foo_cols # you can also remove the new method
57+
58+
59+
Monkey-patching is usually frowned upon because it makes your code
60+
less portable and can cause subtle bugs in some circumstances.
61+
Monkey-patching existing methods is usually a bad idea in that respect.
62+
When used with proper care, however, it's a very useful tool to have.
63+
64+
2465
.. _ref-scikits-migration:
2566

2667
Migrating from scikits.timeseries to pandas >= 0.8.0
@@ -171,5 +212,3 @@ interval (``'start'`` or ``'end'``) convention:
171212
data = Series(np.random.randn(50), index=rng)
172213
resampled = data.resample('A', kind='timestamp', convention='end')
173214
resampled.index
174-
175-

doc/source/v0.10.0.txt

+3-3
@@ -160,11 +160,11 @@ Convenience methods ``ffill`` and ``bfill`` have been added:
160160
arguments. print all registered options.
161161

162162
Note: ``set_printoptions``/ ``reset_printoptions`` are now deprecated (but
163-
functioning), the print options now live under "print.XYZ". For example:
163+
functioning), the print options now live under "display.XYZ". For example:
164164

165165
.. ipython:: python
166166

167-
get_option("print.max_rows")
167+
get_option("display.max_rows")
168168

169169
- to_string() methods now always return unicode strings (GH2224_).
170170

@@ -284,7 +284,7 @@ Updated PyTables Support
284284
df['string'] = 'string'
285285
df['int'] = 1
286286
store.append('df',df)
287-
df1 = store.select('df')
287+
df1 = store.select('df')
288288
df1
289289
df1.get_dtype_counts()
290290

pandas/core/api.py

+1-1
@@ -31,4 +31,4 @@
3131
import pandas.core.datetools as datetools
3232

3333
from pandas.core.config import get_option,set_option,reset_option,\
34-
describe_option
34+
describe_option, options

pandas/core/common.py

+6-5
@@ -455,6 +455,7 @@ def mask_out_axis(arr, mask, axis, fill_value=np.nan):
455455
}
456456

457457
def diff(arr, n, axis=0):
458+
n = int(n)
458459
dtype = arr.dtype
459460
if issubclass(dtype.type, np.integer):
460461
dtype = np.float64
@@ -1212,7 +1213,7 @@ def in_qtconsole():
12121213
# 2) If you need to send something to the console, use console_encode().
12131214
#
12141215
# console_encode() should (hopefully) choose the right encoding for you
1215-
# based on the encoding set in option "print.encoding"
1216+
# based on the encoding set in option "display.encoding"
12161217
#
12171218
# 3) if you need to write something out to file, use
12181219
# pprint_thing_encoded(encoding).
@@ -1271,10 +1272,10 @@ def pprint_thing(thing, _nest_lvl=0, escape_chars=None):
12711272
hasattr(thing,'next'):
12721273
return unicode(thing)
12731274
elif (isinstance(thing, dict) and
1274-
_nest_lvl < get_option("print.pprint_nest_depth")):
1275+
_nest_lvl < get_option("display.pprint_nest_depth")):
12751276
result = _pprint_dict(thing, _nest_lvl)
12761277
elif _is_sequence(thing) and _nest_lvl < \
1277-
get_option("print.pprint_nest_depth"):
1278+
get_option("display.pprint_nest_depth"):
12781279
result = _pprint_seq(thing, _nest_lvl, escape_chars=escape_chars)
12791280
else:
12801281
# when used internally in the package, everything
@@ -1312,8 +1313,8 @@ def console_encode(object, **kwds):
13121313
this is the sanctioned way to prepare something for
13131314
sending *to the console*, it delegates to pprint_thing() to get
13141315
a unicode representation of the object relies on the global encoding
1315-
set in print.encoding. Use this everywhere
1316+
set in display.encoding. Use this everywhere
13161317
where you output to the console.
13171318
"""
13181319
return pprint_thing_encoded(object,
1319-
get_option("print.encoding"))
1320+
get_option("display.encoding"))

pandas/core/config.py

+43-2
@@ -138,6 +138,39 @@ def _reset_option(pat):
138138
for k in keys:
139139
_set_option(k, _registered_options[k].defval)
140140

141+
class DictWrapper(object):
142+
    """ provide attribute-style access to a nested dict
143+
    """
144+
    def __init__(self,d,prefix=""):
145+
        object.__setattr__(self,"d",d)
146+
        object.__setattr__(self,"prefix",prefix)
147+
148+
    def __setattr__(self,key,val):
149+
        prefix = object.__getattribute__(self,"prefix")
150+
        if prefix:
151+
            prefix += "."
152+
        prefix += key
153+
        # you can't set new keys
154+
        # and you can't overwrite subtrees
155+
        if key in self.d and not isinstance(self.d[key],dict):
156+
            _set_option(prefix,val)
157+
            self.d[key]=val
158+
        else:
159+
            raise KeyError("You can only set the value of existing options")
160+
161+
    def __getattr__(self,key):
162+
        prefix = object.__getattribute__(self,"prefix")
163+
        if prefix:
164+
            prefix += "."
165+
        prefix += key
166+
        v=object.__getattribute__(self,"d")[key]
167+
        if isinstance(v,dict):
168+
            return DictWrapper(v,prefix)
169+
        else:
170+
            return _get_option(prefix)
171+
172+
    def __dir__(self):
173+
        return self.d.keys()
141174

142175
# For user convenience, we'd like to have the available options described
143176
# in the docstring. For dev convenience we'd like to generate the docstrings
@@ -266,7 +299,7 @@ def __doc__(self):
266299
set_option = CallableDyanmicDoc(_set_option, _set_option_tmpl)
267300
reset_option = CallableDyanmicDoc(_reset_option, _reset_option_tmpl)
268301
describe_option = CallableDyanmicDoc(_describe_option, _describe_option_tmpl)
269-
302+
options = DictWrapper(_global_config)
270303

271304
######################################################
272305
# Functions for use by pandas developers, in addition to User - api
@@ -316,7 +349,7 @@ def register_option(key, defval, doc='', validator=None, cb=None):
316349
ValueError if `validator` is specified and `defval` is not a valid value.
317350
318351
"""
319-
352+
import tokenize, keyword
320353
key = key.lower()
321354

322355
if key in _registered_options:
@@ -330,6 +363,13 @@ def register_option(key, defval, doc='', validator=None, cb=None):
330363

331364
# walk the nested dict, creating dicts as needed along the path
332365
path = key.split('.')
366+
367+
for k in path:
368+
    if not bool(re.match('^'+tokenize.Name+'$', k)):
369+
        raise ValueError("%s is not a valid identifier" % k)
370+
    if keyword.iskeyword(k):
371+
        raise ValueError("%s is a python keyword" % k)
372+
333373
cursor = _global_config
334374
for i, p in enumerate(path[:-1]):
335375
if not isinstance(cursor, dict):
@@ -343,6 +383,7 @@ def register_option(key, defval, doc='', validator=None, cb=None):
343383
raise KeyError("Path prefix to option '%s' is already an option"
344384
% '.'.join(path[:-1]))
345385

386+
346387
cursor[path[-1]] = defval # initialize
347388

348389
# save the option metadata
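
The two `config.py` changes above fit together: attribute-style access only works if every component of a dotted option name is a valid, non-keyword Python identifier. The following is a minimal standalone sketch of that idea, not the code from this commit. The names `check_option_key` and `AttrView` are hypothetical, it uses Python 3's `str.isidentifier()` in place of the `tokenize.Name` regexp, and it writes straight to the dict rather than routing through `_get_option`/`_set_option`, so no validators, warnings, or callbacks are triggered.

```python
import keyword


def check_option_key(key):
    """Validate every component of a dotted option name."""
    for part in key.split("."):
        if not part.isidentifier():
            raise ValueError("%s is not a valid identifier" % part)
        # Keywords such as "for" pass isidentifier(), so check separately.
        if keyword.iskeyword(part):
            raise ValueError("%s is a python keyword" % part)


class AttrView(object):
    """Attribute-style, read/write view over a nested dict of options."""

    def __init__(self, d):
        # Bypass our own __setattr__ while storing the wrapped dict.
        object.__setattr__(self, "_d", d)

    def __getattr__(self, key):
        # Only called when normal lookup fails, i.e. for option names.
        try:
            value = self._d[key]
        except KeyError:
            raise AttributeError(key)
        # Sub-dicts are wrapped again so lookups can be chained.
        return AttrView(value) if isinstance(value, dict) else value

    def __setattr__(self, key, value):
        # New keys and whole subtrees cannot be assigned.
        if key in self._d and not isinstance(self._d[key], dict):
            self._d[key] = value
        else:
            raise KeyError("You can only set the value of existing options")

    def __dir__(self):
        return list(self._d.keys())


config = {"display": {"max_rows": 100, "precision": 7}}
check_option_key("display.max_rows")
options = AttrView(config)
options.display.max_rows = 999
print(options.display.max_rows)
```

A name such as `display.for` would be rejected by `check_option_key`, since `options.display.for` is a syntax error and the option could never be reached through attribute access.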

pandas/core/config_init.py

+10-2
@@ -16,7 +16,7 @@
1616

1717

1818
###########################################
19-
# options from the "print" namespace
19+
# options from the "display" namespace
2020

2121
pc_precision_doc="""
2222
: int
@@ -45,6 +45,12 @@
4545
columns that can fit on it.
4646
"""
4747

48+
pc_max_info_cols_doc="""
49+
: int
50+
max_info_columns is used in DataFrame.info method to decide if
51+
per column information will be printed.
52+
"""
53+
4854
pc_nb_repr_h_doc="""
4955
: boolean
5056
When True (default), IPython notebook will use html representation for
@@ -115,13 +121,15 @@
115121
When printing wide DataFrames, this is the width of each line.
116122
"""
117123

118-
with cf.config_prefix('print'):
124+
with cf.config_prefix('display'):
119125
cf.register_option('precision', 7, pc_precision_doc, validator=is_int)
120126
cf.register_option('float_format', None, float_format_doc)
121127
cf.register_option('column_space', 12, validator=is_int)
122128
cf.register_option('max_rows', 100, pc_max_rows_doc, validator=is_int)
123129
cf.register_option('max_colwidth', 50, max_colwidth_doc, validator=is_int)
124130
cf.register_option('max_columns', 20, pc_max_cols_doc, validator=is_int)
131+
cf.register_option('max_info_columns', 100, pc_max_info_cols_doc,
132+
validator=is_int)
125133
cf.register_option('colheader_justify', 'right', colheader_justify_doc,
126134
validator=is_text)
127135
cf.register_option('notebook_repr_html', True, pc_nb_repr_h_doc,
