Commit 80c7a53

Merge branch 'master' of github.com:pydata/pandas into jcrist-period_dtime64
2 parents ce89702 + 8fe1cf6 commit 80c7a53

38 files changed: +516 -161 lines changed

ci/script.sh

+2
@@ -16,6 +16,8 @@ fi
 "$TRAVIS_BUILD_DIR"/ci/build_docs.sh 2>&1 > /tmp/doc.log &
 # doc build log will be shown after tests
 
+pip install -U blosc # See https://github.com/pydata/pandas/pull/9783
+python -c 'import blosc; blosc.print_versions()'
 
 echo nosetests --exe -A "$NOSE_ARGS" pandas --with-xunit --xunit-file=/tmp/nosetests.xml
 nosetests --exe -A "$NOSE_ARGS" pandas --with-xunit --xunit-file=/tmp/nosetests.xml

doc/source/indexing.rst

+8
@@ -249,6 +249,14 @@ new column.
 If you are using the IPython environment, you may also use tab-completion to
 see these accessible attributes.
 
+You can also assign a ``dict`` to a row of a ``DataFrame``:
+
+.. ipython:: python
+
+   x = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})
+   x.iloc[1] = dict(x=9, y=99)
+   x
+
 Slicing ranges
 --------------
 

doc/source/whatsnew/v0.16.1.txt

+41-14
@@ -51,6 +51,8 @@ Enhancements
 - ``get_dummies`` function now accepts ``sparse`` keyword. If set to ``True``, the return ``DataFrame`` is sparse, e.g. ``SparseDataFrame``. (:issue:`8823`)
 - ``Period`` now accepts ``datetime64`` as value input. (:issue:`9054`)
 
+- Allow timedelta string conversion when leading zero is missing from time definition, ie `0:00:00` vs `00:00:00`. (:issue:`9570`)
+
 .. _whatsnew_0161.api:
 
 API changes
@@ -94,34 +96,59 @@ Bug Fixes
 - Fixed bug (:issue:`9542`) where labels did not appear properly in legend of ``DataFrame.plot()``. Passing ``label=`` args also now works, and series indices are no longer mutated.
 - Bug in json serialization when frame has length zero.(:issue:`9805`)
 - Bug in `read_csv` where missing trailing delimiters would cause segfault. (:issue:`5664`)
-
-
+- Bug in retaining index name on appending (:issue:`9862`)
 - Bug in ``scatter_matrix`` draws unexpected axis ticklabels (:issue:`5662`)
-
 - Fixed bug in ``StataWriter`` resulting in changes to input ``DataFrame`` upon save (:issue:`9795`).
-
-
 - Bug in ``transform`` causing length mismatch when null entries were present and a fast aggregator was being used (:issue:`9697`)
-
 - Bug in ``equals`` causing false negatives when block order differed (:issue:`9330`)
-
+- Bug in ``read_sql_table`` error when reading postgres table with timezone (:issue:`7139`)
 - Bug in ``DataFrame`` slicing may not retain metadata (:issue:`9776`)
 - Bug where ``TimdeltaIndex`` were not properly serialized in fixed ``HDFStore`` (:issue:`9635`)
-
 - Bug in plotting continuously using ``secondary_y`` may not show legend properly. (:issue:`9610`, :issue:`9779`)
-
 - Bug in ``DataFrame.plot(kind="hist")`` results in ``TypeError`` when ``DataFrame`` contains non-numeric columns (:issue:`9853`)
 - Bug where repeated plotting of ``DataFrame`` with a ``DatetimeIndex`` may raise ``TypeError`` (:issue:`9852`)
-
 - Bug in ``Series.quantile`` on empty Series of type ``Datetime`` or ``Timedelta`` (:issue:`9675`)
 - Bug in ``where`` causing incorrect results when upcasting was required (:issue:`9731`)
 - Bug in ``FloatArrayFormatter`` where decision boundary for displaying "small" floats in decimal format is off by one order of magnitude for a given display.precision (:issue:`9764`)
-
 - Fixed bug where ``DataFrame.plot()`` raised an error when both ``color`` and ``style`` keywords were passed and there was no color symbol in the style strings (:issue:`9671`)
 - Bug in ``read_csv`` and ``read_table`` when using ``skip_rows`` parameter if blank lines are present. (:issue:`9832`)
-
 - Bug in ``read_csv()`` interprets ``index_col=True`` as ``1`` (:issue:`9798`)
-
+- Bug in index equality comparisons using ``==`` failing on Index/MultiIndex type incompatibility (:issue:`9875`)
 - Bug in which ``SparseDataFrame`` could not take `nan` as a column name (:issue:`8822`)
-
+- Bug in ``Series.quantile`` on empty Series of type ``Datetime`` or ``Timedelta`` (:issue:`9675`)
+- Bug in ``to_msgpack`` and ``read_msgpack`` zlib and blosc compression support (:issue:`9783`)
 - Bug in unequal comparisons between a ``Series`` of dtype `"category"` and a scalar (e.g. ``Series(Categorical(list("abc"), categories=list("cba"), ordered=True)) > "b"``, which wouldn't use the order of the categories but use the lexicographical order. (:issue:`9848`)
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+- Bug in unequal comparisons between categorical data and a scalar, which was not in the categories (e.g. ``Series(Categorical(list("abc"), ordered=True)) > "d"``. This returned ``False`` for all elements, but now raises a ``TypeError``. Equality comparisons also now return ``False`` for ``==`` and ``True`` for ``!=``. (:issue:`9848`)
+
+- Bug in DataFrame ``__setitem__`` when right hand side is a dictionary (:issue:`9874`)
+
+- Bug in ``MultiIndex.sortlevel()`` results in unicode level name breaks (:issue:`9875`)
+- Bug in which ``groupby.transform`` incorrectly enforced output dtypes to match input dtypes. (:issue:`9807`)
+
+
+
+
+
+
+
+
+
+
+
+- Bug where dividing a dataframe containing values of type ``Decimal`` by another ``Decimal`` would raise. (:issue:`9787`)
+- Bug where using DataFrames asfreq would remove the name of the index. (:issue:`9885`)

pandas/core/base.py

+3-3
@@ -11,7 +11,7 @@
 import pandas.lib as lib
 from pandas.util.decorators import Appender, cache_readonly
 from pandas.core.strings import StringMethods
-
+from pandas.core.common import AbstractMethodError
 
 _shared_docs = dict()
 _indexops_doc_kwargs = dict(klass='IndexOpsMixin', inplace='',
@@ -32,7 +32,7 @@ class StringMixin(object):
     # Formatting
 
     def __unicode__(self):
-        raise NotImplementedError
+        raise AbstractMethodError(self)
 
     def __str__(self):
         """
@@ -566,4 +566,4 @@ def duplicated(self, take_last=False):
     # abstracts
 
     def _update_inplace(self, result, **kwargs):
-        raise NotImplementedError
+        raise AbstractMethodError(self)

pandas/core/categorical.py

+10-2
@@ -61,7 +61,14 @@ def f(self, other):
                 i = self.categories.get_loc(other)
                 return getattr(self._codes, op)(i)
             else:
-                return np.repeat(False, len(self))
+                if op == '__eq__':
+                    return np.repeat(False, len(self))
+                elif op == '__ne__':
+                    return np.repeat(True, len(self))
+                else:
+                    msg = "Cannot compare a Categorical for op {op} with a scalar, " \
+                          "which is not a category."
+                    raise TypeError(msg.format(op=op))
         else:
 
             # allow categorical vs object dtype array comparisons for equality
@@ -1159,7 +1166,8 @@ def fillna(self, fill_value=None, method=None, limit=None):
         if fill_value is None:
             fill_value = np.nan
         if limit is not None:
-            raise NotImplementedError
+            raise NotImplementedError("specifying a limit for fillna has not "
+                                      "been implemented yet")
 
         values = self._codes
 
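
Note on the first hunk of pandas/core/categorical.py: when the scalar is not one of the categories, ``==`` now yields all ``False``, ``!=`` yields all ``True``, and ordering comparisons raise ``TypeError``. The following is a minimal pure-Python sketch of that branching, not the pandas implementation: it assumes plain lists in place of the NumPy code array, and ``compare_with_scalar`` and its parameters are hypothetical names used only for illustration.

```python
import operator

# Map the dunder names used in the diff to stdlib comparison functions.
_OPS = {
    '__eq__': operator.eq, '__ne__': operator.ne,
    '__lt__': operator.lt, '__le__': operator.le,
    '__gt__': operator.gt, '__ge__': operator.ge,
}


def compare_with_scalar(categories, codes, other, op):
    """Hypothetical stand-in for the scalar branch shown in the diff.

    categories: ordered list of category labels
    codes: list of integer positions into ``categories``
    other: the scalar being compared against
    op: one of the dunder names in ``_OPS``
    """
    if other in categories:
        # Known category: compare integer codes, so category order decides.
        i = categories.index(other)
        return [_OPS[op](code, i) for code in codes]
    if op == '__eq__':
        # Nothing can equal a value that is not a category.
        return [False] * len(codes)
    elif op == '__ne__':
        return [True] * len(codes)
    msg = ("Cannot compare a Categorical for op {op} with a scalar, "
           "which is not a category.")
    raise TypeError(msg.format(op=op))


cats = list("abc")
codes = [0, 1, 2]                      # the values "a", "b", "c"
greater_than_b = compare_with_scalar(cats, codes, "b", '__gt__')
equal_to_d = compare_with_scalar(cats, codes, "d", '__eq__')
```

With these inputs ``greater_than_b`` is ``[False, False, True]`` and ``equal_to_d`` is ``[False, False, False]``; calling the function with ``"d"`` and ``'__gt__'`` raises ``TypeError``.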

pandas/core/common.py

+27-4
@@ -39,6 +39,17 @@ class AmbiguousIndexError(PandasError, KeyError):
     pass
 
 
+class AbstractMethodError(NotImplementedError):
+    """Raise this error instead of NotImplementedError for abstract methods
+    while keeping compatibility with Python 2 and Python 3.
+    """
+    def __init__(self, class_instance):
+        self.class_instance = class_instance
+
+    def __str__(self):
+        return "This method must be defined on the concrete class of " \
+                + self.class_instance.__class__.__name__
+
 _POSSIBLY_CAST_DTYPES = set([np.dtype(t).name
                              for t in ['O', 'int8',
                                        'uint8', 'int16', 'uint16', 'int32',
@@ -1397,14 +1408,19 @@ def _fill_zeros(result, x, y, name, fill):
 
     mask the nan's from x
     """
-
     if fill is None or is_float_dtype(result):
         return result
 
     if name.startswith(('r', '__r')):
         x,y = y,x
 
-    if np.isscalar(y):
+    is_typed_variable = (hasattr(y, 'dtype') or hasattr(y,'type'))
+    is_scalar = lib.isscalar(y)
+
+    if not is_typed_variable and not is_scalar:
+        return result
+
+    if is_scalar:
         y = np.array(y)
 
     if is_integer_dtype(y):
@@ -2439,7 +2455,10 @@ def _get_dtype_type(arr_or_dtype):
         return np.dtype(arr_or_dtype).type
     elif isinstance(arr_or_dtype, CategoricalDtype):
         return CategoricalDtypeType
-    return arr_or_dtype.dtype.type
+    try:
+        return arr_or_dtype.dtype.type
+    except AttributeError:
+        raise ValueError('%r is not a dtype' % arr_or_dtype)
 
 
 def is_any_int_dtype(arr_or_dtype):
@@ -2510,7 +2529,11 @@ def is_floating_dtype(arr_or_dtype):
 
 
 def is_bool_dtype(arr_or_dtype):
-    tipo = _get_dtype_type(arr_or_dtype)
+    try:
+        tipo = _get_dtype_type(arr_or_dtype)
+    except ValueError:
+        # this isn't even a dtype
+        return False
     return issubclass(tipo, np.bool_)
 
 def is_categorical(array):
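
Note on ``AbstractMethodError``: because it subclasses ``NotImplementedError``, existing ``except NotImplementedError`` handlers should keep working, while the message now names the concrete class that is missing the method. The standalone sketch below copies the class from the hunk above so it runs without pandas; ``Base`` and ``Concrete`` are hypothetical classes added only to show the resulting message.

```python
class AbstractMethodError(NotImplementedError):
    """Copied from the diff above so the example is self-contained."""
    def __init__(self, class_instance):
        self.class_instance = class_instance

    def __str__(self):
        return ("This method must be defined on the concrete class of "
                + self.class_instance.__class__.__name__)


class Base(object):
    # Hypothetical abstract base, mirroring how the diff uses the error.
    def _update_inplace(self, result):
        raise AbstractMethodError(self)


class Concrete(Base):
    # Hypothetical subclass that forgot to override _update_inplace.
    pass


try:
    Concrete()._update_inplace(None)
except NotImplementedError as err:
    # Caught via the parent class; the message names the concrete class.
    caught = err
    message = str(err)
```

Here ``message`` is ``"This method must be defined on the concrete class of Concrete"``, which is more informative than a bare ``NotImplementedError``.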

pandas/core/frame.py

+5-5
@@ -4414,12 +4414,12 @@ def mode(self, axis=0, numeric_only=False):
         """
         Gets the mode(s) of each element along the axis selected. Empty if nothing
         has 2+ occurrences. Adds a row for each mode per label, fills in gaps
-        with nan.
-
+        with nan.
+
         Note that there could be multiple values returned for the selected
-        axis (when more than one item share the maximum frequency), which is the
-        reason why a dataframe is returned. If you want to impute missing values
-        with the mode in a dataframe ``df``, you can just do this:
+        axis (when more than one item share the maximum frequency), which is the
+        reason why a dataframe is returned. If you want to impute missing values
+        with the mode in a dataframe ``df``, you can just do this:
         ``df.fillna(df.mode().iloc[0])``
 
         Parameters

pandas/core/generic.py

+9-6
@@ -21,7 +21,8 @@
 from pandas.core.common import (isnull, notnull, is_list_like,
                                 _values_from_object, _maybe_promote,
                                 _maybe_box_datetimelike, ABCSeries,
-                                SettingWithCopyError, SettingWithCopyWarning)
+                                SettingWithCopyError, SettingWithCopyWarning,
+                                AbstractMethodError)
 import pandas.core.nanops as nanops
 from pandas.util.decorators import Appender, Substitution, deprecate_kwarg
 from pandas.core import config
@@ -137,7 +138,7 @@ def _init_mgr(self, mgr, axes=None, dtype=None, copy=False):
 
     @property
     def _constructor(self):
-        raise NotImplementedError
+        raise AbstractMethodError(self)
 
     def __unicode__(self):
         # unicode representation based upon iterating over self
@@ -152,7 +153,7 @@ def _local_dir(self):
 
     @property
     def _constructor_sliced(self):
-        raise NotImplementedError
+        raise AbstractMethodError(self)
 
     #----------------------------------------------------------------------
     # Axis
@@ -1100,7 +1101,7 @@ def _iget_item_cache(self, item):
         return lower
 
     def _box_item_values(self, key, values):
-        raise NotImplementedError
+        raise AbstractMethodError(self)
 
     def _maybe_cache_changed(self, item, value):
         """
@@ -3057,7 +3058,8 @@ def first(self, offset):
         """
         from pandas.tseries.frequencies import to_offset
         if not isinstance(self.index, DatetimeIndex):
-            raise NotImplementedError
+            raise NotImplementedError("'first' only supports a DatetimeIndex "
+                                      "index")
 
         if len(self.index) == 0:
             return self
@@ -3091,7 +3093,8 @@ def last(self, offset):
         """
         from pandas.tseries.frequencies import to_offset
         if not isinstance(self.index, DatetimeIndex):
-            raise NotImplementedError
+            raise NotImplementedError("'last' only supports a DatetimeIndex "
+                                      "index")
 
         if len(self.index) == 0:
             return self
