Commit 6696f28

Merge remote-tracking branch 'upstream/master' into sparse-frame-accessor
2 parents 534a379 + 1017382 commit 6696f28

37 files changed

+351
-366
lines changed

doc/source/whatsnew/v0.24.2.rst

+5-2
Original file line numberDiff line numberDiff line change
@@ -2,8 +2,8 @@
22

33
.. _whatsnew_0242:
44

5-
Whats New in 0.24.2 (February XX, 2019)
6-
---------------------------------------
5+
Whats New in 0.24.2 (March 12, 2019)
6+
------------------------------------
77

88
.. warning::
99

@@ -33,6 +33,8 @@ Fixed Regressions
3333
- Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`)
3434
- Fixed regression in :func:`to_timedelta` losing precision when converting floating data to ``Timedelta`` data (:issue:`25077`).
3535
- Fixed pip installing from source into an environment without NumPy (:issue:`25193`)
36+
- Fixed regression in :meth:`DataFrame.replace` where large strings of numbers would be coerced into ``int64``, causing an ``OverflowError`` (:issue:`25616`)
37+
- Fixed regression in :func:`factorize` when passing a custom ``na_sentinel`` value with ``sort=True`` (:issue:`25409`).
3638
- Fixed regression in :meth:`DataFrame.to_csv` writing duplicate line endings with gzip compression (:issue:`25311`)
3739

3840
.. _whatsnew_0242.bug_fixes:
@@ -89,6 +91,7 @@ A total of 25 people contributed patches to this release. People with a "+" by t
8991
* Joris Van den Bossche
9092
* Josh
9193
* Justin Zheng
94+
* Kendall Masse
9295
* Matthew Roeschke
9396
* Max Bolingbroke +
9497
* rbenes +

doc/source/whatsnew/v0.25.0.rst

+4-2
Original file line numberDiff line numberDiff line change
@@ -87,14 +87,15 @@ Other API Changes
8787
- :class:`DatetimeTZDtype` will now standardize pytz timezones to a common timezone instance (:issue:`24713`)
8888
- ``Timestamp`` and ``Timedelta`` scalars now implement the :meth:`to_numpy` method as aliases to :meth:`Timestamp.to_datetime64` and :meth:`Timedelta.to_timedelta64`, respectively. (:issue:`24653`)
8989
- :meth:`Timestamp.strptime` will now raise a ``NotImplementedError`` (:issue:`25016`)
90-
-
90+
- Bug in :meth:`DatetimeIndex.snap` which did not preserve the ``name`` of the input :class:`Index` (:issue:`25575`)
9191

9292
.. _whatsnew_0250.deprecations:
9393

9494
Deprecations
9595
~~~~~~~~~~~~
9696

9797
- Deprecated the `M (months)` and `Y (year)` `units` parameter of :func:`pandas.to_timedelta`, :func:`pandas.Timedelta` and :func:`pandas.TimedeltaIndex` (:issue:`16344`)
98+
- The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64`/:meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
9899

99100
.. _whatsnew_0250.prior_deprecations:
100101

@@ -215,10 +216,10 @@ I/O
215216
- Bug in :func:`read_json` for ``orient='table'`` when it tries to infer dtypes by default, which is not applicable as dtypes are already defined in the JSON schema (:issue:`21345`)
216217
- Bug in :func:`read_json` for ``orient='table'`` and float index, as it infers index dtype by default, which is not applicable because index dtype is already defined in the JSON schema (:issue:`25433`)
217218
- Bug in :func:`read_json` for ``orient='table'`` and string of float column names, as it makes a column name type conversion to Timestamp, which is not applicable because column names are already defined in the JSON schema (:issue:`25435`)
219+
- Bug in :func:`json_normalize` for ``errors='ignore'`` where missing values in the input data were filled in the resulting ``DataFrame`` with the string ``"nan"`` instead of ``numpy.nan`` (:issue:`25468`)
218220
- :meth:`DataFrame.to_html` now raises ``TypeError`` when using an invalid type for the ``classes`` parameter instead of ``AssertionError`` (:issue:`25608`)
219221
- Bug in :meth:`DataFrame.to_string` and :meth:`DataFrame.to_latex` that would lead to incorrect output when the ``header`` keyword is used (:issue:`16718`)
220222
-
221-
-
222223

223224

224225
Plotting
@@ -243,6 +244,7 @@ Reshaping
243244
- Bug in :func:`pandas.merge` where passing ``None`` in ``suffixes`` added the string ``"None"`` to the column name instead of leaving the column name as-is (:issue:`24782`).
244245
- Bug in :func:`merge` when merging by index name would sometimes result in an incorrectly numbered index (:issue:`24212`)
245246
- :func:`to_records` now accepts dtypes to its `column_dtypes` parameter (:issue:`24895`)
247+
- Bug in :func:`concat` where the order of an ``OrderedDict`` (and of a ``dict`` in Python 3.6+) was not respected when passed as the ``objs`` argument (:issue:`21510`)
246248

247249

248250
Sparse

environment.yml

+1
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@ dependencies:
1919
- hypothesis>=3.82
2020
- isort
2121
- moto
22+
- pycodestyle=2.4
2223
- pytest>=4.0.2
2324
- pytest-mock
2425
- sphinx

pandas/core/algorithms.py

+13-7
Original file line numberDiff line numberDiff line change
@@ -619,13 +619,19 @@ def factorize(values, sort=False, order=None, na_sentinel=-1, size_hint=None):
619619

620620
if sort and len(uniques) > 0:
621621
from pandas.core.sorting import safe_sort
622-
try:
623-
order = uniques.argsort()
624-
order2 = order.argsort()
625-
labels = take_1d(order2, labels, fill_value=na_sentinel)
626-
uniques = uniques.take(order)
627-
except TypeError:
628-
# Mixed types, where uniques.argsort fails.
622+
if na_sentinel == -1:
623+
# GH-25409 take_1d only works for na_sentinels of -1
624+
try:
625+
order = uniques.argsort()
626+
order2 = order.argsort()
627+
labels = take_1d(order2, labels, fill_value=na_sentinel)
628+
uniques = uniques.take(order)
629+
except TypeError:
630+
# Mixed types, where uniques.argsort fails.
631+
uniques, labels = safe_sort(uniques, labels,
632+
na_sentinel=na_sentinel,
633+
assume_unique=True)
634+
else:
629635
uniques, labels = safe_sort(uniques, labels,
630636
na_sentinel=na_sentinel,
631637
assume_unique=True)
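
The relabelling that the fast path performs can be traced with plain lists. The sketch below is not pandas code: `relabel_sorted` is a hypothetical helper built only on the standard library. It mirrors the `order` / `order2` double argsort from the hunk and masks missing entries explicitly, which is the step that a take-style lookup treating only -1 as missing cannot do for other sentinel values.

```python
def relabel_sorted(uniques, labels, na_sentinel=-1):
    """Sort ``uniques`` and remap ``labels`` so they point at the same values.

    Hypothetical stand-in for the sort step of ``factorize``; it is not the
    pandas implementation.
    """
    # order[i] is the old position of the value that lands at sorted position i.
    order = sorted(range(len(uniques)), key=uniques.__getitem__)
    # order2[old] is the new position of the value that used to sit at ``old``.
    order2 = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        order2[old_pos] = new_pos
    sorted_uniques = [uniques[i] for i in order]
    # Missing entries keep the sentinel; every other label is looked up in order2.
    new_labels = [na_sentinel if lab == na_sentinel else order2[lab]
                  for lab in labels]
    return sorted_uniques, new_labels


uniques = ["b", "a", "c"]
labels = [0, 1, -10, 2, 0]  # -10 marks a missing value in this example
sorted_uniques, new_labels = relabel_sorted(uniques, labels, na_sentinel=-10)
```

With a sentinel of -10, the missing entry stays -10 after sorting while the remaining labels follow their values to the new positions.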

pandas/core/categorical.py

-9
This file was deleted.

pandas/core/dtypes/cast.py

+2-2
Original file line numberDiff line numberDiff line change
@@ -794,10 +794,10 @@ def soft_convert_objects(values, datetime=True, numeric=True, timedelta=True,
794794
# Immediate return if coerce
795795
if datetime:
796796
from pandas import to_datetime
797-
return to_datetime(values, errors='coerce', box=False)
797+
return to_datetime(values, errors='coerce').to_numpy()
798798
elif timedelta:
799799
from pandas import to_timedelta
800-
return to_timedelta(values, errors='coerce', box=False)
800+
return to_timedelta(values, errors='coerce').to_numpy()
801801
elif numeric:
802802
from pandas import to_numeric
803803
return to_numeric(values, errors='coerce')

pandas/core/groupby/generic.py

+1-1
Original file line numberDiff line numberDiff line change
@@ -822,7 +822,7 @@ def _aggregate_multiple_funcs(self, arg, _level):
822822
columns.append(com.get_callable_name(f))
823823
arg = lzip(columns, arg)
824824

825-
results = {}
825+
results = collections.OrderedDict()
826826
for name, func in arg:
827827
obj = self
828828
if name in results:

pandas/core/indexes/datetimelike.py

+2-1
Original file line numberDiff line numberDiff line change
@@ -300,7 +300,8 @@ def asobject(self):
300300
return self.astype(object)
301301

302302
def _convert_tolerance(self, tolerance, target):
303-
tolerance = np.asarray(to_timedelta(tolerance, box=False))
303+
tolerance = np.asarray(to_timedelta(tolerance).to_numpy())
304+
304305
if target.size != tolerance.size and tolerance.size > 1:
305306
raise ValueError('list-like tolerance size must match '
306307
'target index size')

pandas/core/indexes/datetimes.py

+2-2
Original file line numberDiff line numberDiff line change
@@ -787,8 +787,8 @@ def snap(self, freq='S'):
787787
snapped[i] = s
788788

789789
# we know it conforms; skip check
790-
return DatetimeIndex._simple_new(snapped, freq=freq)
791-
# TODO: what about self.name? tz? if so, use shallow_copy?
790+
return DatetimeIndex._simple_new(snapped, name=self.name, tz=self.tz,
791+
freq=freq)
792792

793793
def join(self, other, how='left', level=None, return_indexers=False,
794794
sort=False):

pandas/core/internals/blocks.py

+2-2
Original file line numberDiff line numberDiff line change
@@ -1079,7 +1079,7 @@ def coerce_to_target_dtype(self, other):
10791079

10801080
try:
10811081
return self.astype(dtype)
1082-
except (ValueError, TypeError):
1082+
except (ValueError, TypeError, OverflowError):
10831083
pass
10841084

10851085
return self.astype(object)
@@ -3210,7 +3210,7 @@ def _putmask_smart(v, m, n):
32103210
nv = v.copy()
32113211
nv[m] = nn_at
32123212
return nv
3213-
except (ValueError, IndexError, TypeError):
3213+
except (ValueError, IndexError, TypeError, OverflowError):
32143214
pass
32153215

32163216
n = np.asarray(n)
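
Why `OverflowError` needs to be in the `except` tuple can be shown without pandas. The following is an analogy under stated assumptions, not the `Block` code: `coerce_or_object` is a hypothetical helper, and the standard library `array` type with typecode `'q'` (signed 64-bit) stands in for an `int64` column.

```python
from array import array


def coerce_or_object(values):
    """Try a compact signed 64-bit container first; fall back to plain objects.

    Hypothetical helper that mirrors the try/except shape in the diff.
    """
    try:
        return array("q", [int(v) for v in values])
    except (ValueError, TypeError, OverflowError):
        # A digit string too long for 64 bits parses as a Python int but
        # raises OverflowError when stored, so it must be caught here too.
        return list(values)


small = coerce_or_object(["1", "2"])
large = coerce_or_object(["1", "9" * 30])
```

Without `OverflowError` in the tuple, the second call would propagate the exception instead of keeping the values as objects.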

pandas/core/reshape/concat.py

+1-1
Original file line numberDiff line numberDiff line change
@@ -253,7 +253,7 @@ def __init__(self, objs, axis=0, join='outer', join_axes=None,
253253

254254
if isinstance(objs, dict):
255255
if keys is None:
256-
keys = sorted(objs)
256+
keys = com.dict_keys_to_ordered_list(objs)
257257
objs = [objs[k] for k in keys]
258258
else:
259259
objs = list(objs)
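
The body of `com.dict_keys_to_ordered_list` is not shown in this diff. A minimal sketch of the behaviour the whatsnew entry describes, assuming it keeps insertion order for `OrderedDict` and for any `dict` on Python 3.6+ and sorts otherwise, might look like the following. The name `ordered_keys` and the string placeholders are illustrative only.

```python
import sys
from collections import OrderedDict


def ordered_keys(mapping):
    """Return keys in insertion order when that order is meaningful, else sorted.

    Hypothetical illustration; the real helper lives in pandas.core.common.
    """
    if isinstance(mapping, OrderedDict) or sys.version_info >= (3, 6):
        return list(mapping)
    return sorted(mapping)


frames = {"b": "frame_b", "a": "frame_a"}  # strings stand in for DataFrames
keys = ordered_keys(frames)
objs = [frames[k] for k in keys]
```

On Python 3.6 or later, `keys` follows the order in which the entries were written rather than alphabetical order, which is the change `sorted(objs)` prevented.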

pandas/core/tools/datetimes.py

+8
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@
99
DateParseError, _format_is_iso, _guess_datetime_format, parse_time_string)
1010
from pandas._libs.tslibs.strptime import array_strptime
1111
from pandas.compat import zip
12+
from pandas.util._decorators import deprecate_kwarg
1213

1314
from pandas.core.dtypes.common import (
1415
ensure_object, is_datetime64_dtype, is_datetime64_ns_dtype,
@@ -398,6 +399,7 @@ def _adjust_to_origin(arg, origin, unit):
398399
return arg
399400

400401

402+
@deprecate_kwarg(old_arg_name='box', new_arg_name=None)
401403
def to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False,
402404
utc=None, box=True, format=None, exact=True,
403405
unit=None, infer_datetime_format=False, origin='unix',
@@ -444,6 +446,12 @@ def to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False,
444446
445447
- If True returns a DatetimeIndex or Index-like object
446448
- If False returns ndarray of values.
449+
450+
.. deprecated:: 0.25.0
451+
Use :meth:`.to_numpy` or :meth:`Timestamp.to_datetime64`
452+
instead to get an ndarray of values or numpy.datetime64,
453+
respectively.
454+
447455
format : string, default None
448456
strftime to parse time, eg "%d/%m/%Y", note that "%f" will parse
449457
all the way up to nanoseconds.
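
The effect of decorating a function with `deprecate_kwarg(old_arg_name='box', new_arg_name=None)` can be approximated with a small standalone decorator. This is a simplified sketch, not the implementation in `pandas.util._decorators`: `deprecate_kwarg_sketch` and `convert` are hypothetical names, and the warning text is an assumption.

```python
import functools
import warnings


def deprecate_kwarg_sketch(old_arg_name):
    """Emit a FutureWarning whenever ``old_arg_name`` is passed by keyword."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_arg_name in kwargs:
                warnings.warn(
                    "the {!r} keyword is deprecated and will be removed "
                    "in a future version".format(old_arg_name),
                    FutureWarning, stacklevel=2)
            # The keyword is still forwarded, so behaviour is unchanged for now.
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecate_kwarg_sketch("box")
def convert(arg, box=True):
    # Placeholder body: a real converter would parse ``arg``.
    return ("boxed", arg) if box else ("raw", arg)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = convert("2019-03-12", box=False)
```

Callers that never pass `box` see no warning, which is why the internal call sites in this commit drop the keyword and call `.to_numpy()` on the result instead.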

pandas/core/tools/timedeltas.py

+8
Original file line numberDiff line numberDiff line change
@@ -8,13 +8,15 @@
88

99
from pandas._libs.tslibs import NaT
1010
from pandas._libs.tslibs.timedeltas import Timedelta, parse_timedelta_unit
11+
from pandas.util._decorators import deprecate_kwarg
1112

1213
from pandas.core.dtypes.common import is_list_like
1314
from pandas.core.dtypes.generic import ABCIndexClass, ABCSeries
1415

1516
from pandas.core.arrays.timedeltas import sequence_to_td64ns
1617

1718

19+
@deprecate_kwarg(old_arg_name='box', new_arg_name=None)
1820
def to_timedelta(arg, unit='ns', box=True, errors='raise'):
1921
"""
2022
Convert argument to timedelta.
@@ -40,6 +42,12 @@ def to_timedelta(arg, unit='ns', box=True, errors='raise'):
4042
- If True returns a Timedelta/TimedeltaIndex of the results.
4143
- If False returns a numpy.timedelta64 or numpy.darray of
4244
values of dtype timedelta64[ns].
45+
46+
.. deprecated:: 0.25.0
47+
Use :meth:`.to_numpy` or :meth:`Timedelta.to_timedelta64`
48+
instead to get an ndarray of values or numpy.timedelta64,
49+
respectively.
50+
4351
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
4452
- If 'raise', then invalid parsing will raise an exception.
4553
- If 'coerce', then invalid parsing will be set as NaT.

pandas/io/json/normalize.py

+2-1
Original file line numberDiff line numberDiff line change
@@ -281,6 +281,7 @@ def _recursive_extract(data, path, seen_meta, level=0):
281281
raise ValueError('Conflicting metadata name {name}, '
282282
'need distinguishing prefix '.format(name=k))
283283

284-
result[k] = np.array(v).repeat(lengths)
284+
# forcing dtype to object to avoid the metadata being casted to string
285+
result[k] = np.array(v, dtype=object).repeat(lengths)
285286

286287
return result

pandas/io/parsers.py

+2-2
Original file line numberDiff line numberDiff line change
@@ -3164,11 +3164,11 @@ def converter(*date_cols):
31643164
return tools.to_datetime(
31653165
ensure_object(strs),
31663166
utc=None,
3167-
box=False,
31683167
dayfirst=dayfirst,
31693168
errors='ignore',
31703169
infer_datetime_format=infer_datetime_format
3171-
)
3170+
).to_numpy()
3171+
31723172
except ValueError:
31733173
return tools.to_datetime(
31743174
parsing.try_parse_dates(strs, dayfirst=dayfirst))

pandas/tests/api/test_api.py

+2-19
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,4 @@
11
# -*- coding: utf-8 -*-
2-
import sys
3-
42
import pandas as pd
53
from pandas import api
64
from pandas.util import testing as tm
@@ -53,10 +51,10 @@ class TestPDApi(Base):
5351
]
5452

5553
# these are already deprecated; awaiting removal
56-
deprecated_classes = ['TimeGrouper']
54+
deprecated_classes = ['TimeGrouper', 'Panel']
5755

5856
# these should be deprecated in the future
59-
deprecated_classes_in_future = ['Panel']
57+
deprecated_classes_in_future = []
6058

6159
# external modules exposed in pandas namespace
6260
modules = ['np', 'datetime']
@@ -148,18 +146,3 @@ def test_deprecation_cdaterange(self):
148146
with tm.assert_produces_warning(FutureWarning,
149147
check_stacklevel=False):
150148
cdate_range('2017-01-01', '2017-12-31')
151-
152-
153-
class TestCategoricalMove(object):
154-
155-
def test_categorical_move(self):
156-
# May have been cached by another import, e.g. pickle tests.
157-
sys.modules.pop("pandas.core.categorical", None)
158-
159-
with tm.assert_produces_warning(FutureWarning):
160-
from pandas.core.categorical import Categorical # noqa
161-
162-
sys.modules.pop("pandas.core.categorical", None)
163-
164-
with tm.assert_produces_warning(FutureWarning):
165-
from pandas.core.categorical import CategoricalDtype # noqa

pandas/tests/computation/test_eval.py

+1-9
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
from pandas.core.dtypes.common import is_bool, is_list_like, is_scalar
1515

1616
import pandas as pd
17-
from pandas import DataFrame, Panel, Series, date_range
17+
from pandas import DataFrame, Series, date_range
1818
from pandas.core.computation import pytables
1919
from pandas.core.computation.check import _NUMEXPR_VERSION
2020
from pandas.core.computation.engines import NumExprClobberingError, _engines
@@ -1112,14 +1112,6 @@ def test_bool_ops_with_constants(self):
11121112
exp = eval(ex)
11131113
assert res == exp
11141114

1115-
@pytest.mark.filterwarnings("ignore::FutureWarning")
1116-
def test_panel_fails(self):
1117-
x = Panel(randn(3, 4, 5))
1118-
y = Series(randn(10))
1119-
with pytest.raises(NotImplementedError):
1120-
self.eval('x + y',
1121-
local_dict={'x': x, 'y': y})
1122-
11231115
def test_4d_ndarray_fails(self):
11241116
x = randn(3, 4, 5, 6)
11251117
y = Series(randn(10))

pandas/tests/dtypes/test_inference.py

+2-7
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,6 @@
1111
from fractions import Fraction
1212
from numbers import Number
1313
import re
14-
from warnings import catch_warnings, simplefilter
1514

1615
import numpy as np
1716
import pytest
@@ -30,8 +29,8 @@
3029

3130
import pandas as pd
3231
from pandas import (
33-
Categorical, DataFrame, DateOffset, DatetimeIndex, Index, Interval, Panel,
34-
Period, Series, Timedelta, TimedeltaIndex, Timestamp, compat, isna)
32+
Categorical, DataFrame, DateOffset, DatetimeIndex, Index, Interval, Period,
33+
Series, Timedelta, TimedeltaIndex, Timestamp, compat, isna)
3534
from pandas.util import testing as tm
3635

3736

@@ -1305,10 +1304,6 @@ def test_is_scalar_pandas_containers(self):
13051304
assert not is_scalar(Series([1]))
13061305
assert not is_scalar(DataFrame())
13071306
assert not is_scalar(DataFrame([[1]]))
1308-
with catch_warnings(record=True):
1309-
simplefilter("ignore", FutureWarning)
1310-
assert not is_scalar(Panel())
1311-
assert not is_scalar(Panel([[[1]]]))
13121307
assert not is_scalar(Index([]))
13131308
assert not is_scalar(Index([1]))
13141309

pandas/tests/frame/test_dtypes.py

+6-2
Original file line numberDiff line numberDiff line change
@@ -808,11 +808,15 @@ def test_astype_to_incorrect_datetimelike(self, unit):
808808
other = "m8[{}]".format(unit)
809809

810810
df = DataFrame(np.array([[1, 2, 3]], dtype=dtype))
811-
with pytest.raises(TypeError):
811+
msg = (r"cannot astype a datetimelike from \[datetime64\[ns\]\] to"
812+
r" \[timedelta64\[{}\]\]").format(unit)
813+
with pytest.raises(TypeError, match=msg):
812814
df.astype(other)
813815

816+
msg = (r"cannot astype a timedelta from \[timedelta64\[ns\]\] to"
817+
r" \[datetime64\[{}\]\]").format(unit)
814818
df = DataFrame(np.array([[1, 2, 3]], dtype=other))
815-
with pytest.raises(TypeError):
819+
with pytest.raises(TypeError, match=msg):
816820
df.astype(dtype)
817821

818822
def test_timedeltas(self):
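
`pytest.raises(..., match=msg)` checks the pattern against the exception text with `re.search`, which is why the square brackets in the expected messages are escaped. The standalone check below uses only the standard library. The `raised` string is an assumed example of the error text, reconstructed from the pattern in the test rather than captured from pandas.

```python
import re

unit = "s"
# Same pattern shape as the test: each literal square bracket is escaped.
msg = (r"cannot astype a datetimelike from \[datetime64\[ns\]\] to"
       r" \[timedelta64\[{}\]\]").format(unit)

# Assumed message text, written out with literal brackets.
raised = ("cannot astype a datetimelike from [datetime64[ns]] to "
          "[timedelta64[s]]")

matched = re.search(msg, raised) is not None
# re.escape builds an equivalent pattern without escaping by hand.
matched_escaped = re.search(re.escape(raised), raised) is not None
```

Matching on the message makes the test fail if a different `TypeError` is raised for an unrelated reason.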
