
Commit 41108bd

Merge branch 'main' into dep_excel
2 parents 98d2b1e + fe93a83


45 files changed (+96, -1164 lines)

doc/source/reference/arrays.rst (-1)

@@ -630,7 +630,6 @@ Data type introspection
    api.types.is_datetime64_dtype
    api.types.is_datetime64_ns_dtype
    api.types.is_datetime64tz_dtype
-   api.types.is_extension_type
    api.types.is_extension_array_dtype
    api.types.is_float_dtype
    api.types.is_int64_dtype

doc/source/user_guide/io.rst (-70)

@@ -2111,8 +2111,6 @@ is ``None``. To explicitly force ``Series`` parsing, pass ``typ=series``
 * ``convert_axes`` : boolean, try to convert the axes to the proper dtypes, default is ``True``
 * ``convert_dates`` : a list of columns to parse for dates; If ``True``, then try to parse date-like columns, default is ``True``.
 * ``keep_default_dates`` : boolean, default ``True``. If parsing dates, then parse the default date-like columns.
-* ``numpy`` : direct decoding to NumPy arrays. default is ``False``;
-  Supports numeric data only, although labels may be non-numeric. Also note that the JSON ordering **MUST** be the same for each term if ``numpy=True``.
 * ``precise_float`` : boolean, default ``False``. Set to enable usage of higher precision (strtod) function when decoding string to double values. Default (``False``) is to use fast but less precise builtin functionality.
 * ``date_unit`` : string, the timestamp unit to detect if converting dates. Default
   None. By default the timestamp precision will be detected, if this is not desired
@@ -2216,74 +2214,6 @@ Dates written in nanoseconds need to be read back in nanoseconds:
    dfju = pd.read_json(json, date_unit="ns")
    dfju

-The Numpy parameter
-+++++++++++++++++++
-
-.. note::
-   This param has been deprecated as of version 1.0.0 and will raise a ``FutureWarning``.
-
-   This supports numeric data only. Index and columns labels may be non-numeric, e.g. strings, dates etc.
-
-If ``numpy=True`` is passed to ``read_json`` an attempt will be made to sniff
-an appropriate dtype during deserialization and to subsequently decode directly
-to NumPy arrays, bypassing the need for intermediate Python objects.
-
-This can provide speedups if you are deserialising a large amount of numeric
-data:
-
-.. ipython:: python
-
-   randfloats = np.random.uniform(-100, 1000, 10000)
-   randfloats.shape = (1000, 10)
-   dffloats = pd.DataFrame(randfloats, columns=list("ABCDEFGHIJ"))
-
-   jsonfloats = dffloats.to_json()
-
-.. ipython:: python
-
-   %timeit pd.read_json(jsonfloats)
-
-.. ipython:: python
-   :okwarning:
-
-   %timeit pd.read_json(jsonfloats, numpy=True)
-
-The speedup is less noticeable for smaller datasets:
-
-.. ipython:: python
-
-   jsonfloats = dffloats.head(100).to_json()
-
-.. ipython:: python
-
-   %timeit pd.read_json(jsonfloats)
-
-.. ipython:: python
-   :okwarning:
-
-   %timeit pd.read_json(jsonfloats, numpy=True)
-
-.. warning::
-
-   Direct NumPy decoding makes a number of assumptions and may fail or produce
-   unexpected output if these assumptions are not satisfied:
-
-    - data is numeric.
-
-    - data is uniform. The dtype is sniffed from the first value decoded.
-      A ``ValueError`` may be raised, or incorrect output may be produced
-      if this condition is not satisfied.
-
-    - labels are ordered. Labels are only read from the first container, it is assumed
-      that each subsequent row / column has been encoded in the same order. This should be satisfied if the
-      data was encoded using ``to_json`` but may not be the case if the JSON
-      is from another source.
-
-.. ipython:: python
-   :suppress:
-
-   os.remove("test.json")
-
 .. _io.json_normalize:

 Normalization
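With the ``numpy`` keyword gone, ``read_json`` always decodes through Python objects. A minimal round-trip sketch (the data and names are illustrative, not taken from the pandas docs):

```python
from io import StringIO

import numpy as np
import pandas as pd

# Build a small numeric frame, serialize it, and read it back without the
# removed numpy=True fast path.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(-100, 1000, (1000, 10)), columns=list("ABCDEFGHIJ"))
payload = df.to_json()

# read_json now has a single decoding path through Python objects.
roundtrip = pd.read_json(StringIO(payload))
```

The round trip preserves shape and labels; float values match up to the default ``double_precision=10`` used by ``to_json``.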

doc/source/whatsnew/v2.0.0.rst (+9)

@@ -144,10 +144,19 @@ Deprecations

 Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Remove argument ``squeeze`` from :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`32380`)
+- Removed ``keep_tz`` argument in :meth:`DatetimeIndex.to_series` (:issue:`29731`)
+- Remove arguments ``names`` and ``dtype`` from :meth:`Index.copy` and ``levels`` and ``codes`` from :meth:`MultiIndex.copy` (:issue:`35853`, :issue:`36685`)
+- Removed argument ``try_cast`` from :meth:`DataFrame.mask`, :meth:`DataFrame.where`, :meth:`Series.mask` and :meth:`Series.where` (:issue:`38836`)
 - Disallow passing non-round floats to :class:`Timestamp` with ``unit="M"`` or ``unit="Y"`` (:issue:`47266`)
 - Remove keywords ``convert_float`` and ``mangle_dupe_cols`` from :func:`read_excel` (:issue:`41176`)
 - Disallow passing non-keyword arguments to :func:`read_excel` except ``io`` and ``sheet_name`` (:issue:`34418`)
+- Removed the ``numeric_only`` keyword from :meth:`Categorical.min` and :meth:`Categorical.max` in favor of ``skipna`` (:issue:`48821`)
+- Removed :func:`is_extension_type` in favor of :func:`is_extension_array_dtype` (:issue:`29457`)
 - Remove :meth:`DataFrameGroupBy.pad` and :meth:`DataFrameGroupBy.backfill` (:issue:`45076`)
+- Remove ``numpy`` argument from :func:`read_json` (:issue:`30636`)
+- Removed the ``center`` keyword in :meth:`DataFrame.expanding` (:issue:`20647`)
+- Enforced :meth:`Rolling.count` with ``min_periods=None`` to default to the size of the window (:issue:`31302`)

 .. ---------------------------------------------------------------------------
 .. _whatsnew_200.performance:

pandas/_libs/src/ujson/python/JSONtoObj.c (+3, -84)

@@ -83,12 +83,6 @@ JSOBJ Object_npyNewArrayList(void *prv, void *decoder);
 JSOBJ Object_npyEndArrayList(void *prv, JSOBJ obj);
 int Object_npyArrayListAddItem(void *prv, JSOBJ obj, JSOBJ value);

-// labelled support, encode keys and values of JS object into separate numpy
-// arrays
-JSOBJ Object_npyNewObject(void *prv, void *decoder);
-JSOBJ Object_npyEndObject(void *prv, JSOBJ obj);
-int Object_npyObjectAddKey(void *prv, JSOBJ obj, JSOBJ name, JSOBJ value);
-
 // free the numpy context buffer
 void Npy_releaseContext(NpyArrContext *npyarr) {
     PRINTMARK();
@@ -374,68 +368,6 @@ int Object_npyArrayListAddItem(void *prv, JSOBJ obj, JSOBJ value) {
     return 1;
 }

-JSOBJ Object_npyNewObject(void *prv, void *_decoder) {
-    PyObjectDecoder *decoder = (PyObjectDecoder *)_decoder;
-    PRINTMARK();
-    if (decoder->curdim > 1) {
-        PyErr_SetString(PyExc_ValueError,
-                        "labels only supported up to 2 dimensions");
-        return NULL;
-    }
-
-    return ((JSONObjectDecoder *)decoder)->newArray(prv, decoder);
-}
-
-JSOBJ Object_npyEndObject(void *prv, JSOBJ obj) {
-    PyObject *list;
-    npy_intp labelidx;
-    NpyArrContext *npyarr = (NpyArrContext *)obj;
-    PRINTMARK();
-    if (!npyarr) {
-        return NULL;
-    }
-
-    labelidx = npyarr->dec->curdim - 1;
-
-    list = npyarr->labels[labelidx];
-    if (list) {
-        npyarr->labels[labelidx] = PyArray_FROM_O(list);
-        Py_DECREF(list);
-    }
-
-    return (PyObject *)((JSONObjectDecoder *)npyarr->dec)->endArray(prv, obj);
-}
-
-int Object_npyObjectAddKey(void *prv, JSOBJ obj, JSOBJ name, JSOBJ value) {
-    PyObject *label, *labels;
-    npy_intp labelidx;
-    // add key to label array, value to values array
-    NpyArrContext *npyarr = (NpyArrContext *)obj;
-    PRINTMARK();
-    if (!npyarr) {
-        return 0;
-    }
-
-    label = (PyObject *)name;
-    labelidx = npyarr->dec->curdim - 1;
-
-    if (!npyarr->labels[labelidx]) {
-        npyarr->labels[labelidx] = PyList_New(0);
-    }
-    labels = npyarr->labels[labelidx];
-    // only fill label array once, assumes all column labels are the same
-    // for 2-dimensional arrays.
-    if (PyList_Check(labels) && PyList_GET_SIZE(labels) <= npyarr->elcount) {
-        PyList_Append(labels, label);
-    }
-
-    if (((JSONObjectDecoder *)npyarr->dec)->arrayAddItem(prv, obj, value)) {
-        Py_DECREF(label);
-        return 1;
-    }
-    return 0;
-}
-
 int Object_objectAddKey(void *prv, JSOBJ obj, JSOBJ name, JSOBJ value) {
     int ret = PyDict_SetItem(obj, name, value);
     Py_DECREF((PyObject *)name);
@@ -494,7 +426,7 @@ static void Object_releaseObject(void *prv, JSOBJ obj, void *_decoder) {
     }
 }

-static char *g_kwlist[] = {"obj", "precise_float", "numpy",
+static char *g_kwlist[] = {"obj", "precise_float",
                            "labelled", "dtype", NULL};

 PyObject *JSONToObj(PyObject *self, PyObject *args, PyObject *kwargs) {
@@ -505,7 +437,7 @@ PyObject *JSONToObj(PyObject *self, PyObject *args, PyObject *kwargs) {
     JSONObjectDecoder *decoder;
     PyObjectDecoder pyDecoder;
     PyArray_Descr *dtype = NULL;
-    int numpy = 0, labelled = 0;
+    int labelled = 0;

     JSONObjectDecoder dec = {
         Object_newString, Object_objectAddKey, Object_arrayAddItem,
@@ -528,7 +460,7 @@ PyObject *JSONToObj(PyObject *self, PyObject *args, PyObject *kwargs) {
     decoder = (JSONObjectDecoder *)&pyDecoder;

     if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|OiiO&", g_kwlist, &arg,
-                                     &opreciseFloat, &numpy, &labelled,
+                                     &opreciseFloat, &labelled,
                                      PyArray_DescrConverter2, &dtype)) {
         Npy_releaseContext(pyDecoder.npyarr);
         return NULL;
@@ -554,19 +486,6 @@ PyObject *JSONToObj(PyObject *self, PyObject *args, PyObject *kwargs) {
     decoder->errorStr = NULL;
     decoder->errorOffset = NULL;

-    if (numpy) {
-        pyDecoder.dtype = dtype;
-        decoder->newArray = Object_npyNewArray;
-        decoder->endArray = Object_npyEndArray;
-        decoder->arrayAddItem = Object_npyArrayAddItem;
-
-        if (labelled) {
-            decoder->newObject = Object_npyNewObject;
-            decoder->endObject = Object_npyEndObject;
-            decoder->objectAddKey = Object_npyObjectAddKey;
-        }
-    }
-
     ret = JSON_DecodeObject(decoder, PyBytes_AS_STRING(sarg),
                             PyBytes_GET_SIZE(sarg));

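With the decoder-side flag removed, the user-visible effect is that ``read_json`` no longer has a ``numpy`` parameter at all; passing it fails at argument-parsing time, before any JSON is decoded. A sketch of that behavior, assuming a pandas build that contains this removal:

```python
from io import StringIO

import pandas as pd

# numpy= is no longer in the read_json signature, so the call raises
# TypeError before any parsing happens.
try:
    pd.read_json(StringIO('{"a": {"0": 1}}'), numpy=True)
    removed = False
except TypeError:
    removed = True
```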
pandas/conftest.py (-1)

@@ -155,7 +155,6 @@ def pytest_collection_modifyitems(items, config) -> None:
         ("Series.append", "The series.append method is deprecated"),
         ("dtypes.common.is_categorical", "is_categorical is deprecated"),
         ("Categorical.replace", "Categorical.replace is deprecated"),
-        ("dtypes.common.is_extension_type", "'is_extension_type' is deprecated"),
         ("Index.is_mixed", "Index.is_mixed is deprecated"),
         ("MultiIndex._is_lexsorted", "MultiIndex.is_lexsorted is deprecated"),
         # Docstring divides by zero to show behavior difference

pandas/core/algorithms.py (+3, -3)

@@ -462,13 +462,13 @@ def isin(comps: AnyArrayLike, values: AnyArrayLike) -> npt.NDArray[np.bool_]:
        )

    if not isinstance(values, (ABCIndex, ABCSeries, ABCExtensionArray, np.ndarray)):
-        orig_values = values
-        values = _ensure_arraylike(list(values))
+        orig_values = list(values)
+        values = _ensure_arraylike(orig_values)

        if is_numeric_dtype(values) and not is_signed_integer_dtype(comps):
            # GH#46485 Use object to avoid upcast to float64 later
            # TODO: Share with _find_common_type_compat
-            values = construct_1d_object_array_from_listlike(list(orig_values))
+            values = construct_1d_object_array_from_listlike(orig_values)

    elif isinstance(values, ABCMultiIndex):
        # Avoid raising in extract_array
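The ``isin`` change matters when ``values`` is a one-shot iterable: the old code saved the raw iterator as ``orig_values`` and then exhausted it via ``list(values)``, so the later ``construct_1d_object_array_from_listlike(list(orig_values))`` saw an empty sequence. Materializing the list once avoids that. A sketch of the affected branch (unsigned ``comps`` with numeric ``values``; the example data is illustrative, and assumes a build with this fix):

```python
import pandas as pd

# comps is unsigned and values is a generator, which routes through the
# branch that rebuilds an object array from orig_values.
ser = pd.Series([1, 2, 3], dtype="uint64")
mask = ser.isin(x for x in [1, 3])
```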

pandas/core/arrays/categorical.py (+1, -6)

@@ -48,10 +48,7 @@
     type_t,
 )
 from pandas.compat.numpy import function as nv
-from pandas.util._decorators import (
-    deprecate_kwarg,
-    deprecate_nonkeyword_arguments,
-)
+from pandas.util._decorators import deprecate_nonkeyword_arguments
 from pandas.util._exceptions import find_stack_level
 from pandas.util._validators import validate_bool_kwarg

@@ -2313,7 +2310,6 @@ def _reverse_indexer(self) -> dict[Hashable, npt.NDArray[np.intp]]:
     # ------------------------------------------------------------------
     # Reductions

-    @deprecate_kwarg(old_arg_name="numeric_only", new_arg_name="skipna")
     def min(self, *, skipna: bool = True, **kwargs):
         """
         The minimum value of the object.
@@ -2350,7 +2346,6 @@ def min(self, *, skipna: bool = True, **kwargs):
         pointer = self._codes.min()
         return self._wrap_reduction_result(None, pointer)

-    @deprecate_kwarg(old_arg_name="numeric_only", new_arg_name="skipna")
     def max(self, *, skipna: bool = True, **kwargs):
         """
         The maximum value of the object.
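With the ``deprecate_kwarg`` shim gone, ``skipna`` is the only supported way to control missing-value handling in ``Categorical.min``/``Categorical.max``; ``numeric_only=`` is no longer accepted. A short sketch:

```python
import pandas as pd

# min/max require an ordered categorical.
cat = pd.Categorical(["b", "a", None], categories=["a", "b"], ordered=True)

lowest = cat.min()               # missing values are skipped by default
with_na = cat.min(skipna=False)  # NaN once a missing value is present
```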

pandas/core/arrays/sparse/array.py (+1, -1)

@@ -1383,7 +1383,7 @@ def map(self: SparseArrayT, mapper) -> SparseArrayT:
         Indices: array([1, 2], dtype=int32)
         """
         # this is used in apply.
-        # We get hit since we're an "is_extension_type" but regular extension
+        # We get hit since we're an "is_extension_array_dtype" but regular extension
         # types are not hit. This may be worth adding to the interface.
         if isinstance(mapper, ABCSeries):
             mapper = mapper.to_dict()

pandas/core/dtypes/api.py (-2)

@@ -13,7 +13,6 @@
     is_dict_like,
     is_dtype_equal,
     is_extension_array_dtype,
-    is_extension_type,
     is_file_like,
     is_float,
     is_float_dtype,
@@ -57,7 +56,6 @@
     "is_dict_like",
     "is_dtype_equal",
     "is_extension_array_dtype",
-    "is_extension_type",
     "is_file_like",
     "is_float",
     "is_float_dtype",

pandas/core/dtypes/common.py (-66)

@@ -1336,71 +1336,6 @@ def is_bool_dtype(arr_or_dtype) -> bool:
     return issubclass(dtype.type, np.bool_)


-def is_extension_type(arr) -> bool:
-    """
-    Check whether an array-like is of a pandas extension class instance.
-
-    .. deprecated:: 1.0.0
-        Use ``is_extension_array_dtype`` instead.
-
-    Extension classes include categoricals, pandas sparse objects (i.e.
-    classes represented within the pandas library and not ones external
-    to it like scipy sparse matrices), and datetime-like arrays.
-
-    Parameters
-    ----------
-    arr : array-like, scalar
-        The array-like to check.
-
-    Returns
-    -------
-    boolean
-        Whether or not the array-like is of a pandas extension class instance.
-
-    Examples
-    --------
-    >>> is_extension_type([1, 2, 3])
-    False
-    >>> is_extension_type(np.array([1, 2, 3]))
-    False
-    >>>
-    >>> cat = pd.Categorical([1, 2, 3])
-    >>>
-    >>> is_extension_type(cat)
-    True
-    >>> is_extension_type(pd.Series(cat))
-    True
-    >>> is_extension_type(pd.arrays.SparseArray([1, 2, 3]))
-    True
-    >>> from scipy.sparse import bsr_matrix
-    >>> is_extension_type(bsr_matrix([1, 2, 3]))
-    False
-    >>> is_extension_type(pd.DatetimeIndex([1, 2, 3]))
-    False
-    >>> is_extension_type(pd.DatetimeIndex([1, 2, 3], tz="US/Eastern"))
-    True
-    >>>
-    >>> dtype = DatetimeTZDtype("ns", tz="US/Eastern")
-    >>> s = pd.Series([], dtype=dtype)
-    >>> is_extension_type(s)
-    True
-    """
-    warnings.warn(
-        "'is_extension_type' is deprecated and will be removed in a future "
-        "version. Use 'is_extension_array_dtype' instead.",
-        FutureWarning,
-        stacklevel=find_stack_level(),
-    )
-
-    if is_categorical_dtype(arr):
-        return True
-    elif is_sparse(arr):
-        return True
-    elif is_datetime64tz_dtype(arr):
-        return True
-    return False
-
-
 def is_1d_only_ea_obj(obj: Any) -> bool:
     """
     ExtensionArray that does not support 2D, or more specifically that does
@@ -1853,7 +1788,6 @@ def is_all_strings(value: ArrayLike) -> bool:
     "is_dtype_equal",
     "is_ea_or_datetimelike_dtype",
     "is_extension_array_dtype",
-    "is_extension_type",
     "is_file_like",
     "is_float_dtype",
     "is_int64_dtype",
