Commit 5c7acc0

SaturnFromTitan authored and admin committed
added FutureWarning to empty Series without dtype and adjusted the tests and docs so that no unnecessary warnings are thrown
1 parent ed20822 commit 5c7acc0

83 files changed: +403 -242 lines changed

doc/source/user_guide/missing_data.rst

+2 -2
@@ -190,15 +190,15 @@ The sum of an empty or all-NA Series or column of a DataFrame is 0.

    pd.Series([np.nan]).sum()

-   pd.Series([]).sum()
+   pd.Series([], dtype="float64").sum()

 The product of an empty or all-NA Series or column of a DataFrame is 1.

 .. ipython:: python

    pd.Series([np.nan]).prod()

-   pd.Series([]).prod()
+   pd.Series([], dtype="float64").prod()


 NA values in GroupBy
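
The documented behaviour in the hunk above can be checked with a short script. This is a minimal sketch that assumes a pandas release where empty and all-NA sums default to 0 (0.22 or later, per the whatsnew entries in this commit); the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Passing dtype explicitly avoids relying on the default dtype of an empty Series.
empty = pd.Series([], dtype="float64")
all_na = pd.Series([np.nan])

empty_sum = empty.sum()    # sum over no values
empty_prod = empty.prod()  # product over no values
na_sum = all_na.sum()      # NaN is skipped by default (skipna=True)
na_prod = all_na.prod()

print(empty_sum, empty_prod, na_sum, na_prod)
```

The sums come out as the additive identity and the products as the multiplicative identity, which is what the two sentences in the user guide state.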

doc/source/user_guide/scale.rst

+1
@@ -358,6 +358,7 @@ results will fit in memory, so we can safely call ``compute`` without running
 out of memory. At that point it's just a regular pandas object.

 .. ipython:: python
+   :okwarning:

    @savefig dask_resample.png
    ddf[['x', 'y']].resample("1D").mean().cumsum().compute().plot()

doc/source/whatsnew/v0.19.0.rst

+1
@@ -707,6 +707,7 @@ A ``Series`` will now correctly promote its dtype for assignment with incompat v


 .. ipython:: python
+   :okwarning:

    s = pd.Series()


doc/source/whatsnew/v0.21.0.rst

+1
@@ -428,6 +428,7 @@ Note that this also changes the sum of an empty ``Series``. Previously this alwa
 but for consistency with the all-NaN case, this was changed to return NaN as well:

 .. ipython:: python
+   :okwarning:

    pd.Series([]).sum()


doc/source/whatsnew/v0.22.0.rst

+3
@@ -55,6 +55,7 @@ The default sum for empty or all-*NA* ``Series`` is now ``0``.
 *pandas 0.22.0*

 .. ipython:: python
+   :okwarning:

    pd.Series([]).sum()
    pd.Series([np.nan]).sum()
@@ -67,6 +68,7 @@ pandas 0.20.3 without bottleneck, or pandas 0.21.x), use the ``min_count``
 keyword.

 .. ipython:: python
+   :okwarning:

    pd.Series([]).sum(min_count=1)

@@ -85,6 +87,7 @@ required for a non-NA sum or product.
 returning ``1`` instead.

 .. ipython:: python
+   :okwarning:

    pd.Series([]).prod()
    pd.Series([np.nan]).prod()
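
The ``min_count`` keyword mentioned in the second hunk above controls how many non-NA values are required before a sum or product is returned. A small sketch, assuming pandas 0.22 or later and using an explicit dtype so that no default-dtype warning is involved:

```python
import pandas as pd

empty = pd.Series([], dtype="float64")

default_sum = empty.sum()             # no minimum number of valid values required
strict_sum = empty.sum(min_count=1)   # at least one non-NA value required
strict_prod = empty.prod(min_count=1)

print(default_sum, strict_sum, strict_prod)
```

With ``min_count=1`` the result is missing rather than the identity element, which restores the pre-0.22 behaviour the whatsnew text refers to.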

doc/source/whatsnew/v1.0.0.rst

+18 -1
@@ -356,6 +356,23 @@ When :class:`Categorical` contains ``np.nan``,

    pd.Categorical([1, 2, np.nan], ordered=True).min()

+
+Default dtype of empty :class:`pandas.core.series.Series`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Initialising an empty :class:`pandas.core.series.Series` without specifying a dtype will raise a `FutureWarning` now
+(:issue:`17261`). The default dtype will change from ``float64`` to ``object`` in future releases so that it is
+consistent with the behaviour of :class:`DataFrame` and :class:`Index`.
+
+*pandas 1.0.0*
+
+.. code-block:: ipython
+
+   In [1]: pd.Series()
+   Out[2]:
+   FutureWarning: The default dtype for empty Series will be 'object' instead of 'float64' in the next version. Specify a dtype explicitly to silence this warning.
+   Series([], dtype: float64)
+
 .. _whatsnew_1000.api_breaking.deps:

 Increased minimum versions for dependencies
@@ -484,7 +501,7 @@ Removal of prior version deprecations/changes

 Previously, pandas would register converters with matplotlib as a side effect of importing pandas (:issue:`18720`).
 This changed the output of plots made via matplotlib plots after pandas was imported, even if you were using
-matplotlib directly rather than rather than :meth:`~DataFrame.plot`.
+matplotlib directly rather than :meth:`~DataFrame.plot`.

 To use pandas formatters with a matplotlib plot, specify

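
The new whatsnew entry above says the warning is silenced by specifying a dtype. A minimal sketch of what that looks like for a caller; it promotes ``FutureWarning`` to an error inside the block so that any warning from these constructions would surface as an exception. Which dtype to choose depends on the caller and is an assumption here.

```python
import warnings

import pandas as pd

with warnings.catch_warnings():
    # Turn any FutureWarning raised inside this block into an exception.
    warnings.simplefilter("error", FutureWarning)

    as_float = pd.Series(dtype="float64")    # keeps the old default explicitly
    as_object = pd.Series(dtype=object)      # opts in to the announced new default
    from_list = pd.Series([], dtype="float64")

print(as_float.dtype, as_object.dtype, from_list.dtype)
```

Because the dtype is stated at the call site, the result does not depend on which default a given pandas release uses for empty input.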

pandas/compat/pickle_compat.py

+1 -1
@@ -64,7 +64,7 @@ def __new__(cls) -> "Series": # type: ignore
             stacklevel=6,
         )

-        return Series()
+        return Series(dtype=object)


 class _LoadSparseFrame:

pandas/core/algorithms.py

+1 -1
@@ -601,7 +601,7 @@ def _factorize_array(
 )
 @Appender(_shared_docs["factorize"])
 def factorize(
-    values, sort: bool = False, na_sentinel: int = -1, size_hint: Optional[int] = None,
+    values, sort: bool = False, na_sentinel: int = -1, size_hint: Optional[int] = None
 ) -> Tuple[np.ndarray, Union[np.ndarray, ABCIndex]]:
     # Implementation notes: This method is responsible for 3 things
     # 1.) coercing data to array-like (ndarray, Index, extension array)

pandas/core/apply.py

+16 -3
@@ -15,6 +15,8 @@
 )
 from pandas.core.dtypes.generic import ABCMultiIndex, ABCSeries

+from pandas.core.construction import create_series_with_explicit_dtype
+
 if TYPE_CHECKING:
     from pandas import DataFrame, Series, Index

@@ -203,15 +205,15 @@ def apply_empty_result(self):

         if not should_reduce:
             try:
-                r = self.f(Series([]))
+                r = self.f(Series([], dtype=np.float64))
             except Exception:
                 pass
             else:
                 should_reduce = not isinstance(r, Series)

         if should_reduce:
             if len(self.agg_axis):
-                r = self.f(Series([]))
+                r = self.f(Series([], dtype=np.float64))
             else:
                 r = np.nan

@@ -346,14 +348,25 @@ def apply_series_generator(self) -> Tuple[ResType, "Index"]:
     def wrap_results(
         self, results: ResType, res_index: "Index"
     ) -> Union["Series", "DataFrame"]:
+        from pandas import Series

         # see if we can infer the results
         if len(results) > 0 and 0 in results and is_sequence(results[0]):

             return self.wrap_results_for_axis(results, res_index)

         # dict of scalars
-        result = self.obj._constructor_sliced(results)
+
+        # the default dtype of an empty Series will be `object`, but this
+        # code can be hit by df.mean() where the result should have dtype
+        # float64 even if it's an empty Series.
+        constructor_sliced = self.obj._constructor_sliced
+        if constructor_sliced is Series:
+            result = create_series_with_explicit_dtype(
+                results, dtype_if_empty=np.float64
+            )
+        else:
+            result = constructor_sliced(results)
         result.index = res_index

         return result

pandas/core/base.py

+8 -2
@@ -34,6 +34,7 @@
 from pandas.core.accessor import DirNamesMixin
 from pandas.core.algorithms import duplicated, unique1d, value_counts
 from pandas.core.arrays import ExtensionArray
+from pandas.core.construction import create_series_with_explicit_dtype
 import pandas.core.nanops as nanops

 _shared_docs: Dict[str, str] = dict()
@@ -1143,9 +1144,14 @@ def _map_values(self, mapper, na_action=None):
                 # convert to an Series for efficiency.
                 # we specify the keys here to handle the
                 # possibility that they are tuples
-                from pandas import Series

-                mapper = Series(mapper)
+                # The return value of mapping with an empty mapper is
+                # expected to be pd.Series(np.nan, ...). As np.nan is
+                # of dtype float64 the return value of this method should
+                # be float64 as well
+                mapper = create_series_with_explicit_dtype(
+                    mapper, dtype_if_empty=np.float64
+                )

         if isinstance(mapper, ABCSeries):
             # Since values were input this means we came from either
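
The comment added in ``_map_values`` describes what mapping with an empty mapper is expected to return. A brief sketch of that case through the public ``Series.map`` API; the input values are arbitrary and the printed dtype is left unstated here because it is what the comment above asserts rather than something this sketch guarantees across releases.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# An empty dict has no matching key, so every element maps to a missing value.
mapped = s.map({})

print(mapped)
print(mapped.dtype)
```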

pandas/core/construction.py

+35
@@ -565,3 +565,38 @@ def _try_cast(
         else:
             subarr = np.array(arr, dtype=object, copy=copy)
     return subarr
+
+
+# see gh-17261
+def is_empty_data(data):
+    """
+    Utility to check if a Series is instantiated with empty data
+    """
+    is_none = data is None
+    is_simple_empty = isinstance(data, (list, tuple, dict)) and not data
+    return is_none or is_simple_empty
+
+
+def create_series_with_explicit_dtype(
+    data=None,
+    index=None,
+    dtype=None,
+    name=None,
+    copy=False,
+    fastpath=False,
+    dtype_if_empty=object,
+):
+    """
+    Helper to pass an explicit dtype when instantiating an empty Series.
+
+    The signature of this function mirrors the signature of Series.__init__
+    but adds the additional keyword argument `dtype_if_empty`.
+
+    This silences a FutureWarning described in the GitHub issue
+    mentioned above.
+    """
+    from pandas.core.series import Series
+
+    if is_empty_data(data) and dtype is None:
+        dtype = dtype_if_empty
+    return Series(data, index, dtype, name, copy, fastpath)
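
The two helpers added above are internal to pandas. The same idea can be sketched against the public ``pd.Series`` constructor only; ``series_with_default_dtype`` is a hypothetical name invented for this illustration and is not part of pandas, and ``is_empty_data`` is re-implemented locally rather than imported.

```python
import numpy as np
import pandas as pd


def is_empty_data(data):
    """Return True for None or for an empty list, tuple, or dict."""
    return data is None or (isinstance(data, (list, tuple, dict)) and not data)


def series_with_default_dtype(data=None, dtype=None, dtype_if_empty=object, **kwargs):
    """Hypothetical wrapper: choose a dtype only when data is empty and none was given."""
    if is_empty_data(data) and dtype is None:
        dtype = dtype_if_empty
    return pd.Series(data, dtype=dtype, **kwargs)


empty_default = series_with_default_dtype([])                          # falls back to object
empty_float = series_with_default_dtype({}, dtype_if_empty=np.float64)  # caller picks float64
non_empty = series_with_default_dtype([1, 2, 3])                       # dtype inferred as usual

print(empty_default.dtype, empty_float.dtype, non_empty.dtype)
```

Note that the emptiness check only covers ``None``, lists, tuples, and dicts, as in the diff; an empty ``numpy`` array is not treated as empty data because it already carries its own dtype.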

pandas/core/frame.py

+1 -1
@@ -7954,7 +7954,7 @@ def quantile(self, q=0.5, axis=0, numeric_only=True, interpolation="linear"):
             cols = Index([], name=self.columns.name)
             if is_list_like(q):
                 return self._constructor([], index=q, columns=cols)
-            return self._constructor_sliced([], index=cols, name=q)
+            return self._constructor_sliced([], index=cols, name=q, dtype=np.float64)

         result = data._data.quantile(
             qs=q, axis=1, interpolation=interpolation, transposed=is_transposed

pandas/core/generic.py

+3 -4
@@ -71,6 +71,7 @@
 import pandas.core.algorithms as algos
 from pandas.core.base import PandasObject, SelectionMixin
 import pandas.core.common as com
+from pandas.core.construction import create_series_with_explicit_dtype
 from pandas.core.index import (
     Index,
     InvalidIndexError,
@@ -6069,9 +6070,7 @@ def fillna(

             if self.ndim == 1:
                 if isinstance(value, (dict, ABCSeries)):
-                    from pandas import Series
-
-                    value = Series(value)
+                    value = create_series_with_explicit_dtype(value)
                 elif not is_list_like(value):
                     pass
                 else:
@@ -7010,7 +7009,7 @@ def asof(self, where, subset=None):
                 if not is_series:
                     from pandas import Series

-                    return Series(index=self.columns, name=where)
+                    return Series(index=self.columns, name=where, dtype=np.float64)
                 return np.nan

             # It's always much faster to use a *while* loop here for

pandas/core/groupby/generic.py

+14 -5
@@ -51,6 +51,7 @@
 import pandas.core.algorithms as algorithms
 from pandas.core.base import DataError, SpecificationError
 import pandas.core.common as com
+from pandas.core.construction import create_series_with_explicit_dtype
 from pandas.core.frame import DataFrame
 from pandas.core.generic import ABCDataFrame, ABCSeries, NDFrame, _shared_docs
 from pandas.core.groupby import base
@@ -259,7 +260,7 @@ def aggregate(self, func=None, *args, **kwargs):
                 result = self._aggregate_named(func, *args, **kwargs)

             index = Index(sorted(result), name=self.grouper.names[0])
-            ret = Series(result, index=index)
+            ret = create_series_with_explicit_dtype(result, index=index)

         if not self.as_index:  # pragma: no cover
             print("Warning, ignoring as_index=True")
@@ -407,7 +408,7 @@ def _wrap_transformed_output(
     def _wrap_applied_output(self, keys, values, not_indexed_same=False):
         if len(keys) == 0:
             # GH #6265
-            return Series([], name=self._selection_name, index=keys)
+            return Series([], name=self._selection_name, index=keys, dtype=np.float64)

         def _get_index() -> Index:
             if self.grouper.nkeys > 1:
@@ -493,7 +494,7 @@ def _transform_general(self, func, *args, **kwargs):

             result = concat(results).sort_index()
         else:
-            result = Series()
+            result = Series(dtype=np.float64)

         # we will only try to coerce the result type if
         # we have a numeric dtype, as these are *always* user-defined funcs
@@ -1205,9 +1206,17 @@ def first_not_none(values):
             if v is None:
                 return DataFrame()
             elif isinstance(v, NDFrame):
+
+                # this is to silence a FutureWarning
+                # TODO: Remove when default dtype of empty Series is object
+                kwargs = v._construct_axes_dict()
+                if v._constructor is Series:
+                    is_empty = "data" not in kwargs or not kwargs["data"]
+                    if "dtype" not in kwargs and is_empty:
+                        kwargs["dtype"] = object
+
                 values = [
-                    x if x is not None else v._constructor(**v._construct_axes_dict())
-                    for x in values
+                    x if (x is not None) else v._constructor(**kwargs) for x in values
                 ]

             v = values[0]

pandas/core/series.py

+21 -2
@@ -54,7 +54,12 @@
 from pandas.core.arrays.categorical import Categorical, CategoricalAccessor
 from pandas.core.arrays.sparse import SparseAccessor
 import pandas.core.common as com
-from pandas.core.construction import extract_array, sanitize_array
+from pandas.core.construction import (
+    create_series_with_explicit_dtype,
+    extract_array,
+    is_empty_data,
+    sanitize_array,
+)
 from pandas.core.index import (
     Float64Index,
     Index,
@@ -175,6 +180,18 @@ class Series(base.IndexOpsMixin, generic.NDFrame):
     def __init__(
         self, data=None, index=None, dtype=None, name=None, copy=False, fastpath=False
     ):
+        if is_empty_data(data) and dtype is None:
+            # Empty Series should have dtype object to be consistent
+            # with the behaviour of DataFrame and Index
+            warnings.warn(
+                "The default dtype for empty Series will be 'object' instead"
+                " of 'float64' in the next version. Specify a dtype explicitly"
+                " to silence this warning.",
+                FutureWarning,
+                stacklevel=2,
+            )
+            # uncomment the line below when removing the FutureWarning
+            # dtype = np.dtype(object)

         # we are called internally, so short-circuit
         if fastpath:
@@ -328,7 +345,9 @@ def _init_dict(self, data, index=None, dtype=None):
             keys, values = [], []

         # Input is now list-like, so rely on "standard" construction:
-        s = Series(values, index=keys, dtype=dtype)
+        s = create_series_with_explicit_dtype(
+            values, index=keys, dtype=dtype, dtype_if_empty=np.float64
+        )

         # Now we just make sure the order is respected, if any
         if data and index is not None:
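
The deprecation pattern used in ``Series.__init__`` above (warn only when the caller leaves the choice to a default that is about to change) can be illustrated without pandas. In this sketch ``make_container`` is a hypothetical stand-in for a constructor, and the message text is invented; only ``warnings.warn``, ``FutureWarning``, and ``stacklevel`` come from the diff.

```python
import warnings


def make_container(data=None, dtype=None):
    """Hypothetical constructor that warns when it must fall back to a default dtype."""
    if not data and dtype is None:
        warnings.warn(
            "The default dtype for empty input will change; pass dtype explicitly.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line, not this function
        )
        dtype = "float64"  # current default, kept until the deprecation is enforced
    return {"data": list(data or []), "dtype": dtype}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    make_container()                # warns: empty input and no dtype
    make_container(dtype="object")  # silent: dtype was given
    make_container([1, 2])          # silent: input is not empty

future_warnings = [w for w in caught if issubclass(w.category, FutureWarning)]
print(len(future_warnings))
```

Only the first call falls into the warned path, which mirrors why the rest of this commit passes a dtype at every internal call site that may construct an empty Series.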

pandas/core/tools/datetimes.py

+2 -1
@@ -145,7 +145,8 @@ def _maybe_cache(arg, format, cache, convert_listlike):
     """
     from pandas import Series

-    cache_array = Series()
+    cache_array = Series(dtype=object)
+
     if cache:
         # Perform a quicker unique check
         if not should_cache(arg):

pandas/io/html.py

+3 -2
@@ -14,7 +14,7 @@

 from pandas.core.dtypes.common import is_list_like

-from pandas import Series
+from pandas.core.construction import create_series_with_explicit_dtype

 from pandas.io.common import _is_url, _validate_header_arg, urlopen
 from pandas.io.formats.printing import pprint_thing
@@ -762,7 +762,8 @@ def _parse_tfoot_tr(self, table):


 def _expand_elements(body):
-    lens = Series([len(elem) for elem in body])
+    data = [len(elem) for elem in body]
+    lens = create_series_with_explicit_dtype(data)
     lens_max = lens.max()
     not_max = lens[lens != lens_max]

