Skip to content

BUG/DEPR: combine dtype fixes #13970

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.19.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -788,6 +788,7 @@ Deprecations
- ``pandas.tseries.frequencies.get_standard_freq`` is deprecated. Use ``pandas.tseries.frequencies.to_offset(freq).rule_code`` instead. (:issue:`13874`)
- ``pandas.tseries.frequencies.to_offset``'s ``freqstr`` keyword is deprecated in favor of ``freq``. (:issue:`13874`)


.. _whatsnew_0190.prior_deprecations:

Removal of prior version deprecations/changes
Expand Down Expand Up @@ -939,6 +940,7 @@ Bug Fixes


- Bug in ``Series`` comparison operators when dealing with zero dim NumPy arrays (:issue:`13006`)
- Bug in ``.combine_first`` may return incorrect ``dtype`` (:issue:`7630`, :issue:`10567`)
- Bug in ``groupby`` where ``apply`` returns different result depending on whether first result is ``None`` or not (:issue:`12824`)
- Bug in ``groupby(..).nth()`` where the group key is included inconsistently if called after ``.head()/.tail()`` (:issue:`12839`)
- Bug in ``.to_html``, ``.to_latex`` and ``.to_string`` silently ignore custom datetime formatter passed through the ``formatters`` key word (:issue:`10690`)
Expand Down
26 changes: 18 additions & 8 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,22 @@
_possibly_downcast_to_dtype,
_invalidate_string_dtypes,
_coerce_to_dtypes,
_maybe_upcast_putmask)
_maybe_upcast_putmask,
_find_common_type)
from pandas.types.common import (is_categorical_dtype,
is_object_dtype,
is_extension_type,
is_datetimetz,
is_datetime64_dtype,
is_datetime64tz_dtype,
is_bool_dtype,
is_integer_dtype,
is_float_dtype,
is_integer,
is_scalar,
is_dtype_equal,
needs_i8_conversion,
_get_dtype_from_object,
_lcd_dtypes,
_ensure_float,
_ensure_float64,
_ensure_int64,
Expand Down Expand Up @@ -3700,17 +3702,20 @@ def combine(self, other, func, fill_value=None, overwrite=True):
otherSeries[other_mask] = fill_value

# if we have different dtypes, possibily promote
new_dtype = this_dtype
if this_dtype != other_dtype:
new_dtype = _lcd_dtypes(this_dtype, other_dtype)
series = series.astype(new_dtype)
if notnull(series).all():
new_dtype = this_dtype
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in another life .combine functionaility could be pushed to block-manager (there is a combine there already, but it has a different purpose)

otherSeries = otherSeries.astype(new_dtype)
else:
new_dtype = _find_common_type([this_dtype, other_dtype])
if not is_dtype_equal(this_dtype, new_dtype):
series = series.astype(new_dtype)
if not is_dtype_equal(other_dtype, new_dtype):
otherSeries = otherSeries.astype(new_dtype)

# see if we need to be represented as i8 (datetimelike)
# try to keep us at this dtype
needs_i8_conversion_i = needs_i8_conversion(new_dtype)
if needs_i8_conversion_i:
this_dtype = new_dtype
arr = func(series, otherSeries, True)
else:
arr = func(series, otherSeries)
Expand All @@ -3721,7 +3726,12 @@ def combine(self, other, func, fill_value=None, overwrite=True):

# try to downcast back to the original dtype
if needs_i8_conversion_i:
arr = _possibly_cast_to_datetime(arr, this_dtype)
# ToDo: This conversion should be handled in
# _possibly_cast_to_datetime but the change affects lot...
if is_datetime64tz_dtype(new_dtype):
arr = DatetimeIndex._simple_new(arr, tz=new_dtype.tz)
else:
arr = _possibly_cast_to_datetime(arr, new_dtype)
else:
arr = _possibly_downcast_to_dtype(arr, this_dtype)

Expand Down
Loading