-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: combine_first does not retain dtypes with Timestamp DataFrames #38145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: combine_first does not retain dtypes with Timestamp DataFrames #38145
Conversation
Why does combin_first use expressions.where instead of mgr.where? Casting should be handled correctly there |
That line hasn't been touched since #17744, will look |
Now that i have an actual keyboard: this would likely also require fixing Block.where #38073 |
Ok - in this case this would sit on top of that I'm struggling a little with the pattern. Atm I have: def combine_first(self, other: DataFrame) -> DataFrame:
def combiner(x, y):
mask = extract_array(isna(x))
x_values = extract_array(x, extract_numpy=True)
y_values = extract_array(y, extract_numpy=True)
# If the column y in other DataFrame is not in first DataFrame,
# just return y_values.
if y.name not in self.columns:
return y_values
_where = self._mgr.where(y_values, mask, align=True, errors="raise", try_cast=True, axis=1)
# extract array from _where
return self.combine(other, combiner, overwrite=False) Is this in the right ballpark? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that i have an actual keyboard: this would likely also require fixing Block.where #38073
@jbrockmendel is the current patch not good for 1.2? (agreed could use some refactoring).
"scalar1, scalar2", | ||
[ | ||
(datetime(2020, 1, 1), datetime(2020, 1, 2)), | ||
(pd.Period("2020-01-01", "D"), pd.Period("2020-01-02", "D")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can add an Interval example here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
This looks good for now |
@arw2019 ok small test comment. ping on green. |
…28481-combine_first_timestamp
thanks @arw2019 |
@arw2019 do you have time to try to make this go through |
yes, in the works |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Picking up #35514