BUG: DataFrame[td64].sum(skipna=False) #37148
I think need to respect min_count in sum
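For context on the `min_count` remark, here is a minimal NumPy-only sketch of the semantics being asked for: when fewer than `min_count` valid values are present, the sum is reported as missing. The helper name `sum_with_min_count` is hypothetical and this is not the code in `pandas/core/nanops.py`.

```python
import numpy as np


def sum_with_min_count(values: np.ndarray, min_count: int = 0) -> float:
    """Hypothetical helper: NaN-skipping sum that honors min_count."""
    valid = ~np.isnan(values)
    # Fewer valid entries than min_count means the result is missing.
    if valid.sum() < min_count:
        return float("nan")
    return float(values[valid].sum())


data = np.array([1.0, np.nan, 2.0])
total = sum_with_min_count(data, min_count=2)    # two valid values are enough
too_few = sum_with_min_count(data, min_count=3)  # only two valid values exist
```

With the sample data, `total` is the sum of the two valid entries and `too_few` is NaN.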
pandas/core/nanops.py
Outdated
    return the_sum


def mask_datetimelike_result(result, axis, mask, orig_dtype):
can you type
will do
i think roughly the same fix will end up being used for #36907
    return the_sum


def _mask_datetimelike_result(
why don't you have this take the original values (rather than just the dtype) and compute the mask if needed (e.g. make it optional)? right now the caller is responsible for that in multiple places.
if we get to here, we always need the mask
you didn't answer the question. you are adding multiple code blocks which do the same thing; if you are going to consolidate to a function then it makes sense to avoid that yes?
apparently i don't understand the question. IIUC the alternative you're suggesting looks like
def _mask_datetimelike_result(result, axis, mask, orig_values):
    if mask is None:
        mask = isna(orig_values)
    [what we have here now]
and remove the
    if mask is None and not skipna: mask = isna(orig_values)
on L516-517. This is 2 lines of code either way, so not a big deal. i'll change it if you really care.
longer-term this should probably go into _get_values, but i want to do that carefully since that may affect other functions
ok i guess - i am thinking you are going to refactor this anyhow, as this adds a fair amount of duplication
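The variant debated in this thread, where the helper takes the original values and computes the mask itself when none is passed, might look roughly like the NumPy-only sketch below. The name `mask_td64_result` is hypothetical. The sketch assumes the reduction was done on the int64 view of the data and assumes an ndarray result, so it is an illustration of the idea rather than the pandas implementation.

```python
from typing import Optional

import numpy as np


def mask_td64_result(
    result: np.ndarray,
    axis: int,
    mask: Optional[np.ndarray],
    orig_values: np.ndarray,
) -> np.ndarray:
    """Hypothetical stand-in for the helper discussed in the thread."""
    if mask is None:
        # Compute the mask here so callers do not have to.
        mask = np.isnat(orig_values)
    # Reinterpret the summed integers as the original timedelta64 dtype.
    result = result.astype("i8").view(orig_values.dtype)
    # Any missing value along the reduced axis makes that entry NaT.
    result[mask.any(axis=axis)] = np.timedelta64("NaT")
    return result


values = np.array([[1, 2], [3, 4]], dtype="m8[ns]")
values[1, 1] = np.timedelta64("NaT")

# Assumed skipna=False path: sum the raw integers, then mask afterwards.
raw = values.view("i8").sum(axis=0)
masked = mask_td64_result(raw, axis=0, mask=None, orig_values=values)
```

Passing `mask=None` lets the helper derive the mask from `orig_values`, which is the consolidation the reviewer is asking for; a caller that already has a mask can still pass it in.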
pandas/core/nanops.py
Outdated
        return _wrap_results(ret, dtype)

    # otherwise return a scalar value
    return _wrap_results(get_median(values) if notempty else np.nan, dtype)


def get_empty_reduction_result(shape, axis: int, dtype, fill_value) -> np.ndarray:
can you type dtype, fill_value
    """
    The result from a reduction on an empty ndarray.
    """
    shp = np.array(shape)
can you add Parameters to the docstring
sure, just pushed
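To illustrate the two requests above (type annotations and a Parameters section), here is a hedged sketch of how a helper with this signature could be written. The name `empty_reduction_result`, the annotations, and the body after the `shp = np.array(shape)` line are assumptions for illustration, not a copy of the merged code.

```python
from typing import Tuple

import numpy as np


def empty_reduction_result(
    shape: Tuple[int, ...], axis: int, dtype, fill_value
) -> np.ndarray:
    """
    Sketch of the result from a reduction on an empty ndarray.

    Parameters
    ----------
    shape : tuple of int
        Shape of the array being reduced.
    axis : int
        Axis that the reduction removes.
    dtype : numpy dtype or str
        Dtype of the returned array.
    fill_value : scalar
        Value placed in every position of the result.

    Returns
    -------
    np.ndarray
    """
    shp = np.array(shape)
    dims = np.arange(len(shape))
    # Drop the reduced axis; the remaining dimensions form the output shape.
    out_shape = tuple(shp[dims != axis])
    return np.full(out_shape, fill_value, dtype=dtype)


result = empty_reduction_result((0, 3), axis=0, dtype="f8", fill_value=np.nan)
```

Reducing a `(0, 3)` array along axis 0 leaves one entry per column, each set to the fill value.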
updated per requests + green. several followups in the pipeline
gentle ping; i'd like to re-use the helpers implemented here in PR(s) fixing other reductions
i guess this needs a whatsnew note but can be a follow on
* BUG: DataFrame[td64].sum(skipna=False)
* annotate, privatize
* annotate
* calculate mask in mask_datetimelike_result
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
The same underlying problem affects mean, so this fixes that too.