Skip to content

BUG: Series[timdelta64].var() should _not_ work #28289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 45 commits into from
Nov 2, 2019

Conversation

jbrockmendel
Copy link
Member

@@ -4012,6 +4012,13 @@ def _reduce(
If we have an ndarray as a value, then simply perform the operation,
otherwise delegate to the object.
"""
if name == "var" and is_timedelta64_dtype(self.dtype):
raise TypeError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there anyway you can do this in nanops instead? it is very awkward to do here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is no longer necessary, removed.

@jreback
Copy link
Contributor

jreback commented Sep 8, 2019

conflict, plus a comment above

if is_timedelta64_dtype(values.dtype):
# std is well-behaved, but var is not
if mask is None:
mask = isna(values)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this might be better fixed if we have nanvar call _get_values (which already does this masking & type conversion); its the only routine which doesn't I think (well maybe sem).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with that is that we need different behavior for nanstd vs nanvar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you try to address this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no. We have @disallow("M8", "m8") on nanvar, so if we don't do this here, we'll raise as soon as we call nanvar with non-casted values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am concerned about this code duplicatio of _get_values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll double check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, turns out we can use _get_values. updated

@jbrockmendel
Copy link
Member Author

gentle ping. more reductions to implement after this

@jreback
Copy link
Contributor

jreback commented Oct 24, 2019

I don't think my comments above were addressed. the code change is duplicative, if you can find a way to combine with the existing routines.

@jbrockmendel
Copy link
Member Author

I don't think my comments above were addressed. the code change is duplicative, if you can find a way to combine with the existing routines.

I replied to your comment suggesting that nanstd should go through nanvar: that is not a viable option.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small followup, otherwise lgtm.

@@ -300,11 +300,14 @@ def test_timedelta_ops(self):
assert result[0] == expected

# invalid ops
for op in ["skew", "kurt", "sem", "prod"]:
for op in ["skew", "kurt", "sem", "prod", "var"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's split this to a separate test that is paramterized (followon)

@jreback jreback merged commit bd8f07f into pandas-dev:master Nov 2, 2019
@jreback
Copy link
Contributor

jreback commented Nov 2, 2019

thanks

@jbrockmendel jbrockmendel deleted the tdvar branch November 2, 2019 15:37
jbrockmendel added a commit to jbrockmendel/pandas that referenced this pull request Nov 2, 2019
jreback pushed a commit that referenced this pull request Nov 2, 2019
Reksbril pushed a commit to Reksbril/pandas that referenced this pull request Nov 18, 2019
Reksbril pushed a commit to Reksbril/pandas that referenced this pull request Nov 18, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG(?): Var of Timedelta with empty / NA
4 participants