BUG: std with array_manager and NaT results #51446

Closed
rhshadrach opened this issue Feb 17, 2023 · 5 comments
Labels
Bug; DataFrame (DataFrame data structure); Dtype Conversions (unexpected or buggy dtype conversions); Reduction Operations (sum, mean, min, max, etc.)

Comments

@rhshadrach
Member

When a reduction results in NaT, the array manager reduction assumes the output dtype is the same as the input dtype. This is not correct for std (and from what I can tell, only std is incorrect).

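import pandas as pd

# Reproduces only with the ArrayManager enabled; the default block manager is unaffected.
df = pd.DataFrame({'a': ['2022-01-01', '2022-01-02', pd.NaT, '2022-01-03']})
df['a'] = pd.to_datetime(df['a'])

result = df.std(skipna=True)
print(result)
# a   1 days
# dtype: timedelta64[ns]

# With skipna=False the result is NaT, and the dtype wrongly stays datetime64[ns].
result2 = df.std(skipna=False)
print(result2)
# a   NaT
# dtype: datetime64[ns]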
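
For comparison, here is a minimal sketch of the behaviour the report expects, assuming a recent pandas running with the default block manager. The helper name std_result_kinds is made up for illustration. It only checks that both calls return a timedelta-kind dtype, since the std of datetimes is a duration:

```python
import pandas as pd


def std_result_kinds() -> dict:
    """Return the NumPy dtype kind of DataFrame.std for each skipna setting.

    Hypothetical helper for illustration; assumes the default block manager.
    """
    dti = pd.to_datetime(["2022-01-01", "2022-01-02", pd.NaT, "2022-01-03"])
    df = pd.DataFrame({"a": dti})
    kinds = {}
    for skipna in (True, False):
        result = df.std(skipna=skipna)
        # kind "m" is timedelta64, kind "M" is datetime64
        kinds[skipna] = result.dtype.kind
    return kinds


print(std_result_kinds())
```

With the ArrayManager, the skipna=False entry is where this report sees datetime64 (kind "M") instead.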

@rhshadrach added the Bug, Dtype Conversions, DataFrame, Reduction Operations, and ArrayManager labels on Feb 17, 2023
@jbrockmendel
Member

@rhshadrach I'm working on a branch that implements keepdims, which would fix this. Is there an existing test for this somewhere?

@jbrockmendel
Member

Looks like an xfail in test_std_datetime64_with_nat

@topper-123
Contributor

Just checked, and this is not fixed by #52788 at the moment, so std probably needs some dtype info. I'm pretty sure this is fixable after #52788, so it can be done in a follow-up.
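
One way to picture the missing "dtype info", as a rough NumPy-only sketch and not pandas' actual implementation: when std reduces to NaT, the output dtype has to be derived from the input dtype rather than copied from it. The function name std_nat_dtype is hypothetical, and only datetime-like inputs are handled.

```python
import numpy as np


def std_nat_dtype(input_dtype: np.dtype) -> np.dtype:
    """Hypothetical helper: dtype a NaT result of std should carry.

    Illustration only; this is not how pandas resolves the dtype internally.
    """
    if input_dtype.kind == "M":
        # std of datetimes is a duration, so keep the unit but switch to timedelta64
        unit, _ = np.datetime_data(input_dtype)
        return np.dtype(f"timedelta64[{unit}]")
    if input_dtype.kind == "m":
        # std of timedeltas stays a timedelta
        return input_dtype
    raise TypeError(f"not a datetime-like dtype: {input_dtype}")


out_dtype = std_nat_dtype(np.dtype("datetime64[ns]"))
nat_result = np.array([np.timedelta64("NaT", "ns")], dtype=out_dtype)
print(nat_result.dtype)
```

Copying the input dtype, as the report describes, would label the same NaT as datetime64 instead.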

@jbrockmendel
Member

df.std() now raises ValueError with ArrayManager

@rhshadrach
Member Author

The ArrayManager is now deprecated. Closing.
