DOC: Clarify df.describe() behavior with Timestamp columns #56918
Comments
I have confirmed that it still exists in the main branch. If fixes are indeed needed, I can take it.

import numpy as np
import pandas as pd

# Series of three consecutive dates
s = pd.Series(
    [
        np.datetime64("2020-01-01"),
        np.datetime64("2020-01-02"),
        np.datetime64("2020-01-03"),
    ]
)
# DataFrame with the same dates in a single column
df = pd.DataFrame(
    {
        "time": [
            np.datetime64("2020-01-01"),
            np.datetime64("2020-01-02"),
            np.datetime64("2020-01-03"),
        ],
    }
)
print(s.describe())
print(df.describe())
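The report can also be checked programmatically instead of by reading the printed output. The following is a minimal sketch, assuming pandas 2.0 or later is installed; the names summary and labels are illustrative and not part of the original snippet. It inspects the index labels that describe() produces for a datetime Series.

```python
import numpy as np
import pandas as pd

# Same three dates as in the snippet above
s = pd.Series(
    [
        np.datetime64("2020-01-01"),
        np.datetime64("2020-01-02"),
        np.datetime64("2020-01-03"),
    ]
)

# describe() returns a Series whose index holds the statistic names
summary = s.describe()
labels = list(summary.index)
print(labels)

# Assumption: pandas >= 2.0, where datetime data is summarized like numeric data.
# The labels mentioned in the Notes section should then be absent.
assert "first" not in labels and "last" not in labels
# The earliest and latest timestamps are reported under min and max instead.
assert summary["min"] == pd.Timestamp("2020-01-01")
assert summary["max"] == pd.Timestamp("2020-01-03")
```

If these assertions pass, the earliest and latest timestamps are exposed as min and max, which is consistent with updating only the documentation.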
In Are |
It appears to me that pandas/pandas/core/methods/describe.py Lines 322 to 323 in 9e0b655
I think we're good just updating the docs.
take
Pandas version checks
I have checked that the issue still exists on the latest versions of the docs on main here
Location of the documentation
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html
Documentation problem
The Notes section for describe states the following (emphasis mine):
Since pandas 2.0 began treating Timestamps as numeric data, as far as I can tell, calling describe on a Series/DF with Timestamp data no longer yields the first or last rows. In fact, the example included in the documentation also has this behavior:
Suggested fix for documentation
Assuming this behavior is intended: remove mention of the first and last columns, and of timestamps as object data.