-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Add simple test for GH 28448 #29269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add simple test for GH 28448 #29269
Conversation
], | ||
) | ||
def test_consistent_casting(self, data, dtype, expected): | ||
# GH 28448 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OP uses asarray
which is different than what is being tested here. Can you not do asarray
in the test? The return type should be a date time array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in, pandas.arrays.DatetimeArray
? pd.Categorical(data).astype(dtype)
is only of that type in the second case (when dtype
is datetime64[ns, MET]
).
In the first case (when dtype
is datetime64[ns]
), it is of type np.array
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm well I think that is a problem that as type would return a numpy array for "datetime64[ns]" and a DatetimeArray for "datetime64[ns, ]" but @jbrockmendel or @TomAugspurger might know more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't totally follow the discussion but it seems like #28448 is about Categorical.astype(datetime64[ns, tz])
, which return a DatatimeArray.
So I think the test should be something like
result = pd.Categorical(data).astype(dtype)
tm.assert_equal(result, expected)
no asarray.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right I think so too. But the problem is Categorical.astype("datetime64[ns]")
returns a NumPy array, so tz-naive date times would be a numpy array where tz-aware would be a DTA.
That seems odd to me but I don't know the entire history of these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory Categorical.astype('datetime64[ns]')
could return a tz-naive DatetimeArray. I'm not sure if it was discussed explicitly when DatetimeArray was added, but changing that would be API-breaking and IMO not worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillAyd I tend to agree with everyone above in that this is sub-optimal behavior but not worth changing on its own*. Is this a problem or A Problem?
* There's been some discussion of making EA.astype always return EA at some point, which I think would include this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you guys don't think so then no. I just wasn't aware of the historical context hence the ping - thanks for the insights.
So @MarcoGorelli still should update the tests to use asarray
but vary expectation accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks all for your feedback - I've updated the tests
"data, dtype, expected", | ||
[ | ||
( | ||
"2015-01-01", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note - since data
is always the same no need to parametrize
… types (datetimearray and np.array) are used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm @TomAugspurger or @jbrockmendel feel free to merge if it looks good to you as well
Won't have a chance to look at this anytime soon, so don't wait for me.
…On Wed, Oct 30, 2019 at 12:12 PM William Ayd ***@***.***> wrote:
***@***.**** approved this pull request.
lgtm @TomAugspurger <https://github.com/TomAugspurger> or @jbrockmendel
<https://github.com/jbrockmendel> feel free to merge if it looks good to
you as well
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#29269?email_source=notifications&email_token=AAKAOISNDNBFE2PODAJQIXLQRG6AJA5CNFSM4JGLPMHKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCJYLSNY#pullrequestreview-309377335>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAKAOIWIA6W6KI2GKWEN4WDQRG6AJANCNFSM4JGLPMHA>
.
|
( | ||
"datetime64[ns, MET]", | ||
pd.arrays.DatetimeArray( | ||
pd.array([pd.Timestamp("2015-01-01 00:00:00+0100", tz="MET")]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pd.DatetimeIndex([pd.Timestamp("2015-01-01 00:00:00+0100", tz="MET")]).array
would be more succint. would it be any less clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's nicer, thanks for the suggestion - I've made the change
small comment, not a blocker. lgtm |
thanks @MarcoGorelli |
Categorical
to 'datetime64[ns, MET]', but can do it with Series #28448black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff