-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: DTI.value_counts doesnt preserve tz #7735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -186,6 +186,8 @@ def value_counts(values, sort=True, ascending=False, normalize=False, | |||
convenience for pd.cut, only works with numeric data | |||
dropna : boolean, default True | |||
Don't include counts of NaN | |||
has_inat : boolean, default False | |||
Handle int dtype as datetime representation and remove iNaT using dropna |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't seem necessary and is very confusing, just use dropna indicator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it OK to always drop iNaT
even if input is normal int64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u know the dtype of the index so u can so this only if ts a date like (period or DatetimeIndex)
eg look in the very next elif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and u can simply mask the results (if dropna) then box
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, how about this?
better |
rebase this |
@sinhrks pls rebase |
@jreback Rebased. |
can you run a perf test on this? |
Attached.
|
yep, agreed. ping when green |
@jreback Now green |
@@ -544,3 +544,20 @@ def _add_delta(self, other): | |||
return NotImplemented | |||
|
|||
|
|||
def value_counts(self, normalize=False, sort=True, ascending=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't want to combine with the regular value_counts (in the OpsMixin)? (too hard?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible. Actually current logic has if-else clause, not much difference putting it to IndexOpsMixin...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k...just trying to avoid dup code as much as possible...lmk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, fixed. Pls check.
perfect, ping when green |
BUG: DTI.value_counts doesnt preserve tz
Found 2 problems related to
value_counts
.DatetimeIndex.value_counts
loses tz.PeriodIndex.value_counts
results inInt64Index
, and unable to dropNaT
.