FIX value_counts should skip NaT #7424

hayd · 2014-06-11T00:00:33Z

hayd · 2014-06-11T00:18:38Z

Hmm this seems to be explicitly tested for in test_base and test_timeseries. I still think correct behaviour is to drop the NaTs...

sinhrks · 2014-06-11T11:57:49Z

The test was added based on the discussion in #6734. I agree it will be clearer if NaT is treated as the same as nan.

jreback · 2014-06-11T12:36:33Z

pandas/core/algorithms.py

        values = values.view(np.int64)
        keys, counts = htable.value_count_int64(values)

+        from pandas.lib import NaT
+        msk = keys != NaT.value


jreback · 2014-06-11

You should be comparing against iNaT, no?

hayd · 2014-06-13

What's iNaT? (NaT.value == -9223372036854775808.)

jreback · 2014-06-13

It's the same value, but just use pandas.tslib.iNaT; that is the convention.
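To make the masking step in the diff above concrete, here is a minimal NumPy-only sketch. It assumes that the NaT sentinel is the smallest int64 value (the number quoted in the thread) and uses `np.unique` as a stand-in for the internal `htable.value_count_int64` helper, so it illustrates the idea rather than reproducing the patch. The name `INAT` is introduced here for illustration.

```python
import numpy as np

# Assumption: NaT is stored as the minimum int64, matching the value quoted above.
INAT = np.iinfo(np.int64).min

values = np.array(
    ["2014-06-11", "NaT", "2014-06-11", "2014-06-12", "NaT"],
    dtype="datetime64[ns]",
)

# View the datetimes as raw int64, as the patched code does.
ints = values.view(np.int64)

# Stand-in for htable.value_count_int64: unique keys and their counts.
keys, counts = np.unique(ints, return_counts=True)

# Drop the NaT key by comparing against the sentinel.
mask = keys != INAT
keys, counts = keys[mask], counts[mask]

# Convert the surviving keys back to datetimes for display.
result = dict(zip(keys.view("datetime64[ns]"), counts))
print(result)
```

Comparing against a named sentinel rather than `NaT.value` keeps the check consistent with other places that work on the int64 view.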

jreback · 2014-06-11T12:57:58Z

So this makes it consistent with nan (which is excluded). Separately, maybe add an option to value_counts to include nan/NaT?

hayd · 2014-06-11T16:46:05Z

@jreback yes, it shouldn't be too tricky to put something together. I was thinking the same when I was doing this... some analogous include_na (get_dummies has one, called dummy_na)... but I suppose dropna is the more standard kwarg.

jreback · 2014-06-11T16:48:29Z

yes, dropna=True (as the default) is prob correct. Pls add this. I think we might want to change the default to dropna=False in 0.15.0 but that is a separate issue.
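A short sketch of how the proposed `dropna` keyword behaves on a datetime Series, assuming a pandas version that includes this change and keeps `dropna=True` as the default. The sample data is made up for illustration.

```python
import pandas as pd

# One missing timestamp (None becomes NaT) among three real ones.
s = pd.Series(pd.to_datetime(["2014-06-11", None, "2014-06-11", "2014-06-12"]))

# Default behaviour: NaT is skipped, just like NaN in a float Series.
dropped = s.value_counts()

# Opt in to counting missing values as their own entry.
kept = s.value_counts(dropna=False)

print(dropped)
print(kept)
```

With `dropna=False` the counts sum to the length of the Series, which is a quick way to check how many values were missing.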

jreback · 2014-06-14T13:13:23Z

@hayd looks good...squash and merge

FIX value_counts should skip NaT

jreback · 2014-06-17T12:01:41Z

thanks!

jorisvandenbossche · 2014-06-17T12:51:17Z

pandas/core/algorithms.py

@@ -184,6 +184,8 @@ def value_counts(values, sort=True, ascending=False, normalize=False,
    bins : integer, optional
        Rather than count values, group them into half-open bins,
        convenience for pd.cut, only works with numeric data
+    dropna : boolean, default False


I think this should be True?

jorisvandenbossche · 2014-06-17T12:55:33Z

@hayd some comments, but will fix it in a PR

hayd · 2014-06-17T15:21:32Z

@jorisvandenbossche thanks! good points!

FIX value_counts should skip NaT pandas-dev#7423

458dc94

jreback reviewed Jun 11, 2014

jreback added Bug labels Jun 11, 2014

jreback added this to the 0.14.1 milestone Jun 11, 2014

jreback added Timedelta and removed Numeric labels Jun 11, 2014

hayd mentioned this pull request Jun 13, 2014

Feature request: Option to include NaNs in value_counts() #5569

Closed

hayd added 2 commits June 13, 2014 13:16

ENH dropna for value counts, fix tests for NaT

f4bb5d7

DOC add value_counts dropna to release notes

26139f3

jreback added a commit that referenced this pull request Jun 17, 2014

Merge pull request #7424 from hayd/value_counts_NaT

faf471d

FIX value_counts should skip NaT

jreback merged commit faf471d into pandas-dev:master Jun 17, 2014

jorisvandenbossche reviewed Jun 17, 2014

jorisvandenbossche mentioned this pull request Jun 17, 2014

DOC: fix docstring of value_counts/nunique dropna argument after GH7424 #7483

Merged

hayd deleted the value_counts_NaT branch June 17, 2014 15:21

sinhrks mentioned this pull request Jul 17, 2014

nunique is slower than len(set(x.dropna())) for smaller Series. #7771

Closed
