value_counts() can now compute relative frequencies. #2710

lexual · 2013-01-19T23:49:17Z

No description provided.

wesm · 2013-01-19T23:54:18Z

Can you please add a test case?

lexual · 2013-01-20T00:55:10Z

Test case now provided ;)

wesm · 2013-01-22T05:02:24Z

Deferring until the next release. We need to think about whether relative or normalize is the right argument name (should consult other sources for guidance).

hayd · 2013-02-10T22:34:04Z

I think normalize is more appropriate here. An individual frequency may be relative (hence the phrase "Relative frequency histogram"), but the overall is normalized. [citation needed]

lexual · 2013-02-10T22:55:25Z

Here's what matplotlib & numpy do:

http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist
has a "normed" argument.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html
has a "normed" argument.
Apparently this is being deprecated for numpy 2.0, which will use the "density" argument instead.

Both of the above links are worth reading for further clarification of how they handle this.
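As a rough illustration of the distinction those docs draw, here is a minimal sketch using numpy's `density` argument. The sample data and bin edges are made up for the example. With unequal bin widths, the density values are not the relative frequencies; they only become so after multiplying by the bin widths.

```python
import numpy as np

# Made-up sample data and deliberately unequal bin edges (illustrative only).
data = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 7.0, 8.0])
edges = [0.0, 2.5, 5.0, 10.0]

# Raw counts per bin.
counts, edges_out = np.histogram(data, bins=edges)

# density=True scales the counts so the histogram integrates to 1
# over the full range, i.e. count / (total * bin_width).
density, _ = np.histogram(data, bins=edges, density=True)

widths = np.diff(edges_out)

# Relative frequency per bin: the share of observations in each bin.
relative = counts / counts.sum()

print(counts)            # raw counts
print(relative)          # shares that sum to 1
print(density * widths)  # matches the relative frequencies
```

For discrete categories there is no bin width to divide by, which is one reason a keyword like normalize or relative may read more naturally than density for value_counts.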

hayd · 2013-02-11T12:33:28Z

It seems a bit weird to talk about "density" with the discrete-sounding value_counts, but I can see how it could make other things consistent... density would certainly be a good addition when bucketing.

normed is better than normalize (and has the same meaning)

ghost · 2013-03-14T01:49:30Z

Barring loud outcry, I plan to merge this using "normalize" as the keyword very soon.
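For readers skimming the thread, a minimal sketch of what the "normalize" keyword looks like in use, assuming a pandas version that includes this change. The sample Series is invented for illustration.

```python
import pandas as pd

# Invented sample data for illustration.
s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Default behaviour: absolute counts per distinct value.
counts = s.value_counts()

# With normalize=True: each count divided by the total number of
# (non-null) values, so the results sum to 1.
freqs = s.value_counts(normalize=True)

print(counts)
print(freqs)
```

The normalized result is the plain count divided by the total number of observations, so it can always be reproduced by hand as `counts / counts.sum()`.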

wesm · 2013-03-14T01:51:47Z

+1

ghost · 2013-03-14T03:57:30Z

Thanks @lexual, merged as b143a21. Apologies for the delay.

Saw your django repo - good feature. Suggest you see if there's common ground with #2717.

lexual · 2013-03-14T04:13:29Z

Thanks @y-p .

Do you have an opinion on whether something like https://github.com/lexual/pandas-love-ponies should live outside Pandas or inside Pandas? I never got any responses to my post to the mailing list a month ago.

ghost · 2013-03-14T04:28:30Z

My opinion is that pandas is in constant feature-creep jeopardy, and this
is not close to home. However, playing nice with SQL, especially under
a unified interface through SQLAlchemy, would be a win, so if django
fits in there as a "flavor", 👍 , else, probably not core material.

others may disagree.

lexual · 2013-03-14T04:35:26Z

My code is completely Django-specific, so I imagine it would be separate from the SQLAlchemy work (although I haven't had a close look at the SQLAlchemy stuff). So you're probably right that it should live outside of Pandas unless it starts getting widespread use.

value_counts() can now compute relative frequencies.

8cb32a7

ghost closed this Mar 14, 2013

This pull request was closed.
