value_counts() can now compute relative frequencies. #2710

Closed
lexual wants to merge 1 commit

Conversation

lexual
Contributor

@lexual lexual commented Jan 19, 2013

No description provided.

@wesm
Member

wesm commented Jan 19, 2013

Can you please add a test case?

@lexual
Contributor Author

lexual commented Jan 20, 2013

Test case now provided ;)

@wesm
Member

wesm commented Jan 22, 2013

Deferring until the next release. We need to think about whether "relative" or "normalize" is the right argument name (we should consult other sources for guidance).

@hayd
Contributor

hayd commented Feb 10, 2013

I think normalize is more appropriate here. An individual frequency may be relative (hence the phrase "Relative frequency histogram"), but the overall distribution is normalized. [citation needed]

@lexual
Contributor Author

lexual commented Feb 10, 2013

Here's what matplotlib & numpy do:

http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist
has a "normed" argument.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html
has a "normed" argument.
Apparently this is being deprecated for numpy 2.0, which will use the "density" argument instead.

Both of the above links are worth reading for further clarification of how they handle this.
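
For readers comparing the conventions, here is a minimal sketch of what numpy's "density" argument does, assuming a numpy version where `numpy.histogram` accepts `density=True`. The sample data and the choice of three bins are made up for illustration. Note that density is scaled so the histogram integrates to 1 over the bin widths, which is not the same as relative frequencies summing to 1.

```python
import numpy as np

# Illustrative data only (not from the pull request).
data = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 3.0])

# Raw counts per bin.
counts, edges = np.histogram(data, bins=3)

# Density: counts divided by (total count * bin width).
density, _ = np.histogram(data, bins=3, density=True)

# Area under the density histogram (should be 1 up to rounding).
area = (density * np.diff(edges)).sum()

print(counts)
print(density)
print(area)
```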

@hayd
Contributor

hayd commented Feb 11, 2013

It seems a bit weird to talk about "density" with the discrete-sounding value_counts, but I can see how it could make other things consistent... density would certainly be a good addition when bucketing.

normed is better than normalize (and has the same meaning)

@ghost

ghost commented Mar 14, 2013

Barring loud outcry, I plan to merge this using "normalize" as the keyword very soon.
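
As a rough illustration of the behaviour under discussion, the sketch below assumes a pandas version in which `Series.value_counts` accepts the "normalize" keyword. The sample Series is made up for illustration and is not taken from the pull request's test case.

```python
import pandas as pd

# Illustrative data only.
s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Absolute counts of each distinct value.
counts = s.value_counts()

# Relative frequencies: each count divided by the total number of values.
freqs = s.value_counts(normalize=True)

print(counts)
print(freqs)
```

With `normalize=True` the returned values sum to 1, so each entry is the share of the Series taken up by that value.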

@wesm
Member

wesm commented Mar 14, 2013

+1

@ghost

ghost commented Mar 14, 2013

Thanks @lexual, merged as b143a21. Apologies for the delay.

Saw your django repo - good feature. Suggest you see if there's common ground with #2717.

@ghost ghost closed this Mar 14, 2013
@lexual
Contributor Author

lexual commented Mar 14, 2013

Thanks @y-p.

Do you have an opinion on whether something like https://github.com/lexual/pandas-love-ponies should live outside Pandas or inside Pandas? I never got any responses to my post to the mailing list a month ago.

@ghost

ghost commented Mar 14, 2013

My opinion is that pandas is in constant feature-creep jeopardy, and this
is not close to home. However, playing nice with SQL, especially under
a unified interface through SQLAlchemy, would be a win, so if django
fits in there as a "flavor", 👍; else, probably not core material.

Others may disagree.

@lexual
Contributor Author

lexual commented Mar 14, 2013

My code is completely Django-specific, so I imagine it would be separate from the SQLAlchemy work (although I haven't had a close look at the SQLAlchemy stuff). So you're probably right that it should live outside of Pandas unless it starts getting widespread use.

This pull request was closed.

3 participants