ENH: Rename get_dummies to more inclusive language #48250
Comments
Hey @davidcavazos, I'd also be favourable to renaming it, but this was discussed in #35724 and rejected at the time.
|
Closing for now, then. If the world does converge on an alternative, which would allow #35724 to be reconsidered, then we can go with that. |
The world won't converge on an alternative if we don't start doing the change. GitHub's |
My takeaway from the previous discussion is that adding a separate
It seems the former was rejected, but the latter could be acceptable. Could we pursue that approach? (Also, would you prefer for us to continue discussion in #35724?) |
A non-Google reference for "dummy" being non-inclusive: https://itconnect.uw.edu/guides-by-topic/identity-diversity-inclusion//inclusive-language-guide/
|
That's a nice reference, thanks!
Yes please, let's keep the discussion in one place - perhaps post the reference you linked above there? |
Thanks @TheNeuralBit. I think that either I would personally create the new name and mark |
The University of Delaware also has a similar (although shorter) list, which includes removing "dummy value": |
And here are some more examples: And even the National Institute of Standards and Technology |
This document also mentions how it causes harm:
|
Thanks @davidcavazos, appreciate the references - could you post them in #35724 please so we keep the discussion in one place? |
Sure, I just posted a summary of the key points. |
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
The word "dummy" from the pd.get_dummies function can be offensive to some people and should be renamed. It's marked as a word that should not be used by Google's inclusive language word list.
Feature Description
A good alternative name could be pd.get_indicator_variables, which would also be more explicit about what it does.
Alternative Solutions
Alternatively, pd.get_one_hot or pd.get_one_hot_encoded could also be an option familiar to Machine Learning practitioners.
Google trends show "indicator variable" and "one-hot encoding" to be similarly popular, with "indicator variable" being slightly more popular.
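To make the proposal concrete, here is a minimal sketch of how a rename could be staged: the new name does the work, and the old name stays available as a thin alias that warns. It uses plain Python lists and a toy encoder, not pandas itself. Both functions below are hypothetical stand-ins, and the choice of FutureWarning is an assumption, not a decision from this thread.

```python
import warnings


def get_indicator_variables(values):
    """Toy stand-in (not pandas): one 0/1 indicator list per distinct value."""
    categories = sorted(set(values))
    return {cat: [int(v == cat) for v in values] for cat in categories}


def get_dummies(values):
    """Old name kept as a thin alias that warns, then delegates to the new name."""
    warnings.warn(
        "get_dummies is deprecated in this sketch; use get_indicator_variables",
        FutureWarning,  # assumed warning category, for illustration only
        stacklevel=2,
    )
    return get_indicator_variables(values)


colors = ["red", "green", "red"]
encoded = get_indicator_variables(colors)
print(encoded)  # {'green': [0, 1, 0], 'red': [1, 0, 1]}

# Calling the old name still works, but emits a FutureWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = get_dummies(colors)
```

With an alias like this, existing code calling the old name would keep working and see a warning, so users could migrate gradually.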
Additional Context
No response