Skip to content

CI,STYLE: GH38802 narrow down list of ignored words #38820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

CI,STYLE: GH38802 narrow down list of ignored words #38820

wants to merge 2 commits into from

Conversation

chinggg
Copy link
Contributor

@chinggg chinggg commented Dec 30, 2020

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chinggg , this looks correct!

Regarding your comment in the closed PR, git is hard for everyone at the beginning - I found the first 3 chapters of this book to be invaluable in learning it https://git-scm.com/book/en/v2


Will merge when this is green, if you want to try similarly narrowing down other words, then that might well be possible

--

For the future, I'd advise creating a new branch when opening a PR, see the contributing guide https://pandas.pydata.org/pandas-docs/dev/development/contributing.html

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to go back on this, but I think we can do better

The changes you've made for datas are fine, but it might be better to reverse the ones for fox/nox, as fox as at least is a real animal.

The way to do that is:

  1. add a new file called .codespellignorelines
  2. in that file, add the following:
        1    fo
  1. in setup.cfg, add the following to the codespell section so it looks like this:
[codespell]
ignore-words-list=ba,blocs,coo,hist,nd,ser
exclude-file=.codespellignorelines

Like this, it should be possible to get rid of some other words to ignore, such as ba. nd and hist are pretty common, and make sense to keep, but those like ba that just appear sporadically we could add to the .codespellignorelines file


Do let me know if any of this is unclear or if you want/need help with git commands

@jreback jreback added the Code Style Code style, linting, code_checks label Dec 30, 2020
@chinggg
Copy link
Contributor Author

chinggg commented Dec 31, 2020

Thanks a lot for explaining patiently!
I hava realized the general working mechanism of codespell. That is very cool and must be useful in large projects like pandas. We should consider how frequently and commonly the words are used by programmers around the world, and only add those widely-accpeted acronyms or abbreviations to the ignore-words-list. Those temporary symbols should be put in the .codespellignorelines file instead.
Because I do not live in an english-speaking country, I am not sure about the frequency of these shorten words. So I use codespell to check where and how they show up in the code.
I found that

  • blocs may be a common abbr. for blocks
  • coo seems to be a mathematical term
  • nd stands for N-dimensional
  • ser is short for series
  • hist only has 4 results but it is also used by matplotlib so I consider it as a common practice.
    So I decide to remove only ba and datas(already done by your instruction to rename datas to data).
    I make a new branch named codespell-ignore. Because the branch of this PR is master and seems to be unchangeable, so I have to make a new PR again. I hope it will not be improper.

@MarcoGorelli
Copy link
Member

Sure, that's fine, thanks!

And yes, I agree with your assessment of which words to keep in ignore-words-list

@MarcoGorelli
Copy link
Member

I have to make a new PR again. I hope it will not be improper.

Closing this one in favour of the new PR then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants