
DOC: sample codes should not override Python built-in functions #47606


Closed
1 task done
wany-oh opened this issue Jul 6, 2022 · 5 comments · Fixed by #47631

@wany-oh
Contributor

wany-oh commented Jul 6, 2022

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

For example, https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#transformation

Alternatively, the built-in methods could be used to produce the same outputs.

max = ts.groupby(lambda x: x.year).transform("max")

min = ts.groupby(lambda x: x.year).transform("min")

max - min

Documentation problem

The user guide should not include sample code that overrides (shadows) Python built-in functions.

This is the only instance I happened to find, but others may exist in the docs. If so, they should be revised as well.
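To illustrate why this matters, here is a minimal sketch of what happens once a name like max is rebound. It uses plain Python with a made-up list named values rather than the Series from the user guide, so it only approximates the situation in the docs.

```python
import builtins

values = [3, 1, 2]  # hypothetical data, standing in for the guide's Series

# Rebinding the name hides the built-in function from later code.
max = builtins.max(values)  # max is now the integer 3

try:
    max(values)  # the name no longer refers to a callable
    shadowed_call_failed = False
except TypeError:
    shadowed_call_failed = True

# The original function is still reachable through the builtins module.
largest = builtins.max(values)
```

A reader who pastes the guide's snippet into a session and later calls max(...) would hit the same TypeError.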

Suggested fix for documentation

In this case, max/min should be renamed to something like max_ts/min_ts.
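A sketch of the renamed snippet, assuming a small made-up Series for ts (the user guide builds its own, larger one). The names max_ts and min_ts are the ones suggested above; the data is hypothetical.

```python
import pandas as pd

# Hypothetical time series with two observations per year.
ts = pd.Series(
    [1.0, 3.0, 2.0, 5.0],
    index=pd.to_datetime(
        ["2000-01-01", "2000-06-01", "2001-01-01", "2001-06-01"]
    ),
)

# Same computation as the guide, without rebinding max or min.
max_ts = ts.groupby(lambda x: x.year).transform("max")
min_ts = ts.groupby(lambda x: x.year).transform("min")

spread = max_ts - min_ts

# The built-ins are still usable afterwards.
overall_spread = max(ts) - min(ts)
```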

@wany-oh added the Docs and Needs Triage labels Jul 6, 2022
@mroeschke
Member

Thanks for the report. Happy to have a pull request to adjust the docs here.

@mroeschke removed the Needs Triage label Jul 6, 2022
@Transurgeon

Sorry for the duplicate pull request. I didn't know @wany-oh would take care of it; I just did it in case.

@Transurgeon

I just added another commit.

I made some changes relating to this issue with example code overriding built-in functions/parameters.

In this case, using a variable called "columns" for the type and name of each column can create confusion with DataFrame columns.

I learned that this is also known as a "schema", so I renamed a few instances of it to "column_schema".

I used Linux grep to find all instances of this issue, and manually went through all of them to differentiate between the actual DataFrame.columns = "[foo]" and column = "[foo]".

You will notice that all my changes have two small parts:

  1. I changed the variable declaration
  2. I changed the assignment of the variable from "columns(df)=columns" to "columns=column_schema"

Please let me know if this is ok.
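The rename described above can be sketched roughly as follows. The schema, the row data, and the use of pd.DataFrame are hypothetical stand-ins chosen for illustration, not the actual lines touched in the pull request.

```python
import pandas as pd

# Hypothetical schema: column name -> dtype.
# Previously this variable would have been called "columns".
column_schema = {"id": "int64", "score": "float64"}

rows = [(1, 0.5), (2, 1.5)]  # made-up data

# The keyword argument keeps its name; only the variable is renamed.
df = pd.DataFrame(rows, columns=list(column_schema)).astype(column_schema)
```

With the rename, column_schema refers to the declared names and types, while df.columns remains the DataFrame attribute.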

@wany-oh
Contributor Author

wany-oh commented Jul 8, 2022

> Sorry for the duplicate pull request. I didn't know @wany-oh would take care of it; I just did it in case.

We opened them at almost the same time. Thank you for the follow-up.

@phofl
Member

phofl commented Jul 8, 2022

I disagree with that. In my opinion, giving a variable the same name as the argument associates the two with each other.


4 participants