Skip to content

DOC: Set value for undefined variables in examples #51342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
datapythonista opened this issue Feb 12, 2023 · 14 comments · Fixed by #51389
Closed

DOC: Set value for undefined variables in examples #51342

datapythonista opened this issue Feb 12, 2023 · 14 comments · Fixed by #51389

Comments

@datapythonista
Copy link
Member

xref #22900

Ideally we'd like to make all the examples in our documentation runnable. Seems like in some cases we're using variables without defining them. We should fix them if they are typos, or unimported modules (except np and pd), or we should provide a value if it's just a variable we didn't define.

We can get the list of problems with the next command:

$ flake8 --doctest --ignore="" --select=F821 pandas/ | grep -v "F821 undefined name 'pd'" | grep -v "F821 undefined name 'np'"
pandas/core/apply.py:1339:9: F821 undefined name '_relabel_result'
pandas/core/apply.py:1339:33: F821 undefined name 'func'
pandas/core/arrays/datetimelike.py:262:13: F821 undefined name 'self'
pandas/core/dtypes/base.py:266:27: F821 undefined name 're'
pandas/core/generic.py:2247:17: F821 undefined name 'df'
pandas/core/generic.py:5890:13: F821 undefined name 'func'
pandas/core/generic.py:5890:18: F821 undefined name 'g'
pandas/core/generic.py:5890:20: F821 undefined name 'h'
pandas/core/generic.py:5890:22: F821 undefined name 'df'
pandas/core/generic.py:5890:32: F821 undefined name 'a'
pandas/core/generic.py:5890:41: F821 undefined name 'b'
pandas/core/generic.py:5890:49: F821 undefined name 'c'
pandas/core/generic.py:5894:14: F821 undefined name 'df'
pandas/core/generic.py:5894:22: F821 undefined name 'h'
pandas/core/generic.py:5895:22: F821 undefined name 'g'
pandas/core/generic.py:5895:30: F821 undefined name 'a'
pandas/core/generic.py:5896:22: F821 undefined name 'func'
pandas/core/generic.py:5896:33: F821 undefined name 'b'
pandas/core/generic.py:5896:41: F821 undefined name 'c'
pandas/core/generic.py:5903:14: F821 undefined name 'df'
pandas/core/generic.py:5903:22: F821 undefined name 'h'
pandas/core/generic.py:5904:22: F821 undefined name 'g'
pandas/core/generic.py:5904:30: F821 undefined name 'a'
pandas/core/generic.py:5905:23: F821 undefined name 'func'
pandas/core/generic.py:5905:43: F821 undefined name 'a'
pandas/core/generic.py:5905:51: F821 undefined name 'c'
pandas/io/formats/style.py:1873:25: F821 undefined name 'ret'
pandas/io/formats/style_render.py:1279:49: F821 undefined name 'upper'
@kkangs0226
Copy link
Contributor

take

@MarcoGorelli
Copy link
Member

Hey

Wouldn't fixing EX02 also end up fixing these?

@datapythonista
Copy link
Member Author

Yes, good point, thanks @MarcoGorelli

I'd say that fixing the errors here will also help fix the docstring validations EX02 and EX03. I created another issue with the remaining errors in the examples. I think we can add the validation once both are completed.

@kkangs0226 I created two issues so two people could work in parallel, since these issues aren't trivial and will take some time. Please don't assign two issues to yourself at the same time, so other people can work on the one you aren't working yet.do you mind releasing the one you aren't workimg in yet? Thank you!

@kkangs0226
Copy link
Contributor

@datapythonista Got it, I was looking at both issues because they looked related. I will release this issue for other developers to work on it!

@datapythonista
Copy link
Member Author

@datapythonista Got it, I was looking at both issues because they looked related. I will release this issue for other developers to work on it!

No problem at all. And surely feel free to work in both, but I think they' may take longer than what it seems, so better to start with one leaving the other free in case someone else is available to work on it in the meantime. Thanks for helping with this!

@anshuman0123
Copy link

assign me please . I am new to github

@DeaMariaLeon
Copy link
Member

Ideally we'd like to make all the examples in our documentation runnable.

@datapythonista almost all the examples are runnable, as we have been working a lot to remove EX02 errors. There are only about 13 functions that still need fixing.
As flake8 errors go, I think we have to avoid the extra parentheses that have been added to a few functions.
On a side note, flake8 throws "PDF023 found assignment to single-letter variable". But the documentation guidelines tell us to use single-letter variables.

@MarcoGorelli
Copy link
Member

that's a good point - the sentence

When illustrating examples with a single Series use the name s

should probably be updated to

When illustrating examples with a single Series use the name ser

Regarding the "PDF023" - this one is from a flake8 extension which we're no longer using, having moved to ruff - to remove, you could either recreate your environment, or just do

pip uninstall pandas-dev-flaker -y

(or the mamba equivalent)

@DeaMariaLeon
Copy link
Member

@MarcoGorelli just remember that the letter "s" is all over the place.. I don't know how many examples there are with this, but it seems like a colosal change.
Also, if we remove the PDF023, does it mean we don't need to worry about this error, or does ruffcomplain also about it?

@MarcoGorelli
Copy link
Member

sure, maybe it's too large to change everywhere, but for new docstrings we could change the advice

and no, ruff doesn't complain about it (though technically flake8 is still used for docstrings, but without extra plugins installed)

@DeaMariaLeon
Copy link
Member

DeaMariaLeon commented Feb 26, 2023

Updated: Numpy, scikitlearn etc also have single letter names in the docstrings..

Also, I don’t know where on github to keep talking about this, as the same issue was opened several times.
With the PR #51634 to remove the parentheses, I only modified one of the functions of the original PR #51356.
When I locally test the other functions that were modified (with #51356), there are still flake8 errors. Plus they also have the extra parentheses - that could confuse beginners.

I understand that just one type of flake8 error needed to be fixed, as instructed in the original issue. But since we are not testing EX03 yet, the rest of flake8 errors didn't come up I think.

I’m attaching the errors I get. I'm not sure if I should open an issue.

PR51356 errors.txt

@MarcoGorelli
Copy link
Member

thanks for flagging this - I think #51356 was an improvement because it took us from invalid syntax such as

df.style.format(na_rep='MISS', precision=1, subset=[0])
    .format(na_rep='PASS', precision=2, subset=[1, 2])

to valid syntax such as

(df.style.format(na_rep='MISS', precision=1, subset=[0])
    .format(na_rep='PASS', precision=2, subset=[1, 2]))

Maybe these can be made more legibile, you're right, e.g.

df.style.format(na_rep="MISS", precision=1, subset=[0]).format(
    na_rep="PASS", precision=2, subset=[1, 2]
)

, if you wanted to open an issue about that (or directly make a PR), that'd be welcome

And if you wanted to open an issue about EX03 (similar to the current EX02/EX01 ones, which are gradually getting addressed - nice 🎉 ), that would also be welcome

@DeaMariaLeon
Copy link
Member

I will. Thanks for the explanations.

@MarcoGorelli
Copy link
Member

I think Marc makes a good point here that's it's not really worth changing

In general for these kinds of changes, it's probably more sustainable to either make changes as suggested by some automated tool, or to leave them, else it's a matter of taste and we end up ping-ponging between preferences

Thanks anyway for your comments and for engaging in PR review though, much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants