DOC iteritems docstring update and examples #22658
Changes from 10 commits
0521552
5447a1f
8ccd554
fcc27e8
6dad21c
25da7f8
30026a4
5110b7c
76243b3
1b52a08
618318c
d8e5370
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -778,14 +778,55 @@ def style(self): | |
return Styler(self) | ||
|
||
def iteritems(self): | ||
""" | ||
r""" | ||
Iterator over (column name, Series) pairs. | ||
|
||
See also | ||
Iterates over the DataFrame columns, returning a tuple with the column name and the content as a Series. | ||
This doesn't fit within 80 chars line length limit (PEP8), I assume the CI is failing due to that. There are also some other lines below that are too long. I would recommend to activate a flake8 / linter plugin in your editor, or run |
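The line-length point in the comment above can be checked without a linter. The following is only a sketch: it assumes PEP 8's 79-character cap (also flake8's default) and an 8-space docstring indent for a method inside a class, and it uses the standard-library `textwrap` module rather than any pandas tooling.

```python
import textwrap

MAX_LINE_LENGTH = 79  # PEP 8 cap; assumed to match the CI linter setting
INDENT = " " * 8      # assumed indent of a docstring inside a method

# The summary line from the diff, with the assumed indentation.
line = (INDENT + "Iterates over the DataFrame columns, returning a tuple "
        "with the column name and the content as a Series.")

# True when the line would be flagged as too long.
too_long = len(line) > MAX_LINE_LENGTH

# Re-wrap the text so every piece, indent included, fits the limit.
wrapped = textwrap.wrap(
    line.strip(),
    width=MAX_LINE_LENGTH,
    initial_indent=INDENT,
    subsequent_indent=INDENT,
)

for piece in wrapped:
    print(piece)
```

Running a linter remains the more reliable option, since it also catches the other over-long lines the comment mentions.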
||
|
||
Yields | ||
------ | ||
label : object | ||
The column names for the DataFrame being iterated over. | ||
content : Series | ||
The column entries belonging to each label, as a Series. | ||
@datapythonista I think in this case, the above is actually a bit confusing. Typically, we use the formatting above if there are actually two return values (so if you could do

Looks clear to me, not sure why it doesn't to you. In this case you can do Not a big deal changing this to a What do you think is clearer for you @Ecboxer? Also, maybe @WillAyd wants to give an opinion, as he's doing a lot with the docstrings? Happy with whatever option is clearer to most people.

Let me know what you think of the rephrasing under Yields. It may be too wordy?

The thing is that otherwise we are using the same visual formatting to mean two different things. I would prefer that a user can know from the return type whether there is a single return value or multiple return values (but maybe I am overestimating our users?). We can maybe still combine both, something like:
or does that only make it more complicated?

Ah sorry, I missed that it was a "Yields" section, and not a "Returns" section. In that case, it is correct that it yields two values in each iteration! (And how you did it here is consistent with the numpydoc guidelines.)

@Ecboxer sorry, you can change it back to how it was before I commented :-)

Hehe, I see what you meant now. Cool then. :)

Changed it back :) |
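The distinction the thread settles on (two names under "Yields" describe the two values produced on each iteration) can be illustrated with a small stand-alone generator. This is a sketch only: `iter_columns` and its dict-based `table` are hypothetical stand-ins, not pandas code.

```python
from typing import Dict, Iterator, List, Tuple

Pair = Tuple[str, List[int]]


def iter_columns(table: Dict[str, List[int]]) -> Iterator[Pair]:
    """
    Iterate over (column name, values) pairs.

    Hypothetical stand-in for ``DataFrame.iteritems``.

    Yields
    ------
    label : str
        The column name.
    content : list of int
        The values stored under that name.
    """
    for label, content in table.items():
        # Each iteration yields one tuple holding two values.
        yield label, content


table = {"population": [1864, 22000], "legs": [4, 4]}

# The two names under "Yields" map onto unpacking inside the loop.
for label, content in iter_columns(table):
    print(label, content)

# Unpacking the call itself is different: it consumes the iterator,
# so each name receives a whole (label, content) pair.
first, second = iter_columns(table)
```

Here `first` is the pair for `"population"`, not a label, which is why the same layout under a "Returns" heading would read differently.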
||
|
||
See Also | ||
-------- | ||
iterrows : Iterate over DataFrame rows as (index, Series) pairs. | ||
itertuples : Iterate over DataFrame rows as namedtuples of the values. | ||
DataFrame.iterrows : Iterate over DataFrame rows as (index, Series) pairs. | ||
DataFrame.itertuples : Iterate over DataFrame rows as namedtuples of the values. | ||
|
||
Examples | ||
-------- | ||
>>> df = pd.DataFrame({'species': ['bear', 'bear', 'bear', 'bear', 'marsupial'], | ||
... 'population': [300000, 200000, 1864, 22000, 80000]}, | ||
... index=['black', 'brown', 'panda', 'polar', 'koala']) | ||
I would make it a little bit shorter (e.g. 3 rows seems enough); that will make the output below clearer, I think |
||
>>> df | ||
species population | ||
black bear 300000 | ||
brown bear 200000 | ||
panda bear 1864 | ||
polar bear 22000 | ||
koala marsupial 80000 | ||
>>> for label, content in df.iteritems(): | ||
... print('label:', label) | ||
... print('content:', content, sep='\n') | ||
... | ||
label: species | ||
content: | ||
black bear | ||
brown bear | ||
panda bear | ||
polar bear | ||
koala marsupial | ||
Name: species, dtype: object | ||
label: population | ||
content: | ||
black 300000 | ||
brown 200000 | ||
panda 1864 | ||
polar 22000 | ||
koala 80000 | ||
Name: population, dtype: int64 | ||
""" | ||
if self.columns.is_unique and hasattr(self, '_item_cache'): | ||
for k in self.columns: | ||
|
Technically this returns an Iterator (or more specifically a Generator). So I would leave this as Iterator or change this to Generator.

I thought to make the change to 'Iterate over DataFrame ... as (..., Series) pairs' to stay within the style of the iterrows and itertuples functions, but I can revert back to 'Iterator over (column name, Series) pairs'.
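The Iterator versus Generator point can be checked with the standard library alone. A minimal sketch, using a hypothetical toy function `pairs` rather than the pandas method itself:

```python
import types
from collections.abc import Generator, Iterator


def pairs(columns):
    """Yield (position, name) pairs; a hypothetical toy function."""
    for position, name in enumerate(columns):
        yield position, name


result = pairs(["species", "population"])

# Calling a function that contains ``yield`` returns a generator object.
is_generator = isinstance(result, types.GeneratorType)

# Every generator is an Iterator...
is_iterator = isinstance(result, Iterator)

# ...but not every Iterator is a Generator: a list iterator has no
# send(), throw(), or close() methods.
list_iter_is_generator = isinstance(iter([1, 2]), Generator)

print(is_generator, is_iterator, list_iter_is_generator)
# prints: True True False
```

Both "Iterator" and "Generator" are therefore accurate for a function written with `yield`; "Generator" is simply the narrower term.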