Commit 0dc3c18

Updating docstring guide documentation with last changes
Author: Marc Garcia
1 parent 2bf537a commit 0dc3c18

1 file changed

+125
-57
lines changed

pandas/guide/source/pandas_docstring.rst

@@ -91,16 +91,17 @@ General rules
 ~~~~~~~~~~~~~

 Docstrings must be defined with three double-quotes. No blank lines should be
-left before or after the docstring. The text starts immediately after the
-opening quotes (not in the next line). The closing quotes have their own line
+left before or after the docstring. The text starts in the next line after the
+opening quotes. The closing quotes have their own line
 (meaning that they are not at the end of the last sentence).

 **Good:**

 .. code-block:: python

     def func():
-        """Some function.
+        """
+        Some function.

         With a good docstring.
         """
@@ -114,19 +115,18 @@ opening quotes (not in the next line). The closing quotes have their own line

     def func():

-        """
-        Some function.
+        """Some function.

         With several mistakes in the docstring.
-
+
         It has a blank line after the signature `def func():`.
-
-        The text 'Some function' should go in the same line as the
-        opening quotes of the docstring, not in the next line.
-
+
+        The text 'Some function' should go in the next line after the
+        opening quotes of the docstring, not in the same line.
+
         There is a blank line between the docstring and the first line
         of code `foo = 1`.
-
+
         The closing quotes should be in the next line, not in this one."""

     foo = 1
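
The layout rule changed in the hunks above can be checked mechanically. The following is a minimal sketch, assuming only the two rules stated in the guide (text starts on the line after the opening quotes, closing quotes on their own line). `layout_problems`, `good` and `bad` are hypothetical names used for illustration and are not part of pandas.

```python
def layout_problems(func):
    """
    Return a list of layout problems found in the docstring of `func`.

    Hypothetical helper, not part of pandas.
    """
    doc = func.__doc__
    if doc is None:
        return ["missing docstring"]
    problems = []
    # When the text starts on the line after the opening quotes, the raw
    # docstring begins with a newline.
    if not doc.startswith("\n"):
        problems.append("text starts on the same line as the opening quotes")
    # When the closing quotes have their own line, whatever follows the
    # last newline is only indentation (or nothing).
    last_line = doc.rsplit("\n", 1)[-1]
    if "\n" not in doc or last_line.strip():
        problems.append("closing quotes are not on their own line")
    return problems


def good():
    """
    Some function.

    With a good docstring.
    """


def bad():
    """Some function.

    The closing quotes should be in the next line, not in this one."""


print(layout_problems(good))
print(layout_problems(bad))
```

The check works on the raw `__doc__` string rather than on `inspect.getdoc`, because the latter strips the leading and trailing blank lines that the rule is about.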
@@ -150,7 +150,8 @@ details.
 .. code-block:: python

     def astype(dtype):
-        """Cast Series type.
+        """
+        Cast Series type.

         This section will provide further details.
         """
@@ -161,28 +162,32 @@ details.
 .. code-block:: python

     def astype(dtype):
-        """Casts Series type.
+        """
+        Casts Series type.

         Verb in third-person of the present simple, should be infinitive.
         """
         pass

     def astype(dtype):
-        """Method to cast Series type.
+        """
+        Method to cast Series type.

         Does not start with verb.
         """
         pass

     def astype(dtype):
-        """Cast Series type
+        """
+        Cast Series type

         Missing dot at the end.
         """
         pass

     def astype(dtype):
-        """Cast Series type from its current type to the new type defined in
+        """
+        Cast Series type from its current type to the new type defined in
         the parameter dtype.

         Summary is too verbose and doesn't fit in a single line.
@@ -201,10 +206,14 @@ go in other sections.
 A blank line is left between the short summary and the extended summary. And
 every paragraph in the extended summary is finished by a dot.

+The extended summary should provide details on why the function is useful and
+its use cases, if it is not too generic.
+
 .. code-block:: python

     def unstack():
-        """Pivot a row index to columns.
+        """
+        Pivot a row index to columns.

         When using a multi-index, a level can be pivoted so each value in
         the index becomes a column. This is especially useful when a subindex
@@ -239,19 +248,28 @@ can have multiple lines. The description must start with a capital letter, and
 finish with a dot.

 For keyword arguments with a default value, the default will be listed in brackets
-at the end of the description (before the dot). The exact form of the
-description in this case would be "Description of the arg (default is X).". In
-some cases it may be useful to explain what the default argument means, which
-can be added after a comma "Description of the arg (default is -1, which means
-all cpus).".
+at the end of the type. The exact form of the type in this case would be for
+example "int (default is 0)". In some cases it may be useful to explain what
+the default argument means, which can be added after a comma "int (default is
+-1, which means all cpus)".
+
+In cases where the default value is `None`, meaning that the value will not be
+used, instead of "str (default is None)" it is preferred to use "str, optional".
+When `None` is a value being used, we will keep the form "str (default None)".
+For example consider `.fillna(value=None)`, in which `None` is the value being
+used to replace missing values. This is different from
+`.to_csv(compression=None)`, where `None` is not a value being used, but means
+that compression is optional, and will not be used, unless a compression type
+is provided. In this case we will use `str, optional`.

 **Good:**

 .. code-block:: python

     class Series:
         def plot(self, kind, color='blue', **kwargs):
-            """Generate a plot.
+            """
+            Generate a plot.

             Render the data in the Series as a matplotlib plot of the
             specified kind.
@@ -260,8 +278,8 @@ all cpus).".
             ----------
             kind : str
                 Kind of matplotlib plot.
-            color : str
-                Color name or rgb code (default is 'blue').
+            color : str (default 'blue')
+                Color name or rgb code.
             **kwargs
                 These parameters will be passed to the matplotlib plotting
                 function.
@@ -274,7 +292,8 @@ all cpus).".

     class Series:
         def plot(self, kind, **kwargs):
-            """Generate a plot.
+            """
+            Generate a plot.

             Render the data in the Series as a matplotlib plot of the
             specified kind.
@@ -302,26 +321,33 @@ Parameter types
 ^^^^^^^^^^^^^^^

 When specifying the parameter types, Python built-in data types can be used
-directly:
+directly (the Python type is preferred to the more verbose string, integer,
+boolean, etc.):

 - int
 - float
 - str
 - bool

-For complex types, define the subtypes:
+For complex types, define the subtypes. For `dict` and `tuple`, as more than
+one type is present, we use the brackets to help read the type (curly brackets
+for `dict` and normal brackets for `tuple`):

-- list of [int]
+- list of int
 - dict of {str : int}
 - tuple of (str, int, int)
-- set of {str}
+- tuple of (str,)
+- set of str

-In case there are just a set of values allowed, list them in curly brackets
-and separated by commas (followed by a space). If one of them is the default
-value of a keyword argument, it should be listed first.:
+In cases where only a set of values is allowed, list them in curly
+brackets and separated by commas (followed by a space). If the values are
+ordinal and they have an order, list them in this order. Otherwise, list
+the default value first, if there is one:

 - {0, 10, 25}
 - {'simple', 'advanced'}
+- {'low', 'medium', 'high'}
+- {'cat', 'dog', 'bird'}

 If the type is defined in a Python module, the module must be specified:

@@ -357,7 +383,7 @@ last two types, that need to be separated by the word 'or':
 - float, decimal.Decimal or None
 - str or list of str

-If None is one of the accepted values, it always needs to be the last in
+If `None` is one of the accepted values, it always needs to be the last in
 the list.

 .. _docstring.returns:
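
To see the type notation from the hunks above in one place, here is a small sketch of a `Parameters` section that follows the added lines of the guide. The function `process` and all of its parameters are hypothetical and exist only to carry the docstring; the forms used ("list of int", "dict of {str : int}", an ordered set of values, "int (default is -1, which means all cpus)" and "str, optional") are the ones quoted in the diff.

```python
def process(values, weights, level='low', n_jobs=-1, compression=None):
    """
    Scale a list of values by the weight of a level.

    Hypothetical function, used only to illustrate the type notation.

    Parameters
    ----------
    values : list of int
        Values to scale.
    weights : dict of {str : int}
        Weight assigned to each level name.
    level : {'low', 'medium', 'high'}
        Level whose weight is applied.
    n_jobs : int (default is -1, which means all cpus)
        Number of jobs to run in parallel (unused in this sketch).
    compression : str, optional
        Compression type for the output (unused in this sketch).
    """
    # Levels without an explicit weight leave the values unchanged.
    factor = weights.get(level, 1)
    return [value * factor for value in values]


print(process(values=[1, 2], weights={'low': 3}))
```

The values of `level` are ordinal, so they are listed in their natural order, and `compression` uses "str, optional" because `None` means the option is not used rather than being a value in itself.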
@@ -384,7 +410,8 @@ For example, with a single value:
 .. code-block:: python

     def sample():
-        """Generate and return a random number.
+        """
+        Generate and return a random number.

         The value is sampled from a continuous uniform distribution between
         0 and 1.
@@ -401,7 +428,8 @@ With more than one value:
 .. code-block:: python

     def random_letters():
-        """Generate and return a sequence of random letters.
+        """
+        Generate and return a sequence of random letters.

         The length of the returned string is also random, and is also
         returned.
@@ -423,7 +451,8 @@ If the method yields its value:
 .. code-block:: python

     def sample_values():
-        """Generate an infinite sequence of random numbers.
+        """
+        Generate an infinite sequence of random numbers.

         The values are sampled from a continuous uniform distribution between
         0 and 1.
@@ -507,7 +536,9 @@ For example:

             See Also
             --------
-            tail : Return the last 5 elements of the Series.
+            Series.tail : Return the last 5 elements of the Series.
+            Series.iloc : Return a slice of the elements in the Series,
+                which can also be used to return the first or last n.
             """
             return self.iloc[:5]

@@ -637,15 +668,17 @@ one structure is needed, name them with something meaningful, for example
 `df_main` and `df_to_join`.

 Data used in the example should be as compact as possible. The number of rows
-is recommended to be 4, unless the example requires a larger number. As for
-example in the head method, where it requires to be higher than 5, to show
-the example with the default values.
-
-Avoid using data without interpretation, like a matrix of random numbers
-with columns A, B, C, D... And instead use a meaningful example, which makes
-it easier to understand the concept. Unless required by the example, use
-names of animals, to keep examples consistent. And numerical properties of
-them.
+is recommended to be around 4, but make it a number that makes sense for the
+specific example. For example in the `head` method, it needs to be higher
+than 5, to show the example with the default values. If doing the `mean`,
+we could use something like `[1, 2, 3]`, so it is easy to see that the
+value returned is the mean.
+
+For more complex examples (grouping for example), avoid using data without
+interpretation, like a matrix of random numbers with columns A, B, C, D...
+And instead use a meaningful example, which makes it easier to understand the
+concept. Unless required by the example, use names of animals, to keep examples
+consistent. And numerical properties of them.

 When calling the method, keyword arguments `head(n=3)` are preferred to
 positional arguments `head(3)`.
@@ -654,23 +687,58 @@ positional arguments `head(3)`.

 .. code-block:: python

-    def method():
-        """A sample DataFrame method.
+    class Series:
+        def mean(self):
+            """
+            Compute the mean of the input.

-        Examples
-        --------
-        >>> df = pd.DataFrame([389., 24., 80.5, numpy.nan]
-        ...                   columns=('max_speed'),
-        ...                   index=['falcon', 'parrot', 'lion', 'monkey'])
-        """
-        pass
+            Examples
+            --------
+            >>> s = pd.Series([1, 2, 3])
+            >>> s.mean()
+            2
+            """
+            pass
+
+
+        def fillna(self, value):
+            """
+            Replace missing values by `value`.
+
+            Examples
+            --------
+            >>> s = pd.Series([1, np.nan, 3])
+            >>> s.fillna(9)
+            [1, 9, 3]
+            """
+            pass
+
+        def groupby_mean(df):
+            """
+            Group by index and return mean.
+
+            Examples
+            --------
+            >>> s = pd.Series([380., 370., 24., 26],
+            ...               name='max_speed',
+            ...               index=['falcon', 'falcon', 'parrot', 'parrot'])
+            >>> s.groupby_mean()
+            falcon 375.
+            parrot 25.
+            """
+            pass

 **Bad:**

 .. code-block:: python

     def method():
-        """A sample DataFrame method.
+        """
+        A sample DataFrame method.
+
+        Do not import numpy and pandas.
+
+        Try to use meaningful data, when it adds value.

         Examples
         --------
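
The Examples guidance in the last hunks (compact data, keyword arguments, output that is easy to verify by eye) can be exercised with the standard library `doctest` module. The sketch below is an assumption-laden illustration: `mean` is a plain hypothetical function, not `Series.mean`, and `count_doctest_failures` is a hypothetical helper rather than anything provided by pandas.

```python
import doctest


def mean(values):
    """
    Compute the mean of a list of numbers.

    Examples
    --------
    >>> mean(values=[1, 2, 3])
    2.0
    """
    return sum(values) / len(values)


def count_doctest_failures(func):
    """
    Run the examples in the docstring of `func` and count the failures.
    """
    parser = doctest.DocTestParser()
    # The examples only need the function itself in their namespace.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test).failed


print(count_doctest_failures(mean))
```

Using `[1, 2, 3]` keeps the example small enough that a reader can confirm the result without running it, which is the point the guide makes about choosing data for `mean`.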
