Commit d500806

dukebody authored and jorisvandenbossche committed
pandas-docstrings: Specify when random data in examples might be OK (#77)
1 parent 9989bfa commit d500806

4 files changed: +31 -6 lines changed


pandas/guide/_sources/pandas_docstring.rst.txt

Lines changed: 10 additions & 2 deletions
@@ -259,7 +259,7 @@ After the title, each parameter in the signature must be documented, including
 The parameters are defined by their name, followed by a space, a colon, another
 space, and the type (or types). Note that the space between the name and the
 colon is important. Types are not defined for `*args` and `**kwargs`, but must
-be defined for all other parameters. After the parameter definition, it is
+be defined for all other parameters. After the parameter definition, it is
 required to have a line with the parameter description, which is indented, and
 can have multiple lines. The description must start with a capital letter, and
 finish with a dot.
@@ -840,7 +840,15 @@ be tricky. Here are some attention points:
 imported as ``import pandas as pd`` and ``import numpy as np``) and define
 all variables you use in the example.

-* Try to avoid using random data.
+* Try to avoid using random data. However random data might be OK in some
+cases, like if the function you are documenting deals with probability
+distributions, or if the amount of data needed to make the function result
+meaningful is too much, such that creating it manually is very cumbersome.
+In those cases, always use a fixed random seed to make the generated examples
+predictable. Example::
+
+>>> np.random.seed(42)
+>>> df = pd.DataFrame({'normal': np.random.normal(100, 5, 20)})

 * If you have a code snippet that wraps multiple lines, you need to use '...'
 on the continued lines: ::

pandas/guide/pandas_docstring.html

Lines changed: 10 additions & 1 deletion
@@ -764,7 +764,16 @@ <h2>About docstrings and standards<a class="headerlink" href="#about-docstrings-
 imported as <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">pandas</span> <span class="pre">as</span> <span class="pre">pd</span></code> and <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">numpy</span> <span class="pre">as</span> <span class="pre">np</span></code>) and define
 all variables you use in the example.</p>
 </li>
-<li><p class="first">Try to avoid using random data.</p>
+<li><p class="first">Try to avoid using random data. However random data might be OK in some
+cases, like if the function you are documenting deals with probability
+distributions, or if the amount of data needed to make the function result
+meaningful is too much, such that creating it manually is very cumbersome.
+In those cases, always use a fixed random seed to make the generated examples
+predictable. Example:</p>
+<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
+<span class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span><span class="s1">&#39;normal&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">20</span><span class="p">)})</span>
+</pre></div>
+</div>
 </li>
 <li><p class="first">If you have a code snippet that wraps multiple lines, you need to use ‘…’
 on the continued lines:</p>
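
Note on the change above: the new guideline asks for a fixed random seed so that generated examples are predictable. A minimal sketch of what that buys, assuming NumPy and pandas are installed; `make_example_frame` is a hypothetical helper written only for this illustration and is not part of pandas or of this commit:

```python
import numpy as np
import pandas as pd


def make_example_frame(seed=42):
    """Build the DataFrame from the guide's example after fixing the seed.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Reset NumPy's global random state, as the added guideline recommends.
    np.random.seed(seed)
    return pd.DataFrame({'normal': np.random.normal(100, 5, 20)})


first = make_example_frame()
second = make_example_frame()

# Same seed, same values: an example rendered from this data is stable
# from one documentation build to the next.
print(first.equals(second))  # True
print(first.shape)           # (20, 1)
```

Without the `np.random.seed` call, each run would likely produce different values, so any output shown in a docstring example could not be checked against a fresh run.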
