Commit 9d2b3e1

Author: Nick Eubank
fixed small issues jorvisvandenbossche noted
1 parent 4389271 commit 9d2b3e1

3 files changed, +40 -40 lines changed
doc/source/indexing.rst

+28 -28
@@ -516,29 +516,29 @@ A random selection of rows or columns from a Series, DataFrame, or Panel with th

 .. ipython :: python

-s = Series([0,1,2,3,4,5])
+s = Series([0,1,2,3,4,5])

-# When no arguments are passed, returns 1 row.
-s.sample()
-
-# One may specify either a number of rows:
-s.sample(n = 3)
+# When no arguments are passed, returns 1 row.
+s.sample()
+
+# One may specify either a number of rows:
+s.sample(n=3)

-# Or a fraction of the rows:
-s.sample(frac = 0.5)
+# Or a fraction of the rows:
+s.sample(frac=0.5)

 By default, ``sample`` will return each row at most once, but one can also sample with replacement
 using the ``replace`` option:

 .. ipython :: python

 s = Series([0,1,2,3,4,5])
-
-# Without replacement (default):
-s.sample(n = 6, replace = False)
-
-# With replacement:
-s.sample(n = 6, replace = True)
+
+# Without replacement (default):
+s.sample(n=6, replace=False)
+
+# With replacement:
+s.sample(n=6, replace=True)


 By default, each row has an equal probability of being selected, but if you want rows
By default, each row has an equal probability of being selected, but if you want rows
@@ -549,37 +549,37 @@ to have different probabilities, you can pass the ``sample`` function sampling w

 s = Series([0,1,2,3,4,5])
 example_weights = [0, 0, 0.2, 0.2, 0.2, 0.4]
-s.sample(n=3, weights = example_weights)
-
-# Weights will be re-normalized automatically
-example_weights2 = [0.5, 0, 0, 0, 0, 0]
-s.sample(n=1, weights= example_weights2)
+s.sample(n=3, weights=example_weights)
+
+# Weights will be re-normalized automatically
+example_weights2 = [0.5, 0, 0, 0, 0, 0]
+s.sample(n=1, weights=example_weights2)

 When applied to a DataFrame, you can use a column of the DataFrame as sampling weights
 (provided you are sampling rows and not columns) by simply passing the name of the column
 as a string.
-
+
 .. ipython :: python

-df2 = DataFrame({'col1':[9,8,7,6], 'weight_column':[0.5, 0.4, 0.1, 0]})
-df2.sample(n = 3, weights = 'weight_column')
+df2 = DataFrame({'col1':[9,8,7,6], 'weight_column':[0.5, 0.4, 0.1, 0]})
+df2.sample(n = 3, weights = 'weight_column')

 ``sample`` also allows users to sample columns instead of rows using the ``axis`` argument.

 .. ipython :: python

-df3 = DataFrame({'col1':[1,2,3], 'col2':[2,3,4]})
-df3.sample(n=1, axis = 1)
+df3 = DataFrame({'col1':[1,2,3], 'col2':[2,3,4]})
+df3.sample(n=1, axis=1)

 Finally, one can also set a seed for ``sample``'s random number generator using the ``random_state`` argument, which will accept either an integer (as a seed) or a numpy RandomState object.

 .. ipython :: python

-df4 = DataFrame({'col1':[1,2,3], 'col2':[2,3,4]})
+df4 = DataFrame({'col1':[1,2,3], 'col2':[2,3,4]})

-# With a given seed, the sample will always draw the same rows.
-df4.sample(n=2, random_state = 2)
-df4.sample(n=2, random_state = 2)
+# With a given seed, the sample will always draw the same rows.
+df4.sample(n=2, random_state=2)
+df4.sample(n=2, random_state=2)
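Review note: the options documented in the indexing.rst hunks above can be exercised together in one short script. This is only a sketch. It assumes a current pandas installation imported as `pd` (the docs in the diff rely on a bare `Series`/`DataFrame` import instead), and the variable names are illustrative rather than taken from the commit.

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3, 4, 5])

one_row = s.sample()                     # no arguments: a single row
three_rows = s.sample(n=3)               # a fixed number of rows
half = s.sample(frac=0.5)                # a fraction of the rows
with_repl = s.sample(n=6, replace=True)  # rows may repeat

# Rows whose weight is zero are never drawn.
example_weights = [0, 0, 0.2, 0.2, 0.2, 0.4]
weighted = s.sample(n=3, weights=example_weights)

df = pd.DataFrame({"col1": [9, 8, 7, 6], "weight_column": [0.5, 0.4, 0.1, 0]})
by_column = df.sample(n=3, weights="weight_column")  # weights read from a column
one_col = df.sample(n=1, axis=1)                     # sample columns, not rows

# The same integer seed reproduces the same draw.
first = df.sample(n=2, random_state=2)
second = df.sample(n=2, random_state=2)
print(first.equals(second))
```

Passing the weights as a column name only applies when sampling rows, which is why the `axis=1` call above omits `weights`.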

doc/source/whatsnew/v0.16.1.txt

+9 -8
@@ -20,7 +20,6 @@ Highlights include:

 Enhancements
 ~~~~~~~~~~~~
-.. _whatsnew_0161.enhancements.sample:

 - Added ``StringMethods.capitalize()`` and ``swapcase`` which behave as the same as standard ``str`` (:issue:`9766`)
 - Added ``StringMethods`` (.str accessor) to ``Index`` (:issue:`9068`)
@@ -136,10 +135,12 @@ values NOT in the categories, similarly to how you can reindex ANY pandas index.

 See the :ref:`documentation <advanced.categoricalindex>` for more. (:issue:`7629`)

+.. _whatsnew_0161.enhancements.sample:
+
 Sample
 ^^^^^^^^^^^^^^^^

-Series, DataFrames, and Panels now have a new method: :meth:`~pandas.core.sample`.
+Series, DataFrames, and Panels now have a new method: :meth:`~pandas.DataFrame.sample`.
 The method accepts a specific number of rows or columns to return, or a fraction of the
 total number or rows or columns. It also has options for sampling with or without replacement,
 for passing in a column for weights for non-uniform sampling, and for setting seed values to facilitate replication.
@@ -148,23 +149,23 @@ for passing in a column for weights for non-uniform sampling, and for setting se

 example_series = Series([0,1,2,3,4,5])

-# When no arguments are passed, returns 5 rows like .head() or .tail()
+# When no arguments are passed, returns 1
 example_series.sample()

 # One may specify either a number of rows:
-example_series.sample(n = 3)
+example_series.sample(n=3)

 # Or a fraction of the rows:
-example_series.sample(frac = 0.5)
+example_series.sample(frac=0.5)

 # weights are accepted.
 example_weights = [0, 0, 0.2, 0.2, 0.2, 0.4]
-example_series.sample(n=3, weights = example_weights)
+example_series.sample(n=3, weights=example_weights)

 # weights will also be normalized if they do not sum to one,
 # and missing values will be treated as zeros.
 example_weights2 = [0.5, 0, 0, 0, None, np.nan]
-example_series.sample(n=1, weights = example_weights2)
+example_series.sample(n=1, weights=example_weights2)


 When applied to a DataFrame, one may pass the name of a column to specify sampling weights,
@@ -173,7 +174,7 @@ although note that the value of the weights column must sum to one.
 .. ipython :: python

 df = DataFrame({'col1':[9,8,7,6], 'weight_column':[0.5, 0.4, 0.1, 0]})
-df.sample(n = 3, weights = 'weight_column')
+df.sample(n=3, weights='weight_column')

 .. _whatsnew_0161.api:
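Review note: the whatsnew example states that weights are normalized when they do not sum to one and that missing values are treated as zeros. A small check of that claim might look like the following. It assumes a current pandas and NumPy imported as `pd` and `np`, and the name `picked` is illustrative.

```python
import numpy as np
import pandas as pd

example_series = pd.Series([0, 1, 2, 3, 4, 5])

# Only the first weight is positive. None and NaN count as zero, and the
# remaining weights are rescaled so that they sum to one.
example_weights2 = [0.5, 0, 0, 0, None, np.nan]
picked = example_series.sample(n=1, weights=example_weights2)
print(picked)
```

Because every other row ends up with zero probability, the draw can only return the first row, whatever the seed.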

pandas/core/generic.py

+3 -4
@@ -1959,7 +1959,7 @@ def sample(self, n=None, frac=None, replace=False, weights=None, random_state=No
 Number of rows to return. Cannot be used with `frac`.
 Default = 1 if `frac` = None.
 frac : float, optional
-Share of rows to return. Cannot be used with `n`.
+Fraction of rows to return. Cannot be used with `n`.
 replace : boolean, optional
 Sample with or without replacement. Default = False.
 weights : str or ndarray-like, optional
weights : str or ndarray-like, optional
@@ -2010,7 +2010,7 @@ def sample(self, n=None, frac=None, replace=False, weights=None, random_state=No
 try:
 weights = self[weights]
 except KeyError:
-raise KeyError("String passed to weights not a valid column name")
+raise KeyError("String passed to weights not a valid name for an item in specified axis")

 else:
 raise ValueError("Strings cannot be passed as weights when sampling from a Series.")
@@ -2022,8 +2022,7 @@ def sample(self, n=None, frac=None, replace=False, weights=None, random_state=No
 if len(weights) != axis_length:
 raise ValueError("Weights and axis to be sampled must be of same length")

-# No infs allowed. The np.nan_to_num() command below would make these large values
-# which is pretty unintuitive.
+# No infs allowed.
 if (weights == np.inf).any() or (weights == -np.inf).any():
 raise ValueError("weight vector may not include `inf` values")
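Review note: the order of the weight checks in this hunk (length first, then infinities) can be illustrated with a standalone, pure-Python sketch. `prepare_weights` is a hypothetical helper written for this note, not the code in `generic.py`. The missing-value and renormalization steps follow the documentation in this commit, while the zero-sum guard is an added assumption.

```python
import math


def prepare_weights(weights, axis_length):
    """Hypothetical helper that mirrors the checks visible in the hunk above."""
    if len(weights) != axis_length:
        raise ValueError("Weights and axis to be sampled must be of same length")

    # Treat None like NaN so the checks below only see floats.
    values = [math.nan if w is None else float(w) for w in weights]

    # Reject infinities before any rescaling takes place.
    if any(math.isinf(v) for v in values):
        raise ValueError("weight vector may not include `inf` values")

    # Missing weights count as zero, as the docs in this commit describe.
    values = [0.0 if math.isnan(v) else v for v in values]

    total = sum(values)
    if total == 0:
        # Assumption for this sketch: all-zero weights cannot be normalized.
        raise ValueError("Invalid weights: weights sum to zero")
    return [v / total for v in values]


print(prepare_weights([0.5, 0, 0, 0, None, math.nan], 6))
```

Rejecting `inf` early keeps the error message specific. Once a weight is infinite, the normalization step can no longer produce meaningful probabilities.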
