
Commit e303a46

FHaase authored and jreback committed
DOC: Fix PEP-8 issues in computation.rst and comparison_*.rst (#24002)
1 parent 9d85b22 commit e303a46

File tree

4 files changed: +108 −105 lines


doc/source/comparison_with_r.rst (+43 −45)
@@ -6,7 +6,7 @@
 
    import pandas as pd
    import numpy as np
-   pd.options.display.max_rows=15
+   pd.options.display.max_rows = 15
 
 Comparison with R / R libraries
 *******************************
@@ -165,16 +165,15 @@ function.
 
 .. ipython:: python
 
-   df = pd.DataFrame({
-      'v1': [1,3,5,7,8,3,5,np.nan,4,5,7,9],
-      'v2': [11,33,55,77,88,33,55,np.nan,44,55,77,99],
-      'by1': ["red", "blue", 1, 2, np.nan, "big", 1, 2, "red", 1, np.nan, 12],
-      'by2': ["wet", "dry", 99, 95, np.nan, "damp", 95, 99, "red", 99, np.nan,
-              np.nan]
-   })
+   df = pd.DataFrame(
+       {'v1': [1, 3, 5, 7, 8, 3, 5, np.nan, 4, 5, 7, 9],
+        'v2': [11, 33, 55, 77, 88, 33, 55, np.nan, 44, 55, 77, 99],
+        'by1': ["red", "blue", 1, 2, np.nan, "big", 1, 2, "red", 1, np.nan, 12],
+        'by2': ["wet", "dry", 99, 95, np.nan, "damp", 95, 99, "red", 99, np.nan,
+                np.nan]})
 
-   g = df.groupby(['by1','by2'])
-   g[['v1','v2']].mean()
+   g = df.groupby(['by1', 'by2'])
+   g[['v1', 'v2']].mean()
 
 For more details and examples see :ref:`the groupby documentation
 <groupby.split>`.
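The hunk above only reformats whitespace, so the grouped-mean behavior is unchanged. A minimal standalone sketch of the same pattern, using a smaller made-up frame (not the one from the docs):

```python
import pandas as pd

# Small illustrative frame; column names mirror the docs hunk.
df = pd.DataFrame({'v1': [1, 3, 5, 7],
                   'v2': [11, 33, 55, 77],
                   'by1': ['red', 'blue', 'red', 'blue'],
                   'by2': ['wet', 'dry', 'wet', 'dry']})

# Split by two keys, then average the selected value columns per group.
g = df.groupby(['by1', 'by2'])
means = g[['v1', 'v2']].mean()
```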
@@ -195,7 +194,7 @@ The :meth:`~pandas.DataFrame.isin` method is similar to R ``%in%`` operator:
 
 .. ipython:: python
 
-   s = pd.Series(np.arange(5),dtype=np.float32)
+   s = pd.Series(np.arange(5), dtype=np.float32)
    s.isin([2, 4])
 
 The ``match`` function returns a vector of the positions of matches
@@ -234,11 +233,11 @@ In ``pandas`` we may use :meth:`~pandas.pivot_table` method to handle this:
    import random
    import string
 
-   baseball = pd.DataFrame({
-      'team': ["team %d" % (x+1) for x in range(5)]*5,
-      'player': random.sample(list(string.ascii_lowercase),25),
-      'batting avg': np.random.uniform(.200, .400, 25)
-   })
+   baseball = pd.DataFrame(
+      {'team': ["team %d" % (x + 1) for x in range(5)] * 5,
+       'player': random.sample(list(string.ascii_lowercase), 25),
+       'batting avg': np.random.uniform(.200, .400, 25)})
+
    baseball.pivot_table(values='batting avg', columns='team', aggfunc=np.max)
 
 For more details and examples see :ref:`the reshaping documentation
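A runnable sketch of the ``pivot_table`` call this hunk reformats, on a tiny hand-written roster instead of the docs' random data (and with the string alias ``'max'`` rather than ``np.max``):

```python
import pandas as pd

# Tiny made-up roster; column names mirror the docs example.
baseball = pd.DataFrame({'team': ['team 1', 'team 2', 'team 1', 'team 2'],
                         'player': ['a', 'b', 'c', 'd'],
                         'batting avg': [.250, .300, .350, .275]})

# With no index argument, the result has one row per value column and
# one column per team, holding the per-team maximum.
best = baseball.pivot_table(values='batting avg', columns='team',
                            aggfunc='max')
```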
@@ -341,15 +340,13 @@ In ``pandas`` the equivalent expression, using the
 
 .. ipython:: python
 
-   df = pd.DataFrame({
-      'x': np.random.uniform(1., 168., 120),
-      'y': np.random.uniform(7., 334., 120),
-      'z': np.random.uniform(1.7, 20.7, 120),
-      'month': [5,6,7,8]*30,
-      'week': np.random.randint(1,4, 120)
-   })
+   df = pd.DataFrame({'x': np.random.uniform(1., 168., 120),
+                      'y': np.random.uniform(7., 334., 120),
+                      'z': np.random.uniform(1.7, 20.7, 120),
+                      'month': [5, 6, 7, 8] * 30,
+                      'week': np.random.randint(1, 4, 120)})
 
-   grouped = df.groupby(['month','week'])
+   grouped = df.groupby(['month', 'week'])
    grouped['x'].agg([np.mean, np.std])
 
 
@@ -374,8 +371,8 @@ In Python, since ``a`` is a list, you can simply use list comprehension.
 
 .. ipython:: python
 
-   a = np.array(list(range(1,24))+[np.NAN]).reshape(2,3,4)
-   pd.DataFrame([tuple(list(x)+[val]) for x, val in np.ndenumerate(a)])
+   a = np.array(list(range(1, 24)) + [np.NAN]).reshape(2, 3, 4)
+   pd.DataFrame([tuple(list(x) + [val]) for x, val in np.ndenumerate(a)])
 
 |meltlist|_
 ~~~~~~~~~~~~
@@ -393,7 +390,7 @@ In Python, this list would be a list of tuples, so
 
 .. ipython:: python
 
-   a = list(enumerate(list(range(1,5))+[np.NAN]))
+   a = list(enumerate(list(range(1, 5)) + [np.NAN]))
    pd.DataFrame(a)
 
 For more details and examples see :ref:`the Into to Data Structures
@@ -419,12 +416,13 @@ In Python, the :meth:`~pandas.melt` method is the R equivalent:
 
 .. ipython:: python
 
-   cheese = pd.DataFrame({'first' : ['John', 'Mary'],
-                          'last' : ['Doe', 'Bo'],
-                          'height' : [5.5, 6.0],
-                          'weight' : [130, 150]})
+   cheese = pd.DataFrame({'first': ['John', 'Mary'],
+                          'last': ['Doe', 'Bo'],
+                          'height': [5.5, 6.0],
+                          'weight': [130, 150]})
+
    pd.melt(cheese, id_vars=['first', 'last'])
-   cheese.set_index(['first', 'last']).stack() # alternative way
+   cheese.set_index(['first', 'last']).stack()  # alternative way
 
 For more details and examples see :ref:`the reshaping documentation
 <reshaping.melt>`.
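The ``cheese`` hunk keeps both wide-to-long spellings; a self-contained sketch showing that ``melt`` and ``set_index(...).stack()`` expose the same data:

```python
import pandas as pd

cheese = pd.DataFrame({'first': ['John', 'Mary'],
                       'last': ['Doe', 'Bo'],
                       'height': [5.5, 6.0],
                       'weight': [130, 150]})

# melt: id columns stay, the remaining columns become variable/value rows.
long = pd.melt(cheese, id_vars=['first', 'last'])

# Alternative: index by the id columns and stack the remaining columns
# into a Series with a three-level index.
stacked = cheese.set_index(['first', 'last']).stack()
```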
@@ -452,16 +450,15 @@ In Python the best way is to make use of :meth:`~pandas.pivot_table`:
 
 .. ipython:: python
 
-   df = pd.DataFrame({
-      'x': np.random.uniform(1., 168., 12),
-      'y': np.random.uniform(7., 334., 12),
-      'z': np.random.uniform(1.7, 20.7, 12),
-      'month': [5,6,7]*4,
-      'week': [1,2]*6
-   })
+   df = pd.DataFrame({'x': np.random.uniform(1., 168., 12),
+                      'y': np.random.uniform(7., 334., 12),
+                      'z': np.random.uniform(1.7, 20.7, 12),
+                      'month': [5, 6, 7] * 4,
+                      'week': [1, 2] * 6})
+
    mdf = pd.melt(df, id_vars=['month', 'week'])
-   pd.pivot_table(mdf, values='value', index=['variable','week'],
-                    columns=['month'], aggfunc=np.mean)
+   pd.pivot_table(mdf, values='value', index=['variable', 'week'],
+                  columns=['month'], aggfunc=np.mean)
 
 Similarly for ``dcast`` which uses a data.frame called ``df`` in R to
 aggregate information based on ``Animal`` and ``FeedType``:
@@ -491,13 +488,14 @@ using :meth:`~pandas.pivot_table`:
       'Amount': [10, 7, 4, 2, 5, 6, 2],
    })
 
-   df.pivot_table(values='Amount', index='Animal', columns='FeedType', aggfunc='sum')
+   df.pivot_table(values='Amount', index='Animal', columns='FeedType',
+                  aggfunc='sum')
 
 The second approach is to use the :meth:`~pandas.DataFrame.groupby` method:
 
 .. ipython:: python
 
-   df.groupby(['Animal','FeedType'])['Amount'].sum()
+   df.groupby(['Animal', 'FeedType'])['Amount'].sum()
 
 For more details and examples see :ref:`the reshaping documentation
 <reshaping.pivot>` or :ref:`the groupby documentation<groupby.split>`.
@@ -516,8 +514,8 @@ In pandas this is accomplished with ``pd.cut`` and ``astype("category")``:
 
 .. ipython:: python
 
-   pd.cut(pd.Series([1,2,3,4,5,6]), 3)
-   pd.Series([1,2,3,2,2,3]).astype("category")
+   pd.cut(pd.Series([1, 2, 3, 4, 5, 6]), 3)
+   pd.Series([1, 2, 3, 2, 2, 3]).astype("category")
 
 For more details and examples see :ref:`categorical introduction <categorical>` and the
 :ref:`API documentation <api.categorical>`. There is also a documentation regarding the
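The ``pd.cut`` / ``astype("category")`` lines this hunk reformats can be checked directly; a small sketch using the same inputs:

```python
import pandas as pd

# Equal-width binning into 3 intervals, as in the docs hunk.
binned = pd.cut(pd.Series([1, 2, 3, 4, 5, 6]), 3)

# Integer values as an R-factor-like categorical dtype.
cat = pd.Series([1, 2, 3, 2, 2, 3]).astype("category")
```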

doc/source/comparison_with_sql.rst (+9 −11)
@@ -23,7 +23,8 @@ structure.
 
 .. ipython:: python
 
-   url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
+   url = ('https://raw.github.com/pandas-dev'
+          '/pandas/master/pandas/tests/data/tips.csv')
    tips = pd.read_csv(url)
    tips.head()
 
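The URL change here relies on Python's implicit concatenation of adjacent string literals: inside parentheses, the parser joins the pieces into one string, keeping each source line under the PEP-8 length limit without a backslash. A sketch:

```python
# Adjacent string literals inside parentheses are joined at parse time.
url = ('https://raw.github.com/pandas-dev'
       '/pandas/master/pandas/tests/data/tips.csv')

# Identical to writing the whole literal on one (over-long) line.
one_line = ('https://raw.github.com/pandas-dev/pandas/master'
            '/pandas/tests/data/tips.csv')
```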
@@ -387,7 +388,7 @@ Top N rows with offset
 
 .. ipython:: python
 
-   tips.nlargest(10+5, columns='tip').tail(10)
+   tips.nlargest(10 + 5, columns='tip').tail(10)
 
 Top N rows per group
 ~~~~~~~~~~~~~~~~~~~~
@@ -411,8 +412,7 @@ Top N rows per group
          .groupby(['day'])
          .cumcount() + 1)
     .query('rn < 3')
-    .sort_values(['day','rn'])
-   )
+    .sort_values(['day', 'rn']))
 
 the same using `rank(method='first')` function
 

@@ -421,8 +421,7 @@ the same using `rank(method='first')` function
    (tips.assign(rnk=tips.groupby(['day'])['total_bill']
                         .rank(method='first', ascending=False))
     .query('rnk < 3')
-    .sort_values(['day','rnk'])
-   )
+    .sort_values(['day', 'rnk']))
 
 .. code-block:: sql
 
@@ -445,11 +444,10 @@ Notice that when using ``rank(method='min')`` function
 .. ipython:: python
 
    (tips[tips['tip'] < 2]
-       .assign(rnk_min=tips.groupby(['sex'])['tip']
-                           .rank(method='min'))
-       .query('rnk_min < 3')
-       .sort_values(['sex','rnk_min'])
-   )
+    .assign(rnk_min=tips.groupby(['sex'])['tip']
+                        .rank(method='min'))
+    .query('rnk_min < 3')
+    .sort_values(['sex', 'rnk_min']))
 
 
 UPDATE
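The ``rank(method='min')`` chain reformatted above behaves like SQL's ``RANK()``: ties share the smallest rank. A standalone sketch on a made-up frame:

```python
import pandas as pd

# Made-up data; the two tied F tips both get rank 1 with method='min'.
df = pd.DataFrame({'sex': ['F', 'F', 'F', 'M', 'M'],
                   'tip': [1.0, 1.0, 1.5, 1.2, 1.8]})

# Rank within each sex group, keep ranks below 3, sort for display.
ranked = (df.assign(rnk_min=df.groupby(['sex'])['tip'].rank(method='min'))
            .query('rnk_min < 3')
            .sort_values(['sex', 'rnk_min']))
```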

doc/source/comparison_with_stata.rst (+11 −12)
@@ -102,9 +102,7 @@ and the values are the data.
 
 .. ipython:: python
 
-   df = pd.DataFrame({
-       'x': [1, 3, 5],
-       'y': [2, 4, 6]})
+   df = pd.DataFrame({'x': [1, 3, 5], 'y': [2, 4, 6]})
    df
 
@@ -128,7 +126,8 @@ the data set if presented with a url.
 
 .. ipython:: python
 
-   url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
+   url = ('https://raw.github.com/pandas-dev'
+          '/pandas/master/pandas/tests/data/tips.csv')
    tips = pd.read_csv(url)
    tips.head()
 
@@ -278,17 +277,17 @@ see the :ref:`timeseries documentation<timeseries>` for more details.
    tips['date1_year'] = tips['date1'].dt.year
    tips['date2_month'] = tips['date2'].dt.month
    tips['date1_next'] = tips['date1'] + pd.offsets.MonthBegin()
-   tips['months_between'] = (tips['date2'].dt.to_period('M') -
-                             tips['date1'].dt.to_period('M'))
+   tips['months_between'] = (tips['date2'].dt.to_period('M')
+                             - tips['date1'].dt.to_period('M'))
 
-   tips[['date1','date2','date1_year','date2_month',
-         'date1_next','months_between']].head()
+   tips[['date1', 'date2', 'date1_year', 'date2_month', 'date1_next',
+         'months_between']].head()
 
 .. ipython:: python
    :suppress:
 
-   tips = tips.drop(['date1','date2','date1_year',
-                     'date2_month','date1_next','months_between'], axis=1)
+   tips = tips.drop(['date1', 'date2', 'date1_year', 'date2_month',
+                     'date1_next', 'months_between'], axis=1)
 
 Selection of Columns
 ~~~~~~~~~~~~~~~~~~~~
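The ``months_between`` lines this hunk rewraps compute whole-month differences by converting timestamps to monthly periods and subtracting. A sketch with hand-picked dates (note: in recent pandas the subtraction yields month offsets, so ``.n`` extracts the integer count; older versions returned plain integers):

```python
import pandas as pd

# Hand-picked dates, 25 and 4 calendar months apart respectively.
date1 = pd.Series(pd.to_datetime(['2013-01-15', '2015-02-15']))
date2 = pd.Series(pd.to_datetime(['2015-02-15', '2015-06-15']))

# Subtracting monthly periods ignores the day-of-month component.
months_between = (date2.dt.to_period('M')
                  - date1.dt.to_period('M'))
```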
@@ -472,7 +471,7 @@ The following tables will be used in the merge examples
                       'value': np.random.randn(4)})
    df1
    df2 = pd.DataFrame({'key': ['B', 'D', 'D', 'E'],
-                        'value': np.random.randn(4)})
+                       'value': np.random.randn(4)})
    df2
 
 In Stata, to perform a merge, one data set must be in memory
@@ -661,7 +660,7 @@ In pandas this would be written as:
 
 .. ipython:: python
 
-   tips.groupby(['sex','smoker']).first()
+   tips.groupby(['sex', 'smoker']).first()
 
 
 Other Considerations
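The ``groupby(...).first()`` call reformatted above keeps the first non-null row per group, roughly Stata's ``bysort sex smoker: keep if _n == 1``. A sketch on a made-up ``tips``-like frame:

```python
import pandas as pd

# Made-up frame standing in for the docs' tips dataset.
tips = pd.DataFrame({'sex': ['F', 'F', 'M', 'M'],
                     'smoker': ['No', 'No', 'Yes', 'Yes'],
                     'total_bill': [16.99, 10.34, 21.01, 23.68]})

# One row per (sex, smoker) group: the first observation in each.
firsts = tips.groupby(['sex', 'smoker']).first()
```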
