Commit eb9fb82

fixed indentation and import
1 parent 8e59f1e commit eb9fb82

1 file changed, +93 -93 lines changed
doc/source/comparison_with_r.rst

@@ -4,9 +4,8 @@
 .. ipython:: python
 :suppress:

-from pandas import *
-import numpy.random as random
-from numpy import *
+import pandas as pd
+import numpy as np
 options.display.max_rows=15


 Comparison with R / R libraries
@@ -40,25 +39,25 @@ The :meth:`~pandas.DataFrame.query` method is similar to the base R ``subset``
 function. In R you might want to get the rows of a ``data.frame`` where one
 column's values are less than another column's values:

-.. code-block:: r
+.. code-block:: r

-df <- data.frame(a=rnorm(10), b=rnorm(10))
-subset(df, a <= b)
-df[df$a <= df$b,] # note the comma
+df <- data.frame(a=rnorm(10), b=rnorm(10))
+subset(df, a <= b)
+df[df$a <= df$b,] # note the comma

 In ``pandas``, there are a few ways to perform subsetting. You can use
 :meth:`~pandas.DataFrame.query` or pass an expression as if it were an
 index/slice as well as standard boolean indexing:

-.. ipython:: python
+.. ipython:: python

-from pandas import DataFrame
-from numpy.random import randn
+from pandas import DataFrame
+from numpy import random

-df = DataFrame({'a': randn(10), 'b': randn(10)})
-df.query('a <= b')
-df[df.a <= df.b]
-df.loc[df.a <= df.b]
+df = DataFrame({'a': random.randn(10), 'b': random.randn(10)})
+df.query('a <= b')
+df[df.a <= df.b]
+df.loc[df.a <= df.b]

 For more details and examples see :ref:`the query documentation
 <indexing.query>`.
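
To illustrate the three subsetting forms touched by this hunk, here is a minimal sketch that checks they select the same rows. It assumes a current pandas and NumPy; the seeded generator and the variable names are illustrative and not part of the commit.

```python
import numpy as np
import pandas as pd

# Seeded generator so the result is reproducible (an assumption; the
# patched documentation uses unseeded random.randn).
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.standard_normal(10),
                   "b": rng.standard_normal(10)})

by_query = df.query("a <= b")   # expression string, like R's subset()
by_mask = df[df.a <= df.b]      # boolean mask, like df[df$a <= df$b, ]
by_loc = df.loc[df.a <= df.b]   # the same mask passed through .loc

# All three forms should return identical frames.
same_rows = by_query.equals(by_mask) and by_mask.equals(by_loc)
print(same_rows)
```

The `query` form is closest to R's `subset` because column names are resolved inside the expression string rather than through the frame object.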
@@ -72,20 +71,20 @@ For more details and examples see :ref:`the query documentation
 An expression using a data.frame called ``df`` in R with the columns ``a`` and
 ``b`` would be evaluated using ``with`` like so:

-.. code-block:: r
+.. code-block:: r

-df <- data.frame(a=rnorm(10), b=rnorm(10))
-with(df, a + b)
-df$a + df$b # same as the previous expression
+df <- data.frame(a=rnorm(10), b=rnorm(10))
+with(df, a + b)
+df$a + df$b # same as the previous expression

 In ``pandas`` the equivalent expression, using the
 :meth:`~pandas.DataFrame.eval` method, would be:

-.. ipython:: python
+.. ipython:: python

-df = DataFrame({'a': randn(10), 'b': randn(10)})
-df.eval('a + b')
-df.a + df.b # same as the previous expression
+df = DataFrame({'a': random.randn(10), 'b': random.randn(10)})
+df.eval('a + b')
+df.a + df.b # same as the previous expression

 In certain cases :meth:`~pandas.DataFrame.eval` will be much faster than
 evaluation in pure Python. For more details and examples see :ref:`the eval
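
As a small check on the `eval` comparison in this hunk, the sketch below confirms that the string expression and the attribute-based sum agree. It assumes a current pandas and NumPy, and the seeded generator is an illustrative choice rather than something the commit uses.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # seeded for reproducibility (assumption)
df = pd.DataFrame({"a": rng.standard_normal(10),
                   "b": rng.standard_normal(10)})

via_eval = df.eval("a + b")   # expression evaluated against df's columns
via_attrs = df.a + df.b       # plain element-wise Series addition

print(np.allclose(via_eval, via_attrs))
```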
@@ -123,38 +122,38 @@ summarize ``x`` by ``month``:



-.. code-block:: r
+.. code-block:: r

-require(plyr)
-df <- data.frame(
-x = runif(120, 1, 168),
-y = runif(120, 7, 334),
-z = runif(120, 1.7, 20.7),
-month = rep(c(5,6,7,8),30),
-week = sample(1:4, 120, TRUE)
-)
+require(plyr)
+df <- data.frame(
+x = runif(120, 1, 168),
+y = runif(120, 7, 334),
+z = runif(120, 1.7, 20.7),
+month = rep(c(5,6,7,8),30),
+week = sample(1:4, 120, TRUE)
+)

-ddply(df, .(month, week), summarize,
-mean = round(mean(x), 2),
-sd = round(sd(x), 2))
+ddply(df, .(month, week), summarize,
+mean = round(mean(x), 2),
+sd = round(sd(x), 2))

 In ``pandas`` the equivalent expression, using the
 :meth:`~pandas.DataFrame.groupby` method, would be:



-.. ipython:: python
+.. ipython:: python

-df = DataFrame({
-'x': random.uniform(1., 168., 120),
-'y': random.uniform(7., 334., 120),
-'z': random.uniform(1.7, 20.7, 120),
-'month': [5,6,7,8]*30,
-'week': random.randint(1,4, 120)
-})
+df = DataFrame({
+'x': random.uniform(1., 168., 120),
+'y': random.uniform(7., 334., 120),
+'z': random.uniform(1.7, 20.7, 120),
+'month': [5,6,7,8]*30,
+'week': random.randint(1,4, 120)
+})

-grouped = df.groupby(['month','week'])
-print grouped['x'].agg([mean, std])
+grouped = df.groupby(['month','week'])
+print grouped['x'].agg([np.mean, np.std])


 For more details and examples see :ref:`the groupby documentation
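
The Python in this hunk is Python 2 (`print` as a statement), and `random.randint(1,4, 120)` excludes its upper bound, so it draws weeks 1 to 3 where the R `sample(1:4, ...)` draws 1 to 4. A possible Python 3 sketch of the same summary follows. The seeded generator, the upper bound of 5, and the string aggregation names are choices made here, not part of the commit.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)  # seeded for reproducibility (assumption)
df = pd.DataFrame({
    "x": rng.uniform(1.0, 168.0, 120),
    "y": rng.uniform(7.0, 334.0, 120),
    "z": rng.uniform(1.7, 20.7, 120),
    "month": [5, 6, 7, 8] * 30,
    # integers() excludes the upper bound, so 5 is needed to draw 1..4
    "week": rng.integers(1, 5, 120),
})

# String names pick pandas' own aggregations; its std uses the sample
# (n - 1) denominator, as R's sd() does.
summary = df.groupby(["month", "week"])["x"].agg(["mean", "std"]).round(2)
print(summary)
```

A group with a single row gets `NaN` for `std`, which corresponds to the `NA` that `sd` returns for one value in R.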
@@ -169,35 +168,36 @@ reshape / reshape2
 An expression using a 3 dimensional array called ``a`` in R where you want to
 melt it into a data.frame:

-.. code-block:: r
+.. code-block:: r

-a <- array(c(1:23, NA), c(2,3,4))
-data.frame(melt(a))
+a <- array(c(1:23, NA), c(2,3,4))
+data.frame(melt(a))

 In Python, since ``a`` is a list, you can simply use list comprehension.

-.. ipython:: python
-a = array(range(1,24)+[NAN]).reshape(2,3,4)
-DataFrame([tuple(list(x)+[val]) for x, val in ndenumerate(a)])
+.. ipython:: python
+
+a = np.array(range(1,24)+[np.NAN]).reshape(2,3,4)
+DataFrame([tuple(list(x)+[val]) for x, val in np.ndenumerate(a)])

 |meltlist|_
 ~~~~~~~~~~~~

 An expression using a list called ``a`` in R where you want to melt it
 into a data.frame:

-.. code-block:: r
+.. code-block:: r

-a <- as.list(c(1:4, NA))
-data.frame(melt(a))
+a <- as.list(c(1:4, NA))
+data.frame(melt(a))

 In Python, this list would be a list of tuples, so
 :meth:`~pandas.DataFrame` method would convert it to a dataframe as required.

-.. ipython:: python
+.. ipython:: python

-a = list(enumerate(range(1,5)+[NAN]))
-DataFrame(a)
+a = list(enumerate(range(1,5)+[np.NAN]))
+DataFrame(a)

 For more details and examples see :ref:`the Into to Data Structures
 documentation <basics.dataframe.from_items>`.
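
Both Python snippets in this hunk rely on Python 2, where `range(...) + [...]` concatenates lists. A Python 3 sketch of the same two melts might look like the following. The column names are hypothetical, and `np.nan` stands in for the `np.NAN` alias, which NumPy 2.0 removed.

```python
import numpy as np
import pandas as pd

# 3-D case: 23 values plus one missing value, shaped 2 x 3 x 4.
a = np.array(list(range(1, 24)) + [np.nan]).reshape(2, 3, 4)

# ndenumerate yields ((i, j, k), value) pairs; flatten each into one row.
melted = pd.DataFrame(
    [(*idx, val) for idx, val in np.ndenumerate(a)],
    columns=["dim0", "dim1", "dim2", "value"],  # hypothetical names
)

# List case: enumerate pairs each value with its position.
pairs = list(enumerate(list(range(1, 5)) + [np.nan]))
melted_list = pd.DataFrame(pairs, columns=["position", "value"])

print(melted.shape, melted_list.shape)
```

The indices here are zero-based, whereas R's `melt` reports one-based positions.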
@@ -208,26 +208,26 @@ documentation <basics.dataframe.from_items>`.
 An expression using a data.frame called ``cheese`` in R where you want to
 reshape the data.frame:

-.. code-block:: r
+.. code-block:: r

-cheese <- data.frame(
-first = c('John, Mary'),
-last = c('Doe', 'Bo'),
-height = c(5.5, 6.0),
-weight = c(130, 150)
-)
-melt(cheese, id=c("first", "last"))
+cheese <- data.frame(
+first = c('John, Mary'),
+last = c('Doe', 'Bo'),
+height = c(5.5, 6.0),
+weight = c(130, 150)
+)
+melt(cheese, id=c("first", "last"))

 In Python, the :meth:`~pandas.melt` method is the R equivalent:

-.. ipython:: python
+.. ipython:: python

-cheese = DataFrame({'first' : ['John', 'Mary'],
-'last' : ['Doe', 'Bo'],
-'height' : [5.5, 6.0],
-'weight' : [130, 150]})
-melt(cheese, id_vars=['first', 'last'])
-cheese.set_index(['first', 'last']).stack() # alternative way
+cheese = DataFrame({'first' : ['John', 'Mary'],
+'last' : ['Doe', 'Bo'],
+'height' : [5.5, 6.0],
+'weight' : [130, 150]})
+pd.melt(cheese, id_vars=['first', 'last'])
+cheese.set_index(['first', 'last']).stack() # alternative way

 For more details and examples see :ref:`the reshaping documentation
 <reshaping.melt>`.
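
The R snippet in this hunk writes `c('John, Mary')` as a single string, which looks like a typo for `c('John', 'Mary')`, the values the Python side uses. The sketch below runs the Python side on its own and checks that `pd.melt` and the `stack` alternative carry the same values. It assumes a current pandas, and the variable names `long_form` and `stacked` are illustrative.

```python
import pandas as pd

cheese = pd.DataFrame({
    "first": ["John", "Mary"],
    "last": ["Doe", "Bo"],
    "height": [5.5, 6.0],
    "weight": [130, 150],
})

# Wide to long: one row per (person, measured variable).
long_form = pd.melt(cheese, id_vars=["first", "last"])

# Alternative: move the id columns into the index, then stack the rest.
stacked = cheese.set_index(["first", "last"]).stack()

print(long_form)
print(stacked)
```

`melt` returns a flat frame with `variable` and `value` columns, while `stack` returns a Series whose index holds the ids and the variable name.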
@@ -238,33 +238,33 @@ For more details and examples see :ref:`the reshaping documentation
 An expression using a data.frame called ``df`` in R to cast into a higher
 dimensional array:

-.. code-block:: r
+.. code-block:: r

-df <- data.frame(
-x = runif(12, 1, 168),
-y = runif(12, 7, 334),
-z = runif(12, 1.7, 20.7),
-month = rep(c(5,6,7),4),
-week = rep(c(1,2), 6)
-)
+df <- data.frame(
+x = runif(12, 1, 168),
+y = runif(12, 7, 334),
+z = runif(12, 1.7, 20.7),
+month = rep(c(5,6,7),4),
+week = rep(c(1,2), 6)
+)

-mdf <- melt(df, id=c("month", "week"))
-acast(mdf, week ~ month ~ variable, mean)
+mdf <- melt(df, id=c("month", "week"))
+acast(mdf, week ~ month ~ variable, mean)

 In Python the best way is to make use of :meth:`~pandas.pivot_table`:

-.. ipython:: python
-
-df = DataFrame({
-'x': random.uniform(1., 168., 12),
-'y': random.uniform(7., 334., 12),
-'z': random.uniform(1.7, 20.7, 12),
-'month': [5,6,7]*4,
-'week': [1,2]*6
-})
-mdf = melt(df, id_vars=['month', 'week'])
-pivot_table(mdf, values='value', rows=['variable','week'],
-cols=['month'], aggfunc=mean)
+.. ipython:: python
+
+df = DataFrame({
+'x': random.uniform(1., 168., 12),
+'y': random.uniform(7., 334., 12),
+'z': random.uniform(1.7, 20.7, 12),
+'month': [5,6,7]*4,
+'week': [1,2]*6
+})
+mdf = pd.melt(df, id_vars=['month', 'week'])
+pd.pivot_table(mdf, values='value', rows=['variable','week'],
+cols=['month'], aggfunc=np.mean)

 For more details and examples see :ref:`the reshaping documentation
 <reshaping.pivot>`.
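
The `rows=` and `cols=` keywords in this hunk belong to the pandas API of the time; later releases name them `index=` and `columns=`. A sketch of the same pivot against a current pandas follows. The seeded generator and the `"mean"` string aggregator are choices made here rather than part of the commit.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)  # seeded for reproducibility (assumption)
df = pd.DataFrame({
    "x": rng.uniform(1.0, 168.0, 12),
    "y": rng.uniform(7.0, 334.0, 12),
    "z": rng.uniform(1.7, 20.7, 12),
    "month": [5, 6, 7] * 4,
    "week": [1, 2] * 6,
})

# Long format first, then pivot: (variable, week) rows by month columns.
mdf = pd.melt(df, id_vars=["month", "week"])
table = pd.pivot_table(mdf, values="value", index=["variable", "week"],
                       columns=["month"], aggfunc="mean")
print(table)
```

The result is a two-dimensional frame with a hierarchical row index, unlike the three-dimensional array that `acast` returns in R.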
