@@ -67,31 +67,31 @@ Filtering in SQL is done via a WHERE clause.
 
 .. include:: includes/filtering.rst
 
-Just like SQL's OR and AND, multiple conditions can be passed to a DataFrame using | (OR) and &
-(AND).
+Just like SQL's ``OR`` and ``AND``, multiple conditions can be passed to a DataFrame using ``|``
+(``OR``) and ``&`` (``AND``).
+
+Tips of more than $5 at Dinner meals:
 
 .. code-block:: sql
 
-    -- tips of more than $5.00 at Dinner meals
     SELECT *
     FROM tips
     WHERE time = 'Dinner' AND tip > 5.00;
 
 .. ipython:: python
 
-    # tips of more than $5.00 at Dinner meals
     tips[(tips["time"] == "Dinner") & (tips["tip"] > 5.00)]
 
+Tips by parties of at least 5 diners OR bill total was more than $45:
+
 .. code-block:: sql
 
-    -- tips by parties of at least 5 diners OR bill total was more than $45
     SELECT *
     FROM tips
     WHERE size >= 5 OR total_bill > 45;
 
 .. ipython:: python
 
-    # tips by parties of at least 5 diners OR bill total was more than $45
     tips[(tips["size"] >= 5) | (tips["total_bill"] > 45)]
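
The two filters above can be tried on a small frame. The sketch below is only an illustration: the four rows are made-up sample data, not the ``tips`` dataset used in this guide. It also shows ``~``, the element-wise counterpart of SQL's ``NOT``. Each condition is wrapped in parentheses because ``&`` and ``|`` bind more tightly than comparison operators in Python.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real ``tips`` dataset.
tips = pd.DataFrame(
    {
        "total_bill": [48.27, 16.99, 50.81, 20.65],
        "tip": [6.73, 1.01, 10.00, 3.35],
        "time": ["Dinner", "Dinner", "Dinner", "Lunch"],
        "size": [4, 2, 3, 6],
    }
)

# AND: both conditions must hold for a row to be kept.
dinner_big_tips = tips[(tips["time"] == "Dinner") & (tips["tip"] > 5.00)]

# OR: either condition is enough.
large_or_pricey = tips[(tips["size"] >= 5) | (tips["total_bill"] > 45)]

# NOT: ``~`` inverts a boolean mask.
not_dinner = tips[~(tips["time"] == "Dinner")]
```
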
 
 NULL checking is done using the :meth:`~pandas.Series.notna` and :meth:`~pandas.Series.isna`
@@ -132,7 +132,7 @@ Getting items where ``col1`` IS NOT NULL can be done with :meth:`~pandas.Series.
 
 GROUP BY
 --------
-In pandas, SQL's GROUP BY operations are performed using the similarly named
+In pandas, SQL's ``GROUP BY`` operations are performed using the similarly named
 :meth:`~pandas.DataFrame.groupby` method. :meth:`~pandas.DataFrame.groupby` typically refers to a
 process where we'd like to split a dataset into groups, apply some function (typically aggregation)
 , and then combine the groups together.
@@ -160,7 +160,7 @@ The pandas equivalent would be:
 Notice that in the pandas code we used :meth:`~pandas.core.groupby.DataFrameGroupBy.size` and not
 :meth:`~pandas.core.groupby.DataFrameGroupBy.count`. This is because
 :meth:`~pandas.core.groupby.DataFrameGroupBy.count` applies the function to each column, returning
-the number of ``not null`` records within each.
+the number of ``NOT NULL`` records within each.
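
The difference only becomes visible when a group contains missing values. A minimal sketch, assuming a made-up frame with one missing ``tip`` (the column names mirror the guide, the values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing tip in the "Female" group.
df = pd.DataFrame(
    {
        "sex": ["Female", "Female", "Male", "Male", "Male"],
        "tip": [1.0, np.nan, 2.0, 3.0, 4.0],
    }
)

# size() counts rows per group, comparable to SQL's COUNT(*).
rows_per_group = df.groupby("sex").size()

# count() counts non-null values in each remaining column,
# comparable to SQL's COUNT(tip).
non_null_per_group = df.groupby("sex").count()
```

Here ``size()`` reports two rows for ``Female`` while ``count()`` reports a single non-null ``tip``.
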
 
 .. ipython:: python
 
@@ -221,10 +221,10 @@ Grouping by more than one column is done by passing a list of columns to the
 
 JOIN
 ----
-JOINs can be performed with :meth:`~pandas.DataFrame.join` or :meth:`~pandas.merge`. By default,
-:meth:`~pandas.DataFrame.join` will join the DataFrames on their indices. Each method has
-parameters allowing you to specify the type of join to perform (LEFT, RIGHT, INNER, FULL) or the
-columns to join on (column names or indices).
+``JOIN``\s can be performed with :meth:`~pandas.DataFrame.join` or :meth:`~pandas.merge`. By
+default, :meth:`~pandas.DataFrame.join` will join the DataFrames on their indices. Each method has
+parameters allowing you to specify the type of join to perform (``LEFT``, ``RIGHT``, ``INNER``,
+``FULL``) or the columns to join on (column names or indices).
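
As a rough illustration of the index-versus-column distinction, the sketch below uses two made-up frames (``left`` and ``right`` are hypothetical names, not the frames defined in this guide). It relies on the defaults of each function: ``how="inner"`` for ``pd.merge`` and ``how="left"`` for ``DataFrame.join``.

```python
import pandas as pd

# Hypothetical frames; the names and values are for illustration only.
left = pd.DataFrame({"key": ["A", "B", "C"], "left_value": [1, 2, 3]})
right = pd.DataFrame({"key": ["B", "C", "D"], "right_value": [20, 30, 40]})

# merge() matches on columns; ``how`` defaults to "inner".
merged = pd.merge(left, right, on="key")

# join() matches on the index by default and performs a left join,
# so the key is moved into the index first.
joined = left.set_index("key").join(right.set_index("key"))
```
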
 
 .. ipython:: python
 
@@ -233,7 +233,7 @@ columns to join on (column names or indices).
 
 Assume we have two database tables of the same name and structure as our DataFrames.
 
-Now let's go over the various types of JOINs.
+Now let's go over the various types of ``JOIN``\s.
 
 INNER JOIN
 ~~~~~~~~~~
@@ -259,56 +259,59 @@ column with another DataFrame's index.
 
 LEFT OUTER JOIN
 ~~~~~~~~~~~~~~~
+
+Show all records from ``df1``.
+
 .. code-block:: sql
 
-    -- show all records from df1
     SELECT *
     FROM df1
     LEFT OUTER JOIN df2
     ON df1.key = df2.key;
 
 .. ipython:: python
 
-    # show all records from df1
     pd.merge(df1, df2, on="key", how="left")
 
 RIGHT JOIN
 ~~~~~~~~~~
+
+Show all records from ``df2``.
+
 .. code-block:: sql
 
-    -- show all records from df2
     SELECT *
     FROM df1
     RIGHT OUTER JOIN df2
     ON df1.key = df2.key;
 
 .. ipython:: python
 
-    # show all records from df2
     pd.merge(df1, df2, on="key", how="right")
 
 FULL JOIN
 ~~~~~~~~~
-pandas also allows for FULL JOINs, which display both sides of the dataset, whether or not the
-joined columns find a match. As of writing, FULL JOINs are not supported in all RDBMS (MySQL).
+pandas also allows for ``FULL JOIN``\s, which display both sides of the dataset, whether or not the
+joined columns find a match. As of writing, ``FULL JOIN``\s are not supported in all RDBMS (MySQL).
+
+Show all records from both tables.
 
 .. code-block:: sql
 
-    -- show all records from both tables
     SELECT *
     FROM df1
     FULL OUTER JOIN df2
     ON df1.key = df2.key;
 
 .. ipython:: python
 
-    # show all records from both frames
     pd.merge(df1, df2, on="key", how="outer")
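
To see which keys each outer variant keeps, the three calls can be run side by side. This is a sketch under assumptions: ``df1`` and ``df2`` below are small made-up stand-ins, not the frames built earlier in the guide. The ``indicator=True`` argument of ``pd.merge`` adds a ``_merge`` column recording where each row came from.

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2.
df1 = pd.DataFrame({"key": ["A", "B", "C"], "value": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["B", "C", "D"], "value": [20, 30, 40]})

# The overlapping "value" column gets the default suffixes "_x" and "_y".
left = pd.merge(df1, df2, on="key", how="left")    # every key from df1
right = pd.merge(df1, df2, on="key", how="right")  # every key from df2

# indicator=True adds a "_merge" column: left_only, right_only or both.
outer = pd.merge(df1, df2, on="key", how="outer", indicator=True)
origin = dict(zip(outer["key"], outer["_merge"].astype(str)))
```

Rows without a match on the other side are filled with ``NaN``, which is how pandas represents the ``NULL`` values a SQL outer join would return.
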
 
 
 UNION
 -----
-UNION ALL can be performed using :meth:`~pandas.concat`.
+
+``UNION ALL`` can be performed using :meth:`~pandas.concat`.
 
 .. ipython:: python
 
@@ -340,7 +343,7 @@ UNION ALL can be performed using :meth:`~pandas.concat`.
 
     pd.concat([df1, df2])
 
-SQL's UNION is similar to UNION ALL, however UNION will remove duplicate rows.
+SQL's ``UNION`` is similar to ``UNION ALL``, however ``UNION`` will remove duplicate rows.
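
One way to reproduce the distinction in pandas is to chain ``drop_duplicates`` after ``concat``. A minimal sketch, assuming two made-up frames that share one identical row (the data is hypothetical):

```python
import pandas as pd

# Hypothetical frames sharing one identical row ("Chicago", 1).
df1 = pd.DataFrame({"city": ["Chicago", "San Francisco"], "rank": [1, 2]})
df2 = pd.DataFrame({"city": ["Chicago", "Boston"], "rank": [1, 4]})

# UNION ALL: keep every row, duplicates included.
union_all = pd.concat([df1, df2], ignore_index=True)

# UNION: drop rows that are identical across all columns.
union = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)
```
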
 
 .. code-block:: sql
 
@@ -456,7 +459,7 @@ the same using ``rank(method='first')`` function
 Let's find tips with (rank < 3) per gender group for (tips < 2).
 Notice that when using ``rank(method='min')`` function
 ``rnk_min`` remains the same for the same ``tip``
-(as Oracle's RANK() function)
+(as Oracle's ``RANK()`` function)
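
The effect of the ``method`` argument is easiest to see on a tie. The sketch below uses a made-up ``Series`` rather than the ``tips`` data, and compares ``method="min"`` with ``method="first"``:

```python
import pandas as pd

# Hypothetical tips with a tie at 1.5.
tip = pd.Series([1.0, 1.5, 1.5, 2.0])

# method="min": tied values share the lowest rank of their group,
# and the next rank is skipped.
rank_min = tip.rank(method="min")

# method="first": ties are broken by order of appearance,
# so every row gets a distinct rank.
rank_first = tip.rank(method="first")
```
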
 
 .. ipython:: python
 
@@ -489,7 +492,7 @@ DELETE
     DELETE FROM tips
     WHERE tip > 9;
 
-In pandas we select the rows that should remain, instead of deleting them
+In pandas we select the rows that should remain instead of deleting them:
 
 .. ipython:: python
 
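
The selection approach described in this hunk can be sketched on a small made-up frame (the values are hypothetical). The second variant, based on ``DataFrame.drop``, is shown only as an alternative that reads closer to the SQL statement; it is not part of the change above.

```python
import pandas as pd

# Hypothetical sample rows.
tips = pd.DataFrame({"tip": [1.01, 9.50, 3.35, 10.00]})

# Keep the complement of the DELETE condition (tip > 9).
kept = tips.loc[tips["tip"] <= 9]

# Alternative: drop the matching rows by their index labels.
dropped = tips.drop(tips[tips["tip"] > 9].index)
```

Both produce the same two rows here. They would differ if ``tip`` contained missing values, since ``NaN <= 9`` is ``False`` and such rows would be removed by the first form but kept by the second.
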