Commit db69ce0

Harmonize column selection to bracket notation
As suggested by https://medium.com/dunder-data/minimally-sufficient-pandas-a8e67f2a2428#46f9
1 parent 01babb5 commit db69ce0
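
The motivation for preferring bracket notation can be illustrated with a minimal sketch. It is not part of the commit; the frame contents and variable names below are made up for illustration. It shows that assigning a list through attribute access sets a plain Python attribute rather than creating a column, and that attribute access can be shadowed by an existing DataFrame method such as `count`.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"one": [1.0, 2.0, 3.0]})

# Attribute assignment of a list-like does not create a column; pandas
# emits a warning and stores the list as an ordinary instance attribute.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df.two = [4, 5, 6]

created_by_attribute = "two" in df.columns

# Bracket assignment creates the column as intended.
df["two"] = [4, 5, 6]
created_by_brackets = "two" in df.columns

# A column whose name collides with a DataFrame method is only reachable
# with brackets; attribute access returns the method instead.
clash = pd.DataFrame({"count": [1, 2]})
attribute_is_method = callable(clash.count)
bracket_is_series = isinstance(clash["count"], pd.Series)
```

Bracket notation behaves the same way for reading and writing, which is why it is the safer default in documentation examples.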

1 file changed: +14 -14 lines changed

doc/source/user_guide/indexing.rst

@@ -236,7 +236,7 @@ new column. In 0.21.0 and later, this will raise a ``UserWarning``:
 .. code-block:: ipython

 In [1]: df = pd.DataFrame({'one': [1., 2., 3.]})
-In [2]: df.two = [4, 5, 6]
+In [2]: df['two'] = [4, 5, 6]
 UserWarning: Pandas doesn't allow Series to be assigned into nonexistent columns - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute_access
 In [3]: df
 Out[3]:
@@ -540,7 +540,7 @@ The ``callable`` must be a function with one argument (the calling Series or Dat
 columns=list('ABCD'))
 df1

-df1.loc[lambda df: df.A > 0, :]
+df1.loc[lambda df: df['A'] > 0, :]
 df1.loc[:, lambda df: ['A', 'B']]

 df1.iloc[:, lambda df: [0, 1]]
@@ -561,7 +561,7 @@ without using a temporary variable.

 bb = pd.read_csv('data/baseball.csv', index_col='id')
 (bb.groupby(['year', 'team']).sum()
-.loc[lambda df: df.r > 100])
+.loc[lambda df: df['r'] > 100])

 .. _indexing.deprecate_ix:

@@ -871,9 +871,9 @@ Boolean indexing
 Another common operation is the use of boolean vectors to filter the data.
 The operators are: ``|`` for ``or``, ``&`` for ``and``, and ``~`` for ``not``.
 These **must** be grouped by using parentheses, since by default Python will
-evaluate an expression such as ``df.A > 2 & df.B < 3`` as
-``df.A > (2 & df.B) < 3``, while the desired evaluation order is
-``(df.A > 2) & (df.B < 3)``.
+evaluate an expression such as ``df['A'] > 2 & df['B'] < 3`` as
+``df['A'] > (2 & df['B']) < 3``, while the desired evaluation order is
+``(df['A'] > 2) & (df['B'] < 3)``.

 Using a boolean vector to index a Series works exactly as in a NumPy ndarray:

@@ -1134,7 +1134,7 @@ between the values of columns ``a`` and ``c``. For example:
 df

 # pure python
-df[(df.a < df.b) & (df.b < df.c)]
+df[(df['a'] < df['b']) & (df['b'] < df['c'])]

 # query
 df.query('(a < b) & (b < c)')
@@ -1241,7 +1241,7 @@ Full numpy-like syntax:
 df = pd.DataFrame(np.random.randint(n, size=(n, 3)), columns=list('abc'))
 df
 df.query('(a < b) & (b < c)')
-df[(df.a < df.b) & (df.b < df.c)]
+df[(df['a'] < df['b']) & (df['b'] < df['c'])]

 Slightly nicer by removing the parentheses (by making comparison
 operators bind tighter than ``&`` and ``|``).
@@ -1279,12 +1279,12 @@ The ``in`` and ``not in`` operators
 df.query('a in b')

 # How you'd do it in pure Python
-df[df.a.isin(df.b)]
+df[df['a'].isin(df['b'])]

 df.query('a not in b')

 # pure Python
-df[~df.a.isin(df.b)]
+df[~df['a'].isin(df['b'])]


 You can combine this with other expressions for very succinct queries:
@@ -1297,7 +1297,7 @@ You can combine this with other expressions for very succinct queries:
 df.query('a in b and c < d')

 # pure Python
-df[df.b.isin(df.a) & (df.c < df.d)]
+df[df['b'].isin(df['a']) & (df['c'] < df['d'])]


 .. note::
@@ -1326,7 +1326,7 @@ to ``in``/``not in``.
 df.query('b == ["a", "b", "c"]')

 # pure Python
-df[df.b.isin(["a", "b", "c"])]
+df[df['b'].isin(["a", "b", "c"])]

 df.query('c == [1, 2]')

@@ -1338,7 +1338,7 @@ to ``in``/``not in``.
 df.query('[1, 2] not in c')

 # pure Python
-df[df.c.isin([1, 2])]
+df[df['c'].isin([1, 2])]


 Boolean operators
@@ -1362,7 +1362,7 @@ Of course, expressions can be arbitrarily complex too:
 shorter = df.query('a < b < c and (not bools) or bools > 2')

 # equivalent in pure Python
-longer = df[(df.a < df.b) & (df.b < df.c) & (~df.bools) | (df.bools > 2)]
+longer = df[(df['a'] < df['b']) & (df['b'] < df['c']) & (~df.bools) | (df.bools > 2)]

 shorter
 longer
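
The operator-precedence point made in the "Boolean indexing" hunk can be checked with a short sketch. It is not part of the commit, and the sample values are assumed for illustration. It compares the parenthesised mask with the unparenthesised form, which Python parses as a chained comparison and which therefore fails when pandas is asked for the truth value of a Series.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [4, 2, 0]})

# Parenthesised: each comparison is evaluated first, then combined with &.
mask = (df["A"] > 2) & (df["B"] < 3)
filtered = df[mask]

# The same filter written with query, using the "and" keyword.
queried = df.query("A > 2 and B < 3")

# Unparenthesised: & binds tighter than the comparisons, so Python reads
# this as df["A"] > (2 & df["B"]) < 3, a chained comparison that needs the
# truth value of a whole Series and raises ValueError.
error = None
try:
    df["A"] > 2 & df["B"] < 3
except ValueError as exc:
    error = exc
```

Here the parenthesised mask and the `query` call select the same rows, while the unparenthesised expression never produces a mask at all.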
