@@ -210,7 +210,7 @@ as an attribute:
  See `here for an explanation of valid identifiers
  <https://docs.python.org/3/reference/lexical_analysis.html#identifiers>`__.

- - The attribute will not be available if it conflicts with an existing method name, e.g. ``s.min`` is not allowed.
+ - The attribute will not be available if it conflicts with an existing method name, e.g. ``s.min`` is not allowed, but ``s['min']`` is possible.

  - Similarly, the attribute will not be available if it conflicts with any of the following list: ``index``,
  ``major_axis``, ``minor_axis``, ``items``.
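Review note: the changed bullet can be checked with a small sketch. The Series below is made up for illustration and does not come from the docs. It suggests that a label colliding with a method name stays reachable through ``[]``, while attribute access resolves to the method.

```python
import pandas as pd

# Hypothetical Series whose labels collide with the Series.min / Series.max methods.
s = pd.Series([1, 2, 3], index=['min', 'max', 'other'])

by_key = s['min']   # bracket access looks up the label 'min'
by_attr = s.min     # attribute access resolves to the bound method, not the label

print(by_key)
print(callable(by_attr))
```

Labels that do not collide with anything, such as ``other`` here, remain available as attributes.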
@@ -540,7 +540,7 @@ The ``callable`` must be a function with one argument (the calling Series or Dat
  columns=list('ABCD'))
  df1

- df1.loc[lambda df: df.A > 0, :]
+ df1.loc[lambda df: df['A'] > 0, :]
  df1.loc[:, lambda df: ['A', 'B']]

  df1.iloc[:, lambda df: [0, 1]]
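Review note: a minimal sketch of what the callable form does, assuming a small random frame (the seed, shape, and index are illustrative choices, not the docs' setup). The callable receives the calling DataFrame and its result is used as the indexer, so it should select the same rows as the explicit boolean mask.

```python
import numpy as np
import pandas as pd

# Illustrative data; values and index labels are assumptions for this sketch.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((6, 4)),
                   index=list('abcdef'),
                   columns=list('ABCD'))

# The lambda is called with df1 and returns a boolean Series used for row selection.
via_callable = df1.loc[lambda df: df['A'] > 0, :]

# Equivalent selection with the mask computed up front.
via_mask = df1.loc[df1['A'] > 0, :]

print(via_callable.equals(via_mask))
```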
@@ -552,7 +552,7 @@ You can use callable indexing in ``Series``.

  .. ipython:: python

- df1.A.loc[lambda s: s > 0]
+ df1['A'].loc[lambda s: s > 0]

  Using these methods / indexers, you can chain data selection operations
  without using a temporary variable.
@@ -561,7 +561,7 @@ without using a temporary variable.

  bb = pd.read_csv('data/baseball.csv', index_col='id')
  (bb.groupby(['year', 'team']).sum()
- .loc[lambda df: df.r > 100])
+ .loc[lambda df: df['r'] > 100])

  .. _indexing.deprecate_ix:

@@ -871,9 +871,9 @@ Boolean indexing
  Another common operation is the use of boolean vectors to filter the data.
  The operators are: ``|`` for ``or``, ``&`` for ``and``, and ``~`` for ``not``.
  These **must** be grouped by using parentheses, since by default Python will
- evaluate an expression such as ``df.A > 2 & df.B < 3`` as
- ``df.A > (2 & df.B) < 3``, while the desired evaluation order is
- ``(df.A > 2) & (df.B < 3)``.
+ evaluate an expression such as ``df['A'] > 2 & df['B'] < 3`` as
+ ``df['A'] > (2 & df['B']) < 3``, while the desired evaluation order is
+ ``(df['A'] > 2) & (df['B'] < 3)``.

  Using a boolean vector to index a Series works exactly as in a NumPy ndarray:

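Review note: the precedence point in this hunk can be seen with a small sketch. The frame is made up for illustration. Without parentheses, ``&`` binds before the comparisons, Python then treats the result as a chained comparison, and evaluating a Series in boolean context raises ``ValueError``.

```python
import pandas as pd

# Hypothetical integer data, chosen so that 2 & df['B'] is well defined.
df = pd.DataFrame({'A': [1, 3, 5], 'B': [4, 2, 0]})

# Parenthesized form: each comparison is evaluated first, then combined element-wise.
good = df[(df['A'] > 2) & (df['B'] < 3)]

# Unparenthesized form: parsed as df['A'] > (2 & df['B']) < 3, a chained comparison
# that implicitly calls bool() on a Series.
try:
    df[df['A'] > 2 & df['B'] < 3]
    raised = False
except ValueError:
    raised = True

print(good)
print(raised)
```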
@@ -1134,7 +1134,7 @@ between the values of columns ``a`` and ``c``. For example:
  df

  # pure python
- df[(df.a < df.b) & (df.b < df.c)]
+ df[(df['a'] < df['b']) & (df['b'] < df['c'])]

  # query
  df.query('(a < b) & (b < c)')
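Review note: a short sketch, assuming a random integer frame (the seed and size are illustrative, not the docs' setup), to check that the bracket-based mask and the ``query`` string select the same rows.

```python
import numpy as np
import pandas as pd

# Illustrative data; the docs build their frame differently.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(10, 3)), columns=list('abc'))

# Pure Python: explicit column lookups combined with element-wise &.
pure = df[(df['a'] < df['b']) & (df['b'] < df['c'])]

# query: column names are resolved inside the expression string.
queried = df.query('(a < b) & (b < c)')

print(pure.equals(queried))
```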
@@ -1241,7 +1241,7 @@ Full numpy-like syntax:
  df = pd.DataFrame(np.random.randint(n, size=(n, 3)), columns=list('abc'))
  df
  df.query('(a < b) & (b < c)')
- df[(df.a < df.b) & (df.b < df.c)]
+ df[(df['a'] < df['b']) & (df['b'] < df['c'])]

  Slightly nicer by removing the parentheses (by making comparison
  operators bind tighter than ``&`` and ``|``).
@@ -1279,12 +1279,12 @@ The ``in`` and ``not in`` operators
  df.query('a in b')

  # How you'd do it in pure Python
- df[df.a.isin(df.b)]
+ df[df['a'].isin(df['b'])]

  df.query('a not in b')

  # pure Python
- df[~df.a.isin(df.b)]
+ df[~df['a'].isin(df['b'])]


  You can combine this with other expressions for very succinct queries:
@@ -1297,7 +1297,7 @@ You can combine this with other expressions for very succinct queries:
  df.query('a in b and c < d')

  # pure Python
- df[df.b.isin(df.a) & (df.c < df.d)]
+ df[df['b'].isin(df['a']) & (df['c'] < df['d'])]


  .. note::
@@ -1326,7 +1326,7 @@ to ``in``/``not in``.
  df.query('b == ["a", "b", "c"]')

  # pure Python
- df[df.b.isin(["a", "b", "c"])]
+ df[df['b'].isin(["a", "b", "c"])]

  df.query('c == [1, 2]')

@@ -1338,7 +1338,7 @@ to ``in``/``not in``.
  df.query('[1, 2] not in c')

  # pure Python
- df[df.c.isin([1, 2])]
+ df[df['c'].isin([1, 2])]


  Boolean operators
@@ -1352,7 +1352,7 @@ You can negate boolean expressions with the word ``not`` or the ``~`` operator.
  df['bools'] = np.random.rand(len(df)) > 0.5
  df.query('~bools')
  df.query('not bools')
- df.query('not bools') == df[~df.bools]
+ df.query('not bools') == df[~df['bools']]

  Of course, expressions can be arbitrarily complex too:

@@ -1362,7 +1362,10 @@ Of course, expressions can be arbitrarily complex too:
  shorter = df.query('a < b < c and (not bools) or bools > 2')

  # equivalent in pure Python
- longer = df[(df.a < df.b) & (df.b < df.c) & (~df.bools) | (df.bools > 2)]
+ longer = df[(df['a'] < df['b'])
+ & (df['b'] < df['c'])
+ & (~df['bools'])
+ | (df['bools'] > 2)]

  shorter
  longer
@@ -1835,14 +1838,14 @@ chained indexing expression, you can set the :ref:`option <options>`

  # This will show the SettingWithCopyWarning
  # but the frame values will be set
- dfb['c'][dfb.a.str.startswith('o')] = 42
+ dfb['c'][dfb['a'].str.startswith('o')] = 42

  This however is operating on a copy and will not work.

  ::

  >>> pd.set_option('mode.chained_assignment','warn')
- >>> dfb[dfb.a.str.startswith('o')]['c'] = 42
+ >>> dfb[dfb['a'].str.startswith('o')]['c'] = 42
  Traceback (most recent call last)
  ...
  SettingWithCopyWarning:
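Review note: for contrast with the chained forms in this hunk, here is a minimal sketch of the single-step ``.loc`` assignment, which avoids the intermediate object. The frame contents are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical frame; column names follow the hunk, the values are made up.
dfb = pd.DataFrame({'a': ['one', 'one', 'two', 'three', 'two', 'one', 'six'],
                    'c': list(range(7))})

# One indexing operation: the row mask and the column label go to .loc together,
# so the assignment targets dfb itself rather than a temporary copy.
dfb.loc[dfb['a'].str.startswith('o'), 'c'] = 42

print(dfb)
```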