@@ -357,95 +357,6 @@ warn_bad_lines : boolean, default ``True``
   If error_bad_lines is ``False``, and warn_bad_lines is ``True``, a warning for
   each "bad line" will be output (only valid with C parser).

-.. ipython:: python
-   :suppress:
-
-   f = open('foo.csv','w')
-   f.write('date,A,B,C\n20090101,a,1,2\n20090102,b,3,4\n20090103,c,4,5')
-   f.close()
-
-Consider a typical CSV file containing, in this case, some time series data:
-
-.. ipython:: python
-
-   print(open('foo.csv').read())
-
-The default for `read_csv` is to create a DataFrame with simple numbered rows:
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv')
-
-In the case of indexed data, you can pass the column number or column name you
-wish to use as the index:
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv', index_col=0)
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv', index_col='date')
-
-You can also use a list of columns to create a hierarchical index:
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv', index_col=[0, 'A'])
-
-.. _io.dialect:
-
-The ``dialect`` keyword gives greater flexibility in specifying the file format.
-By default it uses the Excel dialect but you can specify either the dialect name
-or a :class:`python:csv.Dialect` instance.
-
-.. ipython:: python
-   :suppress:
-
-   data = ('label1,label2,label3\n'
-           'index1,"a,c,e\n'
-           'index2,b,d,f')
-
-Suppose you had data with unenclosed quotes:
-
-.. ipython:: python
-
-   print(data)
-
-By default, ``read_csv`` uses the Excel dialect and treats the double quote as
-the quote character, which causes it to fail when it finds a newline before it
-finds the closing double quote.
-
-We can get around this using ``dialect``
-
-.. ipython:: python
-   :okwarning:
-
-   dia = csv.excel()
-   dia.quoting = csv.QUOTE_NONE
-   pd.read_csv(StringIO(data), dialect=dia)
-
-All of the dialect options can be specified separately by keyword arguments:
-
-.. ipython:: python
-
-   data = 'a,b,c~1,2,3~4,5,6'
-   pd.read_csv(StringIO(data), lineterminator='~')
-
-Another common dialect option is ``skipinitialspace``, to skip any whitespace
-after a delimiter:
-
-.. ipython:: python
-
-   data = 'a, b, c\n1, 2, 3\n4, 5, 6'
-   print(data)
-   pd.read_csv(StringIO(data), skipinitialspace=True)
-
-The parsers make every attempt to "do the right thing" and not be very
-fragile. Type inference is a pretty big deal. So if a column can be coerced to
-integer dtype without altering the contents, it will do so. Any non-numeric
-columns will come through as object dtype as with the rest of pandas objects.
-
 .. _io.dtypes:

 Specifying column data types
@@ -1239,6 +1150,62 @@ data that appear in some lines but not others:
 1 4 5 6
 2 8 9 10

+.. _io.dialect:
+
+Dialect
+'''''''
+
+The ``dialect`` keyword gives greater flexibility in specifying the file format.
+By default it uses the Excel dialect but you can specify either the dialect name
+or a :class:`python:csv.Dialect` instance.
+
+.. ipython:: python
+   :suppress:
+
+   data = ('label1,label2,label3\n'
+           'index1,"a,c,e\n'
+           'index2,b,d,f')
+
+Suppose you had data with unenclosed quotes:
+
+.. ipython:: python
+
+   print(data)
+
+By default, ``read_csv`` uses the Excel dialect and treats the double quote as
+the quote character, which causes it to fail when it finds a newline before it
+finds the closing double quote.
+
+We can get around this using ``dialect``
+
+.. ipython:: python
+   :okwarning:
+
+   dia = csv.excel()
+   dia.quoting = csv.QUOTE_NONE
+   pd.read_csv(StringIO(data), dialect=dia)
+
+All of the dialect options can be specified separately by keyword arguments:
+
+.. ipython:: python
+
+   data = 'a,b,c~1,2,3~4,5,6'
+   pd.read_csv(StringIO(data), lineterminator='~')
+
+Another common dialect option is ``skipinitialspace``, to skip any whitespace
+after a delimiter:
+
+.. ipython:: python
+
+   data = 'a, b, c\n1, 2, 3\n4, 5, 6'
+   print(data)
+   pd.read_csv(StringIO(data), skipinitialspace=True)
+
+The parsers make every attempt to "do the right thing" and not be very
+fragile. Type inference is a pretty big deal. So if a column can be coerced to
+integer dtype without altering the contents, it will do so. Any non-numeric
+columns will come through as object dtype as with the rest of pandas objects.
+
 .. _io.quoting:

 Quoting and Escape Characters
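
The dialect options described in the relocated Dialect section can also be observed without pandas. What follows is a minimal sketch that uses only the standard-library ``csv`` module (an assumption made for illustration; the patch itself calls ``pd.read_csv``) on the same sample strings as the diff. The class name ``NoQuoting`` is hypothetical and does not appear in the patch.

```python
import csv
from io import StringIO

# Same sample as the Dialect section: the quote before "a" is never closed.
data = ('label1,label2,label3\n'
        'index1,"a,c,e\n'
        'index2,b,d,f')

# Default Excel dialect: '"' opens a quoted field, so the text that follows
# it (including the newline) is read as part of that single field.
default_rows = list(csv.reader(StringIO(data)))

# Hypothetical helper dialect: Excel rules with quote handling switched off,
# mirroring the `dia.quoting = csv.QUOTE_NONE` step in the patch.
class NoQuoting(csv.excel):
    quoting = csv.QUOTE_NONE

plain_rows = list(csv.reader(StringIO(data), dialect=NoQuoting))

# skipinitialspace drops whitespace that directly follows a delimiter.
spaced = 'a, b, c\n1, 2, 3\n4, 5, 6'
kept = list(csv.reader(StringIO(spaced)))
trimmed = list(csv.reader(StringIO(spaced), skipinitialspace=True))

print(len(default_rows), len(plain_rows))
print(plain_rows[1])
print(kept[0], trimmed[0])
```

The standard-library reader is lenient by default and folds the unterminated field into one value rather than raising, so its behaviour on the first sample may differ from the ``read_csv`` failure the documentation text describes. An instance such as ``NoQuoting()`` should be usable with the ``dialect`` keyword in the same way as the ``csv.excel()`` instance in the patch.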