@@ -196,7 +196,8 @@ Consider a typical CSV file containing, in this case, some time series data:

.. ipython:: python

-   print(open('foo.csv').read())
+   with open('foo.csv') as fh:
+       print(fh.read())

The default for ``read_csv`` is to create a DataFrame with simple numbered rows:

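Not part of the patch itself, just a hedged sketch of the default indexing behaviour the context line above refers to; the inline data is made up, since ``foo.csv`` is not shown in this hunk::

    from io import StringIO

    import pandas as pd

    # made-up stand-in for foo.csv
    data = ("date,A,B,C\n"
            "20090101,a,1,2\n"
            "20090102,b,3,4\n"
            "20090103,c,4,5\n")

    # with no index_col, read_csv assigns a plain numbered (0, 1, 2, ...) index
    pd.read_csv(StringIO(data))
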
@@ -485,7 +486,8 @@ column names:

.. ipython:: python

-   print(open('tmp.csv').read())
+   with open('tmp.csv') as fh:
+       print(fh.read())
   df = pd.read_csv('tmp.csv', header=None, parse_dates=[[1, 2], [1, 3]])
   df

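A self-contained sketch (with invented data) of what the nested ``parse_dates`` lists in this hunk do: each inner list names columns that are joined and parsed into a single datetime column. This form is accepted by pandas of the era this patch targets; newer releases deprecate it::

    from io import StringIO

    import pandas as pd

    # made-up stand-in for tmp.csv: station, date, two time columns
    data = ("KORD,19990127, 19:00:00, 18:56:00\n"
            "KORD,19990127, 20:00:00, 19:56:00\n")

    # columns 1 and 2 become one parsed datetime column, columns 1 and 3 another
    df = pd.read_csv(StringIO(data), header=None, parse_dates=[[1, 2], [1, 3]])
    df
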
@@ -666,7 +668,8 @@ DD/MM/YYYY instead. For convenience, a ``dayfirst`` keyword is provided:

.. ipython:: python

-   print(open('tmp.csv').read())
+   with open('tmp.csv') as fh:
+       print(fh.read())

   pd.read_csv('tmp.csv', parse_dates=[0])
   pd.read_csv('tmp.csv', dayfirst=True, parse_dates=[0])
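
For reference, a minimal, self-contained illustration of the ``dayfirst`` behaviour this hunk documents, using invented ambiguous dates rather than the real ``tmp.csv``::

    from io import StringIO

    import pandas as pd

    # made-up data where day/month order is ambiguous
    data = ("date,value\n"
            "01/12/2011,5\n"
            "02/12/2011,10\n")

    # default parsing reads 01/12/2011 as January 12th ...
    pd.read_csv(StringIO(data), parse_dates=[0])
    # ... dayfirst=True reads it as December 1st
    pd.read_csv(StringIO(data), dayfirst=True, parse_dates=[0])
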
@@ -694,7 +697,8 @@ By default, numbers with a thousands separator will be parsed as strings

.. ipython:: python

-   print(open('tmp.csv').read())
+   with open('tmp.csv') as fh:
+       print(fh.read())
   df = pd.read_csv('tmp.csv', sep='|')
   df

@@ -704,7 +708,8 @@ The ``thousands`` keyword allows integers to be parsed correctly

.. ipython:: python

-   print(open('tmp.csv').read())
+   with open('tmp.csv') as fh:
+       print(fh.read())
   df = pd.read_csv('tmp.csv', sep='|', thousands=',')
   df

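A hedged, self-contained sketch of the two behaviours covered by this and the previous hunk; the data is invented but mirrors the ``'|'``-separated layout the diff uses::

    from io import StringIO

    import pandas as pd

    # made-up stand-in for tmp.csv with ',' as a thousands marker
    data = ("ID|level|category\n"
            "Patient1|123,000|x\n"
            "Patient2|23,000|y\n")

    # without the thousands keyword the level column stays object (string) dtype ...
    pd.read_csv(StringIO(data), sep='|').dtypes
    # ... with thousands=',' it is parsed as integers
    pd.read_csv(StringIO(data), sep='|', thousands=',').dtypes
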
@@ -781,7 +786,8 @@ Sometimes comments or meta data may be included in a file:

.. ipython:: python

-   print(open('tmp.csv').read())
+   with open('tmp.csv') as fh:
+       print(fh.read())

By default, the parser includes the comments in the output:

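The hunk only shows the file dump; as a rough sketch of what keeping the comments in the output means, and of the ``comment`` keyword as one way to discard them (the data below is invented)::

    from io import StringIO

    import pandas as pd

    # made-up file with trailing comments on the data rows
    data = ("ID,level,category\n"
            "Patient1,123000,x # first comment\n"
            "Patient2,23000,y # second comment\n")

    # by default the comment text is kept as part of the last column ...
    pd.read_csv(StringIO(data))
    # ... comment='#' tells the parser to drop everything after the marker
    pd.read_csv(StringIO(data), comment='#')
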
@@ -821,7 +827,8 @@ as a ``Series``:

.. ipython:: python

-   print(open('tmp.csv').read())
+   with open('tmp.csv') as fh:
+       print(fh.read())

   output = pd.read_csv('tmp.csv', squeeze=True)
   output
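
A self-contained sketch of the ``squeeze`` behaviour described around this hunk, with invented data; ``squeeze=True`` is valid for the pandas versions this patch targets but was removed in pandas 2.0 in favour of ``DataFrame.squeeze("columns")``::

    from io import StringIO

    import pandas as pd

    # made-up one-value-column file; the unnamed first column becomes the index
    data = ("level\n"
            "Patient1,123000\n"
            "Patient2,23000\n"
            "Patient3,1234018\n")

    # squeeze=True collapses the single remaining column into a Series
    output = pd.read_csv(StringIO(data), squeeze=True)
    type(output)
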
@@ -933,7 +940,8 @@ Consider a typical fixed-width data file:

.. ipython:: python

-   print(open('bar.csv').read())
+   with open('bar.csv') as fh:
+       print(fh.read())

In order to parse this file into a DataFrame, we simply need to supply the
column specifications to the ``read_fwf`` function along with the file name:
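
Since the ``read_fwf`` call itself falls outside this hunk, here is a hedged sketch of supplying column specifications; both the data and the exact character positions are assumptions::

    from io import StringIO

    import pandas as pd

    # made-up fixed-width stand-in for bar.csv
    data = ("id8141    360.242940   149.910199   11950.7\n"
            "id1594    444.953632   166.985655   11788.4\n")

    # colspecs are half-open (from, to) character intervals for each field
    colspecs = [(0, 6), (8, 20), (21, 33), (34, 43)]
    pd.read_fwf(StringIO(data), colspecs=colspecs, header=None)
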
@@ -991,7 +999,8 @@ column:

.. ipython:: python

-   print(open('foo.csv').read())
+   with open('foo.csv') as fh:
+       print(fh.read())

In this special case, ``read_csv`` assumes that the first column is to be used
as the index of the DataFrame:
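
A minimal illustration (invented data) of the implicit-index rule stated above: when the header row has one fewer entry than the data rows, the leftover first column becomes the index::

    from io import StringIO

    import pandas as pd

    # made-up stand-in for foo.csv: three header names, four fields per row
    data = ("A,B,C\n"
            "20090101,a,1,2\n"
            "20090102,b,3,4\n"
            "20090103,c,4,5\n")

    df = pd.read_csv(StringIO(data))
    df.index
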
@@ -1023,7 +1032,8 @@ Suppose you have data indexed by two columns:

.. ipython:: python

-   print(open('data/mindex_ex.csv').read())
+   with open('data/mindex_ex.csv') as fh:
+       print(fh.read())

The ``index_col`` argument to ``read_csv`` and ``read_table`` can take a list of
column numbers to turn multiple columns into a ``MultiIndex`` for the index of the
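
A self-contained sketch of the ``index_col`` list form described above; the two-level data here is invented, since ``data/mindex_ex.csv`` is not shown in the hunk::

    from io import StringIO

    import pandas as pd

    # made-up stand-in for data/mindex_ex.csv, keyed by year and indiv
    data = ("year,indiv,zit,xit\n"
            "1977,A,1.2,0.6\n"
            "1977,B,1.5,0.5\n"
            "1978,A,0.8,0.3\n"
            "1978,B,0.9,0.4\n")

    # a list of column positions builds a MultiIndex from those columns
    df = pd.read_csv(StringIO(data), index_col=[0, 1])
    df.index
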
@@ -1050,7 +1060,8 @@ of tupleizing columns, specify ``tupleize_cols=True``.
   from pandas.util.testing import makeCustomDataframe as mkdf
   df = mkdf(5, 3, r_idx_nlevels=2, c_idx_nlevels=4)
   df.to_csv('mi.csv')
-   print(open('mi.csv').read())
+   with open('mi.csv') as fh:
+       print(fh.read())
   pd.read_csv('mi.csv', header=[0, 1, 2, 3], index_col=[0, 1])

Starting in 0.13.0, ``read_csv`` will be able to interpret a more common format
@@ -1066,7 +1077,8 @@ of multi-columns indices.

.. ipython:: python

-   print(open('mi2.csv').read())
+   with open('mi2.csv') as fh:
+       print(fh.read())
   pd.read_csv('mi2.csv', header=[0, 1], index_col=0)

Note: If an ``index_col`` is not specified (e.g. you don't have an index, or wrote it
@@ -1097,8 +1109,9 @@ class of the csv module. For this, you have to specify ``sep=None``.

.. ipython:: python

-   print(open('tmp2.sv').read())
-   pd.read_csv('tmp2.sv', sep=None)
+   with open('tmp2.sv') as fh:
+       print(fh.read())
+   pd.read_csv('tmp2.sv', sep=None, engine='python')

.. _io.chunking:

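As a rough illustration of why the patch adds ``engine='python'``: delimiter sniffing with ``sep=None`` relies on ``csv.Sniffer`` and is only available in the Python parsing engine. The file contents below are an assumption, not the real ``tmp2.sv``::

    from io import StringIO

    import pandas as pd

    # made-up ':'-delimited stand-in for tmp2.sv
    data = ("0:1:2:3\n"
            "4:5:6:7\n"
            "8:9:10:11\n")

    # sep=None lets csv.Sniffer guess the delimiter; the C engine cannot do this
    pd.read_csv(StringIO(data), sep=None, engine='python', header=None)
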
@@ -1111,7 +1124,8 @@ rather than reading the entire file into memory, such as the following:

.. ipython:: python

-   print(open('tmp.sv').read())
+   with open('tmp.sv') as fh:
+       print(fh.read())
   table = pd.read_table('tmp.sv', sep='|')
   table

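The next hunk only shows the iteration half of the chunking example, so here is a hedged end-to-end sketch with invented data::

    from io import StringIO

    import pandas as pd

    # made-up '|'-separated stand-in for tmp.sv
    data = ("idx|A|B|C\n"
            "0|0.46|-0.24|-0.87\n"
            "1|-0.19|0.58|0.05\n"
            "2|1.14|-0.65|0.84\n"
            "3|0.43|-0.13|-0.17\n"
            "4|0.34|0.57|1.02\n")

    # chunksize turns the return value into a TextFileReader that yields
    # DataFrames of at most 4 rows each instead of one big frame
    reader = pd.read_table(StringIO(data), sep='|', chunksize=4)
    for chunk in reader:
        print(chunk)
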
@@ -1127,7 +1141,6 @@ value will be an iterable object of type ``TextFileReader``:
   for chunk in reader:
       print(chunk)

-
Specifying ``iterator=True`` will also return the ``TextFileReader`` object:

.. ipython:: python
@@ -1138,6 +1151,7 @@ Specifying ``iterator=True`` will also return the ``TextFileReader`` object:
.. ipython:: python
   :suppress:

+   reader = None
   os.remove('tmp.sv')
   os.remove('tmp2.sv')
