@@ -120,6 +120,11 @@ The default for `read_csv` is to create a DataFrame with simple numbered rows:
In the case of indexed data, you can pass the column number (or a list of
column numbers, for a hierarchical index) you wish to use as the index.

+ The parsers make every attempt to "do the right thing" and not be very
+ fragile. Type inference is a pretty big deal. So if a column can be coerced to
+ integer dtype without altering the contents, it will do so. Any non-numeric
+ columns will come through as object dtype as with the rest of pandas objects.
+
.. _io.parse_dates:

To better facilitate working with datetime data, :func:`~pandas.io.parsers.read_csv` and :func:`~pandas.io.parsers.read_table`
@@ -142,35 +147,68 @@ The simplest case is to just pass in ``parse_dates=True``:

   os.remove('foo.csv')

- You can specify a custom ``date_parser`` function:
+ It is often the case that we may want to store date and time data separately,
+ or store various date fields separately. The ``parse_dates`` keyword can be
+ used to specify a combination of columns to parse the dates and/or times from.
+
+ You can specify a list of column lists to ``parse_dates``. The resulting date
+ columns will be prepended to the output, and the new column names will be the
+ concatenation of the component column names:

.. ipython:: python
   :suppress:
-    # data = """
+
+    data = ("KORD,19990127, 19:00:00, 18:56:00, 0.8100\n"
+            "KORD,19990127, 20:00:00, 19:56:00, 0.0100\n"
+            "KORD,19990127, 21:00:00, 20:56:00, -0.5900\n"
+            "KORD,19990127, 21:00:00, 21:18:00, -0.9900\n"
+            "KORD,19990127, 22:00:00, 21:56:00, -0.5900\n"
+            "KORD,19990127, 23:00:00, 22:56:00, -0.5900")
+
   with open('tmp.csv', 'w') as fh:
       fh.write(data)

.. ipython:: python

-    # read it in
+    print(open('tmp.csv').read())
+    df = read_csv('tmp.csv', header=None, parse_dates=[[1, 2], [1, 3]])
+    df
+
+ By default the parser removes the component date columns, but you can choose
+ to retain them via the ``keep_date_col`` keyword:

.. ipython:: python
-    :suppress:
-    os.remove('tmp.csv')

- It is often the case that we may want to store date and time data separately,
- or store various date fields separately. the ``parse_dates`` keyword can be
- used to specify a combination of columns to parse the dates and/or times from.
+    df = read_csv('tmp.csv', header=None, parse_dates=[[1, 2], [1, 3]],
+                  keep_date_col=True)
+    df

- You can specify a list of column lists to ``parse_dates``, the resulting date
- columns will be prepended to the output and the new column names will be the
- component column names
+ You can also use a dict to specify custom names for the combined columns:

+ .. ipython:: python

- The parsers make every attempt to "do the right thing" and not be very
- fragile. Type inference is a pretty big deal. So if a column can be coerced to
- integer dtype without altering the contents, it will do so. Any non-numeric
- columns will come through as object dtype as with the rest of pandas objects.
+    date_spec = {'nominal': [1, 2], 'actual': [1, 3]}
+    df = read_csv('tmp.csv', header=None, parse_dates=date_spec)
+    df
+
+ Finally, the parser allows you to specify a custom ``date_parser`` function to
+ take full advantage of the flexibility of the date parsing API:
+
+ .. ipython:: python
+
+    import pandas.io.date_converters as conv
+    df = read_csv('tmp.csv', header=None, parse_dates=date_spec,
+                  date_parser=conv.parse_date_time)
+    df
+
+ You can explore the date parsing functionality in ``date_converters.py`` and
+ add your own. We would love to turn this module into a community-supported set
+ of date/time parsers.
+
+ .. ipython:: python
+    :suppress:
+
+    os.remove('tmp.csv')

.. _io.fwf:

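The column-combining behaviour this change documents can be illustrated without pandas. The following is a minimal standard-library sketch, not the parser's actual implementation: the helper name `combine_date_columns`, its signature, and the `%Y%m%d %H:%M:%S` format string are assumptions made for illustration only. It mirrors the dict form of ``parse_dates`` (new name mapped to component column positions) and the ``keep_date_col`` switch described in the added text.

```python
import csv
import io
from datetime import datetime

# Two rows of the sample data written to tmp.csv in the diff above.
DATA = (
    "KORD,19990127, 19:00:00, 18:56:00, 0.8100\n"
    "KORD,19990127, 20:00:00, 19:56:00, 0.0100\n"
)


def combine_date_columns(rows, date_spec, keep_date_col=False):
    """Hypothetical helper (not a pandas function).

    date_spec maps a new column name to the positions of the columns whose
    text is joined with a space and parsed into one datetime. The combined
    columns come first in each record; the component columns are dropped
    unless keep_date_col is True.
    """
    used = {i for cols in date_spec.values() for i in cols}
    records = []
    for row in rows:
        record = {}
        # Combined date columns are inserted first, so they lead the record.
        for name, cols in date_spec.items():
            text = " ".join(row[i].strip() for i in cols)
            # Assumed format, chosen to match the sample data only.
            record[name] = datetime.strptime(text, "%Y%m%d %H:%M:%S")
        # Remaining columns keep their positional labels.
        for i, value in enumerate(row):
            if keep_date_col or i not in used:
                record[i] = value.strip()
        records.append(record)
    return records


rows = list(csv.reader(io.StringIO(DATA)))
records = combine_date_columns(rows, {"nominal": [1, 2], "actual": [1, 3]})
print(records[0]["nominal"], records[0]["actual"])
```

Passing `keep_date_col=True` to the sketch retains columns 1, 2 and 3 alongside the combined ones, which is the same trade-off the keyword of that name offers in the documented API.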