ENH: to_csv() date formatting #4313

qwhelan · 2013-07-22T05:09:16Z

This commit adds support for formatting datetime output from to_csv().
Closes #2583

In [3]: spx = DataReader('^GSPC', data_source='yahoo')

In [4]: spx.head()
Out[4]: 
               Open     High      Low    Close      Volume  Adj Close
Date                                                                 
2010-01-04  1116.56  1133.87  1116.56  1132.99  3991400000    1132.99
2010-01-05  1132.66  1136.63  1129.66  1136.52  2491020000    1136.52
2010-01-06  1135.71  1139.19  1133.95  1137.14  4972660000    1137.14
2010-01-07  1136.27  1142.46  1131.32  1141.69  5270680000    1141.69
2010-01-08  1140.52  1145.39  1136.22  1144.98  4389590000    1144.98

In [5]: spx.to_csv('spx_temp.csv', date_format='%Y%m%d')

In [6]: !head spx_temp.csv
Date,Open,High,Low,Close,Volume,Adj Close
20100104,1116.56,1133.87,1116.56,1132.99,3991400000,1132.99
20100105,1132.66,1136.63,1129.66,1136.52,2491020000,1136.52
20100106,1135.71,1139.19,1133.95,1137.14,4972660000,1137.14
20100107,1136.27,1142.46,1131.32,1141.69,5270680000,1141.69
20100108,1140.52,1145.39,1136.22,1144.98,4389590000,1144.98
20100111,1145.96,1149.74,1142.02,1146.98,4255780000,1146.98
20100112,1143.81,1143.81,1131.77,1136.22,4716160000,1136.22
20100113,1137.31,1148.4,1133.18,1145.68,4170360000,1145.68
20100114,1145.68,1150.41,1143.8,1148.46,3915200000,1148.46

The date_format= keyword will be applied to every element of a DatetimeIndex (index or columns) and DatetimeBlock (values). It works for both the Python engine and the new Cython engine:

In [7]: datetimes = DataFrame({spx.index[0]: spx.index}, index=spx.index).head()

In [8]: datetimes
Out[8]: 
                    2010-01-04
Date                          
2010-01-04 2010-01-04 00:00:00
2010-01-05 2010-01-05 00:00:00
2010-01-06 2010-01-06 00:00:00
2010-01-07 2010-01-07 00:00:00
2010-01-08 2010-01-08 00:00:00

In [9]: datetimes.to_csv('datetimes_temp.csv', date_format='%m/%d/%Y')

In [10]: !head datetimes_temp.csv
Date,01/04/2010
01/04/2010,01/04/2010
01/05/2010,01/05/2010
01/06/2010,01/06/2010
01/07/2010,01/07/2010
01/08/2010,01/08/2010

In [11]: datetimes.to_csv('datetimes_temp.csv', date_format='%m/%d/%Y', engine='python')

In [12]: !head datetimes_temp.csv
Date,01/04/2010
01/04/2010,01/04/2010
01/05/2010,01/05/2010
01/06/2010,01/06/2010
01/07/2010,01/07/2010
01/08/2010,01/08/2010
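For readers who want to try this without downloading data, here is a minimal sketch of the same idea. It assumes a current pandas release where `to_csv()` accepts `date_format=` and returns the CSV text when no path is given; the frame contents and the column names `when` and `close` are made up for illustration.

```python
import pandas as pd

# Small stand-in frame: a DatetimeIndex plus a datetime-valued column.
idx = pd.date_range("2010-01-04", periods=3, freq="D", name="Date")
df = pd.DataFrame({"when": idx, "close": [1132.99, 1136.52, 1137.14]}, index=idx)

# With no path argument, to_csv() returns the CSV as a string.
csv_text = df.to_csv(date_format="%Y%m%d")
csv_lines = csv_text.splitlines()

for line in csv_lines:
    print(line)
# Date,when,close
# 20100104,20100104,1132.99
# 20100105,20100105,1136.52
# 20100106,20100106,1137.14
```

Both the index and the datetime column pick up the format, which matches the behaviour described above for index and values.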

Let me know if there are any questions or issues.

qwhelan · 2013-07-22T06:20:03Z

@jreback @cpcloud Any chance this could make it in before the v0.12 release?

Also, I forgot to mention above that this doesn't handle MultiIndexes. I don't see a clean way to do it, but I can revisit it if someone requests it.

cpcloud · 2013-07-22T06:28:31Z

I think only bug fixes for now. Really trying to get v0.12 out ASAP!

qwhelan · 2013-07-22T06:29:59Z

Alright, just wanted to check.

Thanks.

cpcloud · 2013-07-22T06:33:29Z

Btw, thanks for the PR!

jreback · 2013-07-22T10:39:29Z

@qwhelan I think you need to either raise or warn if the stringified datetimes are not equal to the current values.

Otherwise it's pretty easy to chop, say, datetimes to dates (which is fine, except it should be done explicitly by resetting the index, rather than by a typo or incorrect format).

# should be in core/index.py
from datetime import time

def hastimes(self):
    # True if any element has a time component other than midnight
    return not (set(self.times) == set([time(0, 0)]))

in core/internals/DatetimeBlock
if date_format is not None:

    # values might be a series here (e.g. a column from a frame)
    # need to convert to an index to test this
    if values.hastimes():

        # test converted == values, maybe by sampling
        # or can do:   (values == pd.to_datetime(converted)).all()

        # if not == raise/warn
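The check suggested above can be illustrated outside pandas. The following is a pure-Python sketch using only the standard library, not the code that went into the PR; the helper names `has_times` and `format_is_lossy` are hypothetical. It treats a format as lossy when formatting a value and parsing it back does not return the original.

```python
from datetime import datetime, time


def has_times(values):
    """Return True if any datetime has a time of day other than midnight."""
    return any(v.time() != time(0, 0) for v in values)


def format_is_lossy(values, date_format):
    """Return True if formatting then re-parsing changes any value."""
    return any(
        datetime.strptime(v.strftime(date_format), date_format) != v
        for v in values
    )


stamps = [datetime(2010, 1, 4), datetime(2010, 1, 5, 9, 30)]

print(has_times(stamps))                             # True
print(format_is_lossy(stamps, "%Y%m%d"))             # True: 09:30 is dropped
print(format_is_lossy(stamps, "%Y-%m-%d %H:%M:%S"))  # False
```

A caller could warn or raise when `has_times(...)` and `format_is_lossy(...)` are both true, which is the situation described in the comment above.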

qwhelan · 2013-07-24T04:32:53Z

@jreback I don't see hastimes() or anything with the same functionality. I'll add unless there's another function I should be using (and probably put it in tseries/tindex.py).

jreback · 2013-07-24T04:34:37Z

yep have to add it (that's why I put it out there!)

jreback · 2013-08-23T02:21:14Z

@qwhelan can you rebase to current?

this looks pretty good otherwise

jreback · 2013-09-24T01:16:42Z

@qwhelan this somehow got lost... can you rebase to current master? We can get this in for 0.13. Thanks!

jreback · 2013-09-24T01:17:34Z

also...need to make sure this handles NaT as well (e.g. use in your sample tests)

jtratner · 2013-09-24T01:48:48Z

perf test?

qwhelan · 2013-09-24T02:22:33Z

Sorry for neglecting this. I'll make the changes later this week.

qwhelan · 2013-10-04T04:53:03Z

Added a perf test and tested/handled NaTs.

I was caught in rebase hell for the last week, but it should be ready to go unless there are additional requests.

jreback · 2013-10-04T11:32:53Z

move example to 0.13.0 (from 0.12)

jreback · 2013-10-04T12:07:49Z

pandas/core/internals.py

-        rvalues.flat[imask] = np.array(
-            [Timestamp(val)._repr_base for val in values.ravel()[imask]], dtype=object)
+
+        if self.dtype == 'datetime64[ns]':


This changed in master (the timedelta stuff was moved); please start from the current code and don't re-add this (which probably happened in a rebase).

jreback · 2013-10-07T21:27:41Z

@qwhelan almost there....just need those 2 changes!

qwhelan · 2013-10-08T05:31:06Z

@jreback Just pushed the changes. Let me know if you're looking for something different.

jreback · 2013-10-08T10:10:26Z

@qwhelan can you move the v0.12.0 announcement to v0.13.0?
Also, change the test that uses NaT to include some dates as well as NaTs.
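A test along these lines might look like the sketch below. It is not the test from the PR; the test name and the column names `id` and `when` are hypothetical, and it assumes a current pandas release where missing datetimes are written using `na_rep` (an empty string by default).

```python
import pandas as pd


def test_to_csv_date_format_with_nat():
    # Mix real dates with a missing value; None becomes NaT.
    frame = pd.DataFrame(
        {
            "id": [1, 2, 3],
            "when": pd.to_datetime(["2010-01-04", None, "2010-01-06"]),
        }
    )
    lines = frame.to_csv(index=False, date_format="%Y%m%d").splitlines()
    # The NaT row is written as an empty field, not as a formatted date.
    assert lines == ["id,when", "1,20100104", "2,", "3,20100106"]


test_to_csv_date_format_with_nat()
```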

qwhelan · 2013-10-09T04:03:12Z

@jreback Made those changes.

jreback · 2013-10-09T12:04:32Z

pandas/core/format.py

@@ -1001,7 +1017,15 @@ def _helper_csv(self, writer, na_rep=None, cols=None,
                if float_format is not None and com.is_float(val):
                    val = float_format % val
                elif isinstance(val, np.datetime64):
-                    val = lib.Timestamp(val)._repr_base


this can all be collapsed down, something like:

if date_format is None:
    date_formatter = lambda x: x._repr_base
else:
    date_formatter = lambda x: x.strftime(date_format) if notnull(x)

if isinstance(val, (np.datetime64, datetime.datetime)):
    val = date_formatter(lib.Timestamp(val))
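As a standalone illustration of the suggestion above, here is a sketch that builds the formatter once and reuses it per value. Several details are assumptions rather than what was merged: the factory name `make_date_formatter` is hypothetical, `str()` stands in for the private `_repr_base`, and null values are mapped to an `na_rep` argument because the suggestion leaves that branch open.

```python
import datetime

import numpy as np
import pandas as pd


def make_date_formatter(date_format=None, na_rep=""):
    """Return a function that turns one datetime-like value into a string."""
    if date_format is None:
        fmt = str  # assumption: default string form instead of _repr_base
    else:
        fmt = lambda ts: ts.strftime(date_format)

    def formatter(val):
        ts = pd.Timestamp(val)
        # NaT cannot be passed to strftime, so nulls get na_rep (assumption).
        return fmt(ts) if pd.notnull(ts) else na_rep

    return formatter


formatter = make_date_formatter("%m/%d/%Y")
print(formatter(np.datetime64("2010-01-04")))      # 01/04/2010
print(formatter(datetime.datetime(2010, 1, 5)))    # 01/05/2010
print(repr(formatter(pd.NaT)))                     # ''
```

Choosing the formatter before the loop keeps the per-value branch down to a single null check.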

jreback · 2013-10-11T12:34:28Z

@qwhelan can you update per my comments?

DOC: add date_format to release notes

qwhelan · 2013-10-12T05:24:03Z

@jreback All done. Thanks for the suggestions.

jreback · 2013-10-12T14:13:17Z

thanks...merged

cpcloud mentioned this pull request Aug 26, 2013: custom formatters for to_csv #4668 (closed)

ENH: Add date_format keyword to to_csv()

ce669d6

DOC: add date_format to release notes

jreback merged commit ce669d6 into pandas-dev:master Oct 12, 2013
