API/ENH: stata export should better support np.datetime64 based times #4558

Closed
wuan opened this issue Aug 13, 2013 · 9 comments · Fixed by #6622
Labels
API Design; Dtype Conversions (unexpected or buggy dtype conversions); IO Data (IO issues that don't fit into a more specific label)
Comments

@wuan
Contributor

wuan commented Aug 13, 2013

The following code, which exports a DataFrame whose time-series index has np.datetime64 dtype,

import numpy as np
import pandas as pd

timestamps = pd.date_range(np.datetime64('2013-01-01 11:00:00.123456'), periods=5, freq='123U')
dataframe = pd.DataFrame(np.random.randn(len(timestamps)), index=timestamps)
dataframe.to_stata('out')

throws a

ValueError: Data type datetime64[ns] not currently understood. Please report an error to the developers.

It would be very nice if np.datetime64 (and thus pd.Timestamp as well) were supported here.

@jreback
Contributor

jreback commented Aug 13, 2013

Here's the example from the tests.
I think you have to specify a datetime format as an argument when writing
(and this is the only test, so maybe there are some bugs).

In [44]: original = DataFrame(data=[["string", "object", 1, 1.1,np.datetime64('2003-12-25')]],columns=['string', 'object', 'integer', 'float','datetime'])

In [45]: original["object"] = Series(original["object"], dtype=object)

In [46]: original.index.name = 'index'

In [47]: original
Out[47]: 
       string  object  integer  float            datetime
index                                                    
0      string  object        1    1.1 2003-12-25 00:00:00

In [48]: original.dtypes
Out[48]: 
string              object
object              object
integer              int64
float              float64
datetime    datetime64[ns]
dtype: object
In [50]: original.to_stata('out',{'datetime' : 'tc'})
In [52]: pd.read_stata('out')
Out[52]: 
   index  string  object  integer  float            datetime
0      0  string  object        1    1.1 2003-12-25 00:00:00

In [53]: pd.read_stata('out').dtypes
Out[53]: 
index                int64
string              object
object              object
integer              int64
float              float64
datetime    datetime64[ns]
dtype: object

@wuan
Contributor Author

wuan commented Aug 13, 2013

Thanks for the example. The difference here seems to be that the datetime64 data is held in a data column rather than in the table index.

Unfortunately, this

dataframe.to_stata('out', {'index':'tc'})

does not help in my example.
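One possible workaround, sketched here as an assumption rather than something proposed in this thread, is to move the datetime index into an ordinary column with `reset_index()` and convert that column, as in the test example above. The names `timestamp`, `value`, and `out.dta` are made up for illustration, and the behaviour may differ between pandas versions.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data: a datetime64 index, as in the original report.
timestamps = pd.date_range("2013-01-01 11:00:00", periods=5, freq="s")
frame = pd.DataFrame({"value": np.arange(5, dtype="float64")}, index=timestamps)
frame.index.name = "timestamp"

# Turn the index into a regular data column so it is handled like the
# 'datetime' column in the test example.
flat = frame.reset_index()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.dta")
    # 'tc' is the Stata datetime format used in the example above.
    flat.to_stata(path, convert_dates={"timestamp": "tc"}, write_index=False)
    restored = pd.read_stata(path)

print(restored.dtypes)
```

Whole-second timestamps are used because the example is only meant to show the column round trip, not to probe sub-millisecond precision.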

@jreback
Contributor

jreback commented Aug 13, 2013

yep I agree....does this cause a failing test for you?

@wuan
Contributor Author

wuan commented Aug 13, 2013

No, I just found this behaviour while looking to reproduce another problem.

@jreback
Contributor

jreback commented Aug 13, 2013

ok...great

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Feb 18, 2014
@bashtage
Contributor

This occurs because the column has no name, so it is named 0 by default (the integer, not the string '0'), which causes issues with the way the datetime conversion works.

This works correctly:

import numpy as np
import pandas as pd

# Same data as in the report, but with an explicit string column name.
timestamps = pd.date_range(np.datetime64('2013-01-01 11:00:00.123456'),
                           periods=5, freq='123U')
dataframe = pd.DataFrame(np.random.randn(len(timestamps)), index=timestamps,
                         columns=['something'])
dataframe.to_stata('out', convert_dates={'index': 'tc'})
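To see the integer label for yourself, here is a minimal sketch that only inspects the column labels and does not call `to_stata`. The replacement name `value` is a made-up placeholder, and renaming before export is just one way to sidestep the problem, not necessarily what the eventual fix does.

```python
import numpy as np
import pandas as pd

timestamps = pd.date_range("2013-01-01 11:00:00.123456", periods=5, freq="123us")

# No `columns=` argument: pandas labels the single column with the integer 0.
unnamed = pd.DataFrame(np.random.randn(len(timestamps)), index=timestamps)
labels = list(unnamed.columns)
label_is_integer = isinstance(labels[0], (int, np.integer))
print(labels, label_is_integer)

# Give the column a string name before exporting (hypothetical name).
named = unnamed.rename(columns={0: "value"})
print(list(named.columns))
```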

@jreback
Contributor

jreback commented Mar 12, 2014

is it a problem to have numeric column labels in general?

@bashtage
Contributor

Dunno, working through this now - I think the fix will be simple and won't cause an issue.

@bashtage
Contributor

PR #6622

No test yet, but too late over here to write a sensible test tonight.

3 participants