
ENH: sql support for Timestamp (GH7103) #8205


Conversation

jorisvandenbossche
Member

Superseded by #8208 (converting to object dtype also converts the datetime64 values to datetime.datetime, so it is no longer needed that sqlalchemy can work with pandas' Timestamp type).


Closes #7103, #7936

This adds a pandas.io.sql.Timestamp class to handle Timestamps in to_sql. It basically converts each Timestamp to a datetime.datetime before writing to the database.

The problem is that this makes it slower (I tested it with sqlite; for a dataframe with only a datetime64 column it is about 20% slower), while it already worked before for some drivers (like psycopg2 and MySQLdb).
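The conversion described above can be sketched as a small helper (illustrative only; `to_db_datetime` is a hypothetical name for this sketch, not the PR's actual `pandas.io.sql.Timestamp` implementation):

```python
import datetime

import pandas as pd


def to_db_datetime(value):
    """Convert a pandas Timestamp to a plain datetime.datetime
    before handing it to a DB-API driver; pass other values through."""
    if isinstance(value, pd.Timestamp):
        return value.to_pydatetime()
    return value


ts = pd.Timestamp("2014-09-07 12:00:00")
converted = to_db_datetime(ts)
```

Drivers such as psycopg2 and MySQLdb already accept `Timestamp` directly (it subclasses `datetime.datetime`), which is why this per-value conversion shows up as overhead for them.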

@jorisvandenbossche jorisvandenbossche added the IO SQL to_sql, read_sql, read_sql_query label Sep 7, 2014
@jorisvandenbossche jorisvandenbossche added this to the 0.15.0 milestone Sep 7, 2014
@jreback
Contributor

jreback commented Sep 7, 2014

so just do this for certain database types then (will prob be similar for timedelta)

@jorisvandenbossche
Copy link
Member Author

It depends on the driver, not the database type. We could indeed do that, but it feels a bit clumsy. Also, I tested it with psycopg2/pymysql/MySQLdb/mysql.connector, but there are a lot more drivers for which I don't know the behaviour (and also not for different versions of those drivers).

While testing the fix for NaN values, I noticed that doing df.astype(object) (needed to get in None values, and not NaN) also converts Timestamp/datetime64 to datetime.datetime. Is this the expected behaviour? I would have expected individual Timestamp objects.
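The cast in question can be reproduced in a few lines (a minimal sketch; note that whether the resulting object column holds `datetime.datetime` or `Timestamp` values has varied across pandas versions, so only the subclass relationship is checked here):

```python
import datetime

import pandas as pd

df = pd.DataFrame({"t": pd.to_datetime(["2014-09-07", "2014-09-08"])})
obj = df.astype(object)  # object-dtype copy, allowing None for missing values
val = obj["t"].iloc[0]

# Whether this is a Timestamp or a plain datetime.datetime depends on the
# pandas version; either way it is a datetime.datetime instance, since
# Timestamp subclasses datetime.datetime.
assert isinstance(val, datetime.datetime)
```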

@jreback
Contributor

jreback commented Sep 7, 2014

@jorisvandenbossche that is expected. The object dtype is an ndarray of datetime objects (I guess for compat reasons).

You might want to preconvert any datetimes/timedeltas at the start (iow, separate the frame into various 'blocks'), which you then iterate over all together. Don't try to concat them (or they will be re-coerced).
And to be honest, you can simply do this for types that need NaN -> None as well, e.g. drop down to numpy object arrays (or rec-arrays). It might be a bit of work at first, but then you can easily do what you need quickly. I actually do this for PyTables, see here (and the next method, where I create a structured/rec array), and fill with already coerced values (e.g. datetime64[ns] values have already been tz-converted and are now int64, strings are already an appropriate dtype, etc.)
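The NaN -> None replacement described above can be sketched on a plain numpy object array (illustrative only; this is not the PR's actual code, just the general technique of dropping to object dtype and masking out missing values before insertion):

```python
import numpy as np

import pandas as pd

df = pd.DataFrame({"x": [1.5, np.nan, 3.0]})

# Drop down to an object ndarray, then replace missing floats with None,
# which DB-API drivers translate to SQL NULL.
arr = df["x"].values.astype(object)
arr[pd.isnull(df["x"].values)] = None
```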

@jorisvandenbossche
Member Author

Superseded by #8208

Successfully merging this pull request may close these issues.

ENH: full SQL support for datetime64 values