-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: dtype costumization on to_sql (GH8778) #8926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@tiagoantao I think in general this is a good approach. Further needs:
|
@@ -922,7 +922,7 @@ def to_msgpack(self, path_or_buf=None, **kwargs): | |||
return packers.to_msgpack(path_or_buf, self, **kwargs) | |||
|
|||
def to_sql(self, name, con, flavor='sqlite', schema=None, if_exists='fail', | |||
index=True, index_label=None, chunksize=None): | |||
index=True, index_label=None, chunksize=None, dtypes={}): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a default of None would be better
We should also check that the user provides a SQLAlchemy type and not something else for now I think |
@mangecoeur @hayd @artemyk OK with this addition? |
@@ -954,12 +954,15 @@ def to_sql(self, name, con, flavor='sqlite', schema=None, if_exists='fail', | |||
chunksize : int, default None | |||
If not None, then rows will be written in batches of this size at a | |||
time. If None, all rows will be written at once. | |||
dtypes: optional datatypes for SQL columns (dictionary with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there should be a space before the colon (sphinx formatting is rather strict)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the same at all other places
@tiagoantao @jorisvandenbossche I realize this is late in the game, but just a short thought --- this whole situation ( |
@artemyk That is maybe a worthwile thing to look at for that specific issue, but apart from that I think the functionality in this PR is also useful for other things? (so maybe just not say it closes that specific nan issue). |
continuing the discussion on the approach proposed by @artemyk in the issue: #8778 (comment) |
@jorisvandenbossche That makes sense how |
I am taking care of implemeting this on SQLiteDatabase |
I hope this sorts out the issue with SQLite |
for col, my_type in dtypes.items(): | ||
if not issubclass(my_type, type_api.TypeEngine): | ||
raise ValueError('The type of %s is not a SQLAlchemy ' | ||
'type' % col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this check in SQLTable
On the name of the argument, I was thinking it should be But is the name OK? Because it aren't exactly 'dtype's you specify, but sql types? (@jreback) |
@@ -954,12 +954,15 @@ def to_sql(self, name, con, flavor='sqlite', schema=None, if_exists='fail', | |||
chunksize : int, default None | |||
If not None, then rows will be written in batches of this size at a | |||
time. If None, all rows will be written at once. | |||
dtypes : optional datatypes for SQL columns (dictionary with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot to adapt the docstring here I think
for col, my_type in dtypes.items(): | ||
if my_type not in _SQL_TYPES.keys(): | ||
raise ValueError('%s (%s) not a SQLite Type' % | ||
(col, my_type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should do this check here, as actually SQLite is very liberal in what it accepts as a type name (actually everything). So maybe we should just check that is has to be a string?
@jorisvandenbossche I think |
OK, Tiago, can you:
and then I think it should be ready to merge! |
I have done my first rebase (quite ridiculous, but I have never done it in the past), so I am not sure if this was correctly done, could you please have a look and advise? All the rest is, hopefully, done. |
@tiagoantao Not fully I think. You have a squashed commit here, but also still the old ones. Normally simply doing following steps should resolve it:
If you have problems with it, just ask! |
I should have forced the push. I hope this will work now. Thanks for all your help and patience with this. |
Yep, that is better! |
@@ -857,7 +860,7 @@ def _harmonize_columns(self, parse_dates=None): | |||
col_type = self._numpy_type(sql_col.type) | |||
|
|||
if col_type is datetime or col_type is date: | |||
if not issubclass(df_col.dtype.type, np.datetime64): | |||
if not isinstance(df_col.dtype.type, np.datetime64): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for this change? (because if so, better to do in a separate commit + a test for the case this change was needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was in auto-mode looking for this on my patch. This was completely
unintended. Sorry. Will correct.
testing dtypes parameter dtypes defaults to None dtype type checking and docstrings dtype exception checking sphinx dtypes corrections if/else to or simplification informative exception of errouneous SQLAlchemy subclassing type checking basic documentation of the dtypes feature issue number correct test position issue correction SQLite dtype configuration Testing Legagy SQLite with dtype configuration changed the position of a dtype check assert_raise assert_raise return user specified dtype, not SQL_TYPE test cleanup better docstrings better docstrings docs and test refactoring Do not test on MySQL legacy dtypes->dtype dtypes->dtype assert->assertTrue Type test in mysql correct mysql test type reverting unintended change
ENH: dtype costumization on to_sql (GH8778)
@tiagoantao Thanks a lot for this nice PR! |
This is the proposed general gist of the changes. My ad-hoc testing suggests that this might work. If this is an acceptible design, I will proceed to make the formal test and change the docs (including docstrings)
Closes #8778