-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Store in SQL using double precision #9041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Store in SQL using double precision #9041
Conversation
We should indeed be safe by default, so opt for what you did now. |
Unless I'm missing something, it's actually surprisingly tricky to differentiate between float32 / float64 in a robust manner. We could check |
I wouldn't worry about the So I would just check the dtype, when there are |
901c605
to
bfc1ee2
Compare
@jorisvandenbossche OK, added checks for float32 and int32 |
0503295
to
a8ba2f2
Compare
Also made a couple of small changes that I though were logical while playing w. this --- passing through dtype argument in to_sql and get_schema, and changed the default floating MYSQL type to be DOUBLE (not SINGLE). |
is_type_class = isinstance(my_type, type) and \ | ||
issubclass(my_type, types.TypeEngine) | ||
is_type_obj = isinstance(my_type, types.TypeEngine) | ||
if not is_type_class and not is_type_obj: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made your suggested changes. I copied and pasted your fix from #9138 for now --- I can remove and rebase once you have merged. Also, I can get the precision to go up to 14 digits. However, to get this I needed to insert some special code into |
1727541
to
98cb55d
Compare
dtype = sqlalchemy.Float if self.mode == 'sqlalchemy' else 'REAL' | ||
create_sql = sql.get_schema(self.test_frame1, 'test', 'sqlite', | ||
con=self.conn, dtype={'D':dtype}) | ||
self.assertTrue('CREATE' in create_sql) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just add here a check that the Float/Real is in the schema
type = 'FLOAT' if self.mode == 'sqlalchemy' else 'REAL'
self.assertTrue(type in create_sql)
but maybe make a custom df with only ints, so we are sure the float is there because of the dtype
argument and not because it was already there (because now, the schema will not be different with or without using the dtype argument)
@artemyk regarding the precision issue (because of the decimal use). I think that looks OK to do (as we should ensure the full precision is available), but I was wondering if this has performance implications? |
@artemyk I merged the other PR, so you can rebase this one |
98cb55d
to
9503c79
Compare
@jorisvandenbossche OK, rebased / upgraded test. |
0907c07
to
8836ab2
Compare
OK, do you know the reason sqlalchemy does this by default? (just wondering) |
|
||
# Avoid casting double-precision floats into decimals | ||
from sqlalchemy import Numeric | ||
for column in tbl.columns.values(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small 'style' comment: the .values
is not needed here, you can just iterate through the tbl.columns
8836ab2
to
1e36981
Compare
Good point about .values .
|
@artemyk I justed tested it with postgresql/psycopg2, and it seems that there, |
@jorisvandenbossche I'm now wondering --- do you think its possible there is a DECIMAL-type datatype in , say, postgres, that can store arbitrary precision decimals? And that the code we have here would force conversion to Float, thereby loosing precision? |
@jorisvandenbossche Can we merge? |
@artemyk Yes! (sorry for dropping the ball here) Can you just rebase? |
65f0253
to
e5b7cb7
Compare
Better test Style fix Whats new fix
e5b7cb7
to
bb0c2e8
Compare
@jorisvandenbossche Rebased... |
ENH: Store in SQL using double precision
@artemyk Thanks for the work on this, and really for letting this take so long! |
Closes #9009 . @jorisvandenbossche -- do you think this should test/use single precision ? We could test for float32 perhaps? My own feeling is that this is the safer default.