ENH #6416: performance improvements on write #6420
Conversation
mangecoeur
commented
Feb 20, 2014
- trade-off: higher memory use for faster writes. This replaces the earlier PR, where the history was a mess!
```python
data = dict((k, self.maybe_asscalar(v))
            for k, v in t[1].iteritems())
data_list.append(data)
# self.pd_sql.execute(ins, **data)
```
can be removed?
@mangecoeur instead of using iterrows... just do the loop directly; it will be much faster, as you are not creating a Series (which you then decompose) each time, e.g. just do:
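A minimal sketch of what "do the loop directly" could look like (my reading, not jreback's actual snippet; the demo frame is made up):

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2, 3], "b": [1.5, 2.5, 3.5]})  # demo data

# iterate the underlying ndarray directly: no per-row Series is built
# (note: frame.values is a single ndarray, so mixed columns are upcast
# to one common dtype, the "munging" mentioned below)
keys = list(frame.columns)
data_list = [dict(zip(keys, row)) for row in frame.values]
```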
this munges dtypes btw... e.g. everything gets put into a single dtype. I don't think this matters though? You might want to consider doing this by dtype, e.g. use df.as_blocks (you then select the correct column out of each block). The advantage is that they are already dtype-separated.
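For context, as_blocks() split a frame into dtype-homogeneous pieces; it has since been removed from pandas. A rough equivalent using select_dtypes (a sketch with made-up demo data, not code from this PR):

```python
import pandas as pd

frame = pd.DataFrame({
    "a": [1, 2],
    "b": [1.5, 2.5],
    "t": pd.to_datetime(["2014-01-01", "2014-01-02"]),
})

# group columns by dtype so each chunk stays homogeneous, mirroring
# what as_blocks() provided; no cross-dtype upcasting within a chunk
for dtype in frame.dtypes.unique():
    chunk = frame.select_dtypes(include=[dtype])
    print(dtype, list(chunk.columns))
```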
@jreback ok, I see what you mean. Will look into it.
@mangecoeur this latter approach is used in […] etc.
@jreback I think we can keep it simple: since the column types are defined on the SQLAlchemy side by the DB table, SQLAlchemy already deals with converting Python values to SQL types; all we need to do is supply a list of row dicts. Maybe we can optimize this more later.
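For illustration, the "list of row dicts" contract in plain SQLAlchemy Core might look like the sketch below; the engine URL and table are made up:

```python
from sqlalchemy import (Column, Float, Integer, MetaData, Table,
                        create_engine)

engine = create_engine("sqlite:///:memory:")   # throwaway demo engine
metadata = MetaData()
demo = Table("demo", metadata,
             Column("a", Integer),
             Column("b", Float))
metadata.create_all(engine)

rows = [{"a": 1, "b": 1.5}, {"a": 2, "b": 2.5}]

with engine.begin() as conn:
    # passing a list of dicts triggers DBAPI executemany: one round
    # trip for the whole batch instead of one execute call per row
    conn.execute(demo.insert(), rows)
```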
sure... just some thoughts... always profile, of course!
@mangecoeur Maybe you can add a vbench to see the effect of this PR?
@jorisvandenbossche haven't got vbench working yet; it was just from daily use that I found it was much faster this way :P Will get round to it soonish.
@jreback had a look at that method; it turns out that if you iterate over the raw values you get problems with datetime conversion, which using iterrows avoids.
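A small repro of the datetime issue being described (my own sketch, not code from the PR):

```python
import pandas as pd

frame = pd.DataFrame({"t": pd.to_datetime(["2014-01-01", "2014-01-02"])})

# iterating the raw ndarray yields numpy.datetime64 scalars, which DB
# drivers generally cannot adapt to SQL timestamps...
print(type(frame.values[0, 0]))      # numpy.datetime64

# ...whereas iterrows boxes each value into a pandas Timestamp (a
# datetime.datetime subclass), which converts cleanly
for _, row in frame.iterrows():
    print(type(row["t"]))            # pandas Timestamp
```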
@mangecoeur yeh... as I said, the dtypes get munged. So, ready to merge then?
@jreback ok, just cleaned up based on the comments; should be good to go once Travis has done its thing.
@mangecoeur thanks!