ENH: SQL through SQLAlchemy - performance #6416

Closed
3 tasks
mangecoeur opened this issue Feb 20, 2014 · 8 comments
Labels
IO SQL (to_sql, read_sql, read_sql_query); Performance (Memory or execution speed performance)

Comments

@mangecoeur
Contributor

This issue tracks performance measurement and improvements for the SQLAlchemy-based SQL I/O support in pandas.

TODO

  • Get performance measurements working
  • Determine bottlenecks
  • Address bottlenecks
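
As a rough illustration of the first item, a timing harness could look like the sketch below. It is only a sketch under stated assumptions: `best_time` and `write_rows` are hypothetical names, and a plain `sqlite3` bulk insert stands in for the pandas write path so that the example runs with the standard library alone. It is not the benchmark code used in pandas.

```python
import sqlite3
import time


def best_time(fn, repeat=3):
    """Return the fastest wall-clock time (in seconds) over `repeat` calls to fn."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def write_rows(n_rows=10_000):
    """Hypothetical stand-in for a write such as df.to_sql(...).

    Bulk-inserts n_rows rows into an in-memory SQLite table and
    returns the number of rows stored.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bench (a INTEGER, b REAL)")
    rows = [(i, i * 0.5) for i in range(n_rows)]
    conn.executemany("INSERT INTO bench VALUES (?, ?)", rows)
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM bench").fetchone()[0]
    conn.close()
    return count


# Report the best of three runs to reduce noise from other processes.
elapsed = best_time(write_rows)
print(f"best of 3: {elapsed:.4f} s")
```

To time the real code path, one would pass a callable that wraps the pandas call instead, for example `lambda: df.to_sql("bench", engine)` with a DataFrame `df` and an SQLAlchemy `engine` of your own.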
@jreback jreback added this to the 0.14.0 milestone Feb 20, 2014
@mangecoeur
Contributor Author

First improvements to write performance #6417

@jreback
Contributor

jreback commented Feb 20, 2014

Ideally, could you write these as vbenches?

@cpcloud
Member

cpcloud commented Feb 20, 2014

@mangecoeur GitHub has checkboxes! They're pretty neat.

mangecoeur added a commit to mangecoeur/pandas that referenced this issue Feb 20, 2014
@jreback
Contributor

jreback commented Feb 20, 2014

Should we add this to the ecosystem page? https://github.com/yhat/pandasql

mangecoeur added a commit to mangecoeur/pandas that referenced this issue Feb 20, 2014
jreback added a commit that referenced this issue Feb 20, 2014
ENH #6416: performance improvements on write
gouthambs pushed a commit to gouthambs/pandas that referenced this issue Mar 12, 2014
gouthambs pushed a commit to gouthambs/pandas that referenced this issue Mar 12, 2014
@jreback
Contributor

jreback commented May 19, 2014

@jorisvandenbossche for 0.14.1?

maybe add some vbenches in 0.14.0 if you guys can

@jreback jreback modified the milestones: 0.14.1, 0.14.0 May 22, 2014
@balancap

balancap commented Jun 6, 2014

I am not an SQL or SQLAlchemy expert, but I have observed that any read operation issues a couple of requests to the database before the SELECT:

  • SHOW FULL TABLES
  • SHOW CREATE TABLE

I think it would be useful to be able to optionally deactivate them: from what I have measured, they can be costly if the database server is remote and has high latency.

I have not yet looked at the implementation inside pandas, but do you think this would be possible?
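
One way to check how many statements precede the SELECT is to trace them at the driver level. The sketch below is only an illustration under assumptions: it uses the standard-library `sqlite3` module as a stand-in for a remote MySQL server, and the `sqlite_master` lookup is merely an analogue of `SHOW FULL TABLES`, not the query pandas itself issues.

```python
import sqlite3

# Set up a small in-memory table to read from.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()

# Record every SQL statement that SQLite executes from here on.
statements = []
conn.set_trace_callback(statements.append)

# Metadata lookup issued before the real query (analogue of SHOW FULL TABLES).
conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()

# The query the caller actually asked for.
rows = conn.execute("SELECT x FROM t").fetchall()

# Stop tracing and show what was sent.
conn.set_trace_callback(None)
for sql in statements:
    print(sql)
```

Against a remote server, each extra statement costs at least one network round trip, so the overhead grows with the latency rather than with the amount of data read.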

@jorisvandenbossche
Member

@balancap I created a new issue about this: #7396. I suppose what I report there is the reason for the requests you report here. Do you think that could be correct?

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.15.0, 0.14.1 Jul 1, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@mroeschke
Member

We have SQL benchmarks in our ASV suite. Closing.

6 participants