DOC: df.to_sql chunksize seems to be ignored by default. #35891
Comments
FWIW I tested the above on |
Yeah, can you make a PR with your suggested change to the docs? Not sure about avoiding method="multi". It was added later, and we likely kept the default for backwards compatibility.
Thanks! Just opened #36172 to change the docs. I agree it's best to leave the default as it is for backwards compatibility.
So the code where Lines 797 to 831 in 6aa311d
From that code snippet, it doesn't appear that But it might certainly be that the effect of the |
Thanks for looking into this! If I understand correctly from this snippet, the So if Please correct me if I'm misunderstanding. |
@jorisvandenbossche I just came across this problem again today and remembered this thread. You're absolutely right that my confusion is about effect. As you point out, the Specifically, I was surprised that this snippet:
sends ~10k short TCP packets to the database, even though "By default, all rows will be written at once". I think the docs can be improved by guiding the user to also look at the What do you think about the following docs change?
take |
Location of the documentation
https://dev.pandas.io/docs/reference/api/pandas.DataFrame.to_sql.html
Documentation problem
Docs for chunksize state that "By default, all rows will be written at once". However, this seems to only be true if method="multi", even though method=None by default.
Experimentally, I notice a substantial speedup if I set method="multi" and chunksize is either unset or large, but chunksize seems to have no effect if I don't set method.
Suggested fix for documentation
Documentation for chunksize should also reference the method argument. For example, we could revise to:
Also, are there many uses where users should avoid method="multi"? If not, would it make sense to change the default?
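To make the interaction between chunksize and method concrete, here is a minimal sketch rather than anything taken from the pandas source. It assumes an in-memory SQLite database in place of a real server, and make_recorder / record_batch are hypothetical helpers written for this illustration. It relies on the documented option of passing a callable as method, which pandas invokes once per chunk, so the sketch can record how many rows each call receives.

```python
import sqlite3

import pandas as pd

# 1,000 rows, two columns; in-memory SQLite stands in for a real database server.
df = pd.DataFrame({"a": range(1000), "b": range(1000)})
conn = sqlite3.connect(":memory:")


def make_recorder(batch_sizes):
    """Build a hypothetical insertion callable that only records batch sizes."""

    def record_batch(pd_table, db_conn, keys, data_iter):
        # pandas calls this once per chunk; data_iter yields that chunk's rows.
        # Nothing is inserted here, so the target table stays empty.
        batch_sizes.append(len(list(data_iter)))

    return record_batch


# chunksize unset: the insertion method receives all rows in a single call.
default_batches = []
df.to_sql("t_default", conn, index=False, method=make_recorder(default_batches))

# chunksize=100: the insertion method is called once per 100-row chunk.
chunked_batches = []
df.to_sql("t_chunked", conn, index=False, chunksize=100,
          method=make_recorder(chunked_batches))

# Built-in method="multi": each chunk is sent as one multi-row INSERT statement.
df.to_sql("t_multi", conn, index=False, chunksize=100, method="multi")
multi_row_count = conn.execute("SELECT COUNT(*) FROM t_multi").fetchone()[0]

print(default_batches, len(chunked_batches), multi_row_count)
```

In other words, chunksize only decides how many rows are handed to the insertion method per call. How those rows then travel to the database is up to the method and, as far as I can tell, the driver, which would explain why changing chunksize alone may show no visible effect with method=None.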