read_gbq should offer an option to save to dataset #13531

Closed
iros opened this issue Jun 29, 2016 · 6 comments
Labels
Duplicate Report Duplicate issue or pull request

Comments

@iros

iros commented Jun 29, 2016

Right now, read_gbq runs a query and loads the results into a DataFrame. To also save the results to a dataset table, one then has to call to_gbq, which streams the results back to BigQuery.

The bq command-line tool accepts a destination table on the query itself, so the results are stored as part of the query job:

cat queries/sample.sql | bq query --format csv \
  --destination_table=collection.table_name > output.csv

Ideally read_gbq would have an additional option like destination_table that could store the results immediately.

@jorisvandenbossche
Member

jorisvandenbossche commented Jun 29, 2016

cc @parthea @aaront @jacobschaer

@iros
Author

iros commented Jun 30, 2016

It looks like the configuration object for the query job here can take a destinationTable parameter, so we just have to carry it through:

https://github.com/pydata/pandas/blob/f71537ab2561ab5727008095e9685966619fa7b9/pandas/io/gbq.py#L325-L331
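To make the idea concrete, here is a minimal sketch of how an optional destination could be carried into the query job configuration. It is not the code in pandas/io/gbq.py: the function name `build_query_job`, its parameters, and the "dataset.table" string convention are assumptions made for illustration. Only the `configuration.query.destinationTable` layout (`projectId`, `datasetId`, `tableId`) follows the BigQuery jobs API.

```python
def build_query_job(query, project_id, destination_table=None):
    """Build a request body for a BigQuery query job (illustrative sketch).

    `destination_table` is a hypothetical option: a "dataset.table" string.
    When omitted, the body contains only the query, as it does today.
    """
    query_config = {"query": query}

    if destination_table is not None:
        # Split "dataset.table" into its two parts; reject anything else.
        dataset_id, sep, table_id = destination_table.partition(".")
        if not sep or not dataset_id or not table_id:
            raise ValueError(
                "destination_table must look like 'dataset.table'"
            )
        # destinationTable is the field the query job configuration accepts.
        query_config["destinationTable"] = {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        }

    return {"configuration": {"query": query_config}}


# Usage: the same query, with and without a destination table.
plain_job = build_query_job("SELECT 1", "my-project")
stored_job = build_query_job(
    "SELECT 1", "my-project", destination_table="collection.table_name"
)
```

Keeping the option as a plain keyword argument that defaults to `None` means existing callers would see no change in the job they submit.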

@jreback
Contributor

jreback commented Jul 1, 2016

xref #10474. Not really sure we should support this.

@parthea
Contributor

parthea commented Jul 7, 2016

I'm ok with closing this feature request. I'll keep PR #11209 in my back pocket in case this gets raised again.

Looking at #11209 again, it would be cleaner to create a new API function, gbq.query_to_table(), rather than redefine the existing read_gbq(), which returns a DataFrame.

@parthea
Contributor

parthea commented Sep 11, 2016

@jreback Can we close this as a duplicate of #10474, and continue the discussion there?

@jreback
Contributor

jreback commented Sep 11, 2016

sure

@jreback jreback closed this as completed Sep 11, 2016
@jreback jreback added the Duplicate Report Duplicate issue or pull request label Sep 11, 2016

4 participants