ENH/DOC: update pandas-gbq signature and docstring #20564
Changes from 12 commits
ea52387
b1b1479
0429227
bee847b
2e5b148
cb178d9
f09d38e
b6fdf37
37b2a08
f6c38f0
0889d6d
aa99d47
1ab0934
@@ -1116,60 +1116,90 @@ def to_dict(self, orient='dict', into=dict):
         else:
             raise ValueError("orient '%s' not understood" % orient)

-    def to_gbq(self, destination_table, project_id, chunksize=10000,
-               verbose=True, reauth=False, if_exists='fail', private_key=None):
-        """Write a DataFrame to a Google BigQuery table.
-
-        The main method a user calls to export pandas DataFrame contents to
-        Google BigQuery table.
+    def to_gbq(self, destination_table, project_id, chunksize=None,
+               verbose=None, reauth=False, if_exists='fail', private_key=None,
+               auth_local_webserver=False, table_schema=None):
+        """
+        Write a DataFrame to a Google BigQuery table.

-        Google BigQuery API Client Library v2 for Python is used.
-        Documentation is available `here
-        <https://developers.google.com/api-client-library/python/apis/bigquery/v2>`__
+        This function requires the `pandas-gbq package
+        <https://pandas-gbq.readthedocs.io>`__.

         Authentication to the Google BigQuery service is via OAuth 2.0.

-        - If "private_key" is not provided:
+        - If ``private_key`` is provided, the library loads the JSON service
+          account credentials and uses those to authenticate.

-          By default "application default credentials" are used.
+        - If no ``private_key`` is provided, the library tries `application
+          default credentials`_.

-          If default application credentials are not found or are restrictive,
-          user account credentials are used. In this case, you will be asked to
-          grant permissions for product name 'pandas GBQ'.
+          .. _application default credentials:
+              https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application

-        - If "private_key" is provided:
-
-          Service account credentials will be used to authenticate.
+        - If application default credentials are not found or cannot be used
+          with BigQuery, the library authenticates with user account
+          credentials. In this case, you will be asked to grant permissions
+          for product name 'pandas GBQ'.

         Parameters
         ----------
-        dataframe : DataFrame
-            DataFrame to be written

Review thread on the removed ``dataframe`` parameter:

- this needs to stay
- The validation script was complaining about this one. I think because this is a method of ``DataFrame``.
- oh don't worry about that too much
- I think @tswast is correct that this should be removed. The method does not take a dataframe as input (it is writing ``self``).

-        destination_table : string
-            Name of table to be written, in the form 'dataset.tablename'
+        destination_table : str
+            Name of table to be written, in the form 'dataset.tablename'.
         project_id : str
             Google BigQuery Account project ID.
-        chunksize : int (default 10000)
+        chunksize : int, optional
             Number of rows to be inserted in each chunk from the dataframe.
-        verbose : boolean (default True)
-            Show percentage complete
-        reauth : boolean (default False)
+            Set to ``None`` to load the whole dataframe at once.
+        reauth : bool, default False
             Force Google BigQuery to reauthenticate the user. This is useful
             if multiple accounts are used.
-        if_exists : {'fail', 'replace', 'append'}, default 'fail'
-            'fail': If table exists, do nothing.
-            'replace': If table exists, drop it, recreate it, and insert data.
-            'append': If table exists, insert data. Create if does not exist.
-        private_key : str (optional)
+        if_exists : str, default 'fail'
+            Behavior when the destination table exists. Value can be one of:
+
+            ``'fail'``
+                If table exists, do nothing.
+            ``'replace'``
+                If table exists, drop it, recreate it, and insert data.
+            ``'append'``
+                If table exists, insert data. Create if does not exist.
+        private_key : str, optional
             Service account private key in JSON format. Can be file path
             or string contents. This is useful for remote server
-            authentication (eg. Jupyter/IPython notebook on remote host)
-        """
+            authentication (eg. Jupyter/IPython notebook on remote host).
+        auth_local_webserver : bool, default False
+            Use the `local webserver flow`_ instead of the `console flow`_
+            when getting user credentials.
+
+            .. _local webserver flow:
+                http://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_local_server
+            .. _console flow:
+                http://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_console
+
+            *New in version 0.2.0 of pandas-gbq*.
+        table_schema : list of dicts, optional
+            List of BigQuery table fields to which according DataFrame
+            columns conform to, e.g. ``[{'name': 'col1', 'type':
+            'STRING'},...]``. If schema is not provided, it will be
+            generated according to dtypes of DataFrame columns. See
+            BigQuery API documentation on available names of a field.
+
+            *New in version 0.3.1 of pandas-gbq*.
+        verbose : boolean, deprecated
+            *Deprecated in Pandas-GBQ 0.4.0.* Use the `logging module
+            to adjust verbosity instead
+            <https://pandas-gbq.readthedocs.io/en/latest/intro.html#logging>`__.
+
+        See Also
+        --------
+        pandas_gbq.to_gbq : This function in the pandas-gbq library.
+        pandas.read_gbq : Read a DataFrame from Google BigQuery.
+        """
         from pandas.io import gbq
-        return gbq.to_gbq(self, destination_table, project_id=project_id,
-                          chunksize=chunksize, verbose=verbose, reauth=reauth,
-                          if_exists=if_exists, private_key=private_key)
+        return gbq.to_gbq(
+            self, destination_table, project_id, chunksize=chunksize,
+            verbose=verbose, reauth=reauth, if_exists=if_exists,
+            private_key=private_key, auth_local_webserver=auth_local_webserver,
+            table_schema=table_schema)

     @classmethod
     def from_records(cls, data, index=None, exclude=None, columns=None,
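For context, a minimal usage sketch of the updated ``DataFrame.to_gbq`` signature. The project ID, dataset/table name, and schema below are placeholders, and the call assumes the pandas-gbq package is installed and valid credentials are available:

    import logging
    import pandas as pd

    # Adjust the 'pandas_gbq' logger level instead of passing the
    # deprecated ``verbose`` argument.
    logging.getLogger("pandas_gbq").setLevel(logging.INFO)

    df = pd.DataFrame({"col1": ["a", "b"], "col2": [1, 2]})

    # 'my-project' and 'my_dataset.new_table' are placeholder names.
    df.to_gbq(
        "my_dataset.new_table",
        "my-project",
        if_exists="append",          # 'fail', 'replace', or 'append'
        table_schema=[               # optional; inferred from dtypes if omitted
            {"name": "col1", "type": "STRING"},
            {"name": "col2", "type": "INTEGER"},
        ],
        # private_key="path/to/service_account.json",  # for remote/server auth
    )
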
Review comment:

maybe "documentation" -> "signature and documentation" (as it is not only a doc change?)
and move it to the "other enhancements" section