Commit 2431641

tswast authored and jorisvandenbossche committed
ENH/DOC: update pandas-gbq signature and docstring (#20564)
Delegates more of the behavior and documentation for `to_gbq` and `read_gbq` methods to the `pandas-gbq` library. This duplicate documentation was getting out of sync.
1 parent eb168b7 commit 2431641
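The delegation the commit message describes can be sketched in miniature. This is an illustrative stand-in, not pandas source: `_make_try_import` and the fake `pandas_gbq` namespace are invented for the sketch; only the wrapper's pass-through shape mirrors the commit, which keeps a thin signature in pandas and defers behavior (and the single authoritative docstring) to the optional pandas-gbq package.

```python
import types


def _make_try_import(module):
    """Return a loader that yields `module` or raises a helpful ImportError."""
    def _try_import():
        if module is None:
            raise ImportError("pandas-gbq is required: pip install pandas-gbq")
        return module
    return _try_import


# Simulated pandas-gbq backend (stand-in; the real library does the I/O).
fake_gbq = types.SimpleNamespace(
    to_gbq=lambda df, table, project_id, **kw: ("wrote", table, sorted(kw)),
)

_try_import = _make_try_import(fake_gbq)


def to_gbq(dataframe, destination_table, project_id, chunksize=None,
           verbose=None, reauth=False, if_exists='fail', private_key=None,
           auth_local_webserver=False, table_schema=None):
    # Thin wrapper: every argument passes straight through to the backend,
    # so one signature/docstring can live in the pandas-gbq library.
    pandas_gbq = _try_import()
    return pandas_gbq.to_gbq(
        dataframe, destination_table, project_id, chunksize=chunksize,
        verbose=verbose, reauth=reauth, if_exists=if_exists,
        private_key=private_key, auth_local_webserver=auth_local_webserver,
        table_schema=table_schema)
```

Because the wrapper forwards everything, new pandas-gbq keywords only require widening the pass-through, which is exactly what this commit does for `auth_local_webserver` and `table_schema`.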

File tree

4 files changed: +112 -67 lines changed


doc/source/conf.py

+1

```diff
@@ -350,6 +350,7 @@
 intersphinx_mapping = {
     'statsmodels': ('http://www.statsmodels.org/devel/', None),
     'matplotlib': ('http://matplotlib.org/', None),
+    'pandas-gbq': ('https://pandas-gbq.readthedocs.io/en/latest/', None),
     'python': ('https://docs.python.org/3/', None),
     'numpy': ('https://docs.scipy.org/doc/numpy/', None),
     'scipy': ('https://docs.scipy.org/doc/scipy/reference/', None),
```
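With this entry in `intersphinx_mapping`, Sphinx resolves cross-reference roles in the pandas docs against the pandas-gbq site's `objects.inv` inventory. A minimal sketch of such a role (illustrative, not taken from the commit):

```rst
See :func:`pandas_gbq.to_gbq` for the full argument reference.
```

This is what lets the new ``See Also`` entries in the docstrings below render as live links into the pandas-gbq manual.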

doc/source/whatsnew/v0.23.0.txt

+4 -1

```diff
@@ -404,7 +404,10 @@ Other Enhancements
 - :func:`read_html` now accepts a ``displayed_only`` keyword argument to controls whether or not hidden elements are parsed (``True`` by default) (:issue:`20027`)
 - zip compression is supported via ``compression=zip`` in :func:`DataFrame.to_pickle`, :func:`Series.to_pickle`, :func:`DataFrame.to_csv`, :func:`Series.to_csv`, :func:`DataFrame.to_json`, :func:`Series.to_json`. (:issue:`17778`)
 - :class:`DataFrame` and :class:`Series` now support matrix multiplication (```@```) operator (:issue:`10259`) for Python>=3.5
-
+- Updated ``to_gbq`` and ``read_gbq`` signature and documentation to reflect changes from
+  the Pandas-GBQ library version 0.4.0. Adds intersphinx mapping to Pandas-GBQ
+  library. (:issue:`20564`)
+
 .. _whatsnew_0230.api_breaking:
 
 Backwards incompatible API changes
```

pandas/core/frame.py

+65 -35

```diff
@@ -1100,60 +1100,90 @@ def to_dict(self, orient='dict', into=dict):
         else:
             raise ValueError("orient '%s' not understood" % orient)
 
-    def to_gbq(self, destination_table, project_id, chunksize=10000,
-               verbose=True, reauth=False, if_exists='fail', private_key=None):
-        """Write a DataFrame to a Google BigQuery table.
-
-        The main method a user calls to export pandas DataFrame contents to
-        Google BigQuery table.
+    def to_gbq(self, destination_table, project_id, chunksize=None,
+               verbose=None, reauth=False, if_exists='fail', private_key=None,
+               auth_local_webserver=False, table_schema=None):
+        """
+        Write a DataFrame to a Google BigQuery table.
 
-        Google BigQuery API Client Library v2 for Python is used.
-        Documentation is available `here
-        <https://developers.google.com/api-client-library/python/apis/bigquery/v2>`__
+        This function requires the `pandas-gbq package
+        <https://pandas-gbq.readthedocs.io>`__.
 
         Authentication to the Google BigQuery service is via OAuth 2.0.
 
-        - If "private_key" is not provided:
+        - If ``private_key`` is provided, the library loads the JSON service
+          account credentials and uses those to authenticate.
 
-          By default "application default credentials" are used.
+        - If no ``private_key`` is provided, the library tries `application
+          default credentials`_.
 
-          If default application credentials are not found or are restrictive,
-          user account credentials are used. In this case, you will be asked to
-          grant permissions for product name 'pandas GBQ'.
+          .. _application default credentials:
+              https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application
 
-        - If "private_key" is provided:
-
-          Service account credentials will be used to authenticate.
+        - If application default credentials are not found or cannot be used
+          with BigQuery, the library authenticates with user account
+          credentials. In this case, you will be asked to grant permissions
+          for product name 'pandas GBQ'.
 
         Parameters
         ----------
-        dataframe : DataFrame
-            DataFrame to be written
-        destination_table : string
-            Name of table to be written, in the form 'dataset.tablename'
+        destination_table : str
+            Name of table to be written, in the form 'dataset.tablename'.
         project_id : str
             Google BigQuery Account project ID.
-        chunksize : int (default 10000)
+        chunksize : int, optional
             Number of rows to be inserted in each chunk from the dataframe.
-        verbose : boolean (default True)
-            Show percentage complete
-        reauth : boolean (default False)
+            Set to ``None`` to load the whole dataframe at once.
+        reauth : bool, default False
             Force Google BigQuery to reauthenticate the user. This is useful
             if multiple accounts are used.
-        if_exists : {'fail', 'replace', 'append'}, default 'fail'
-            'fail': If table exists, do nothing.
-            'replace': If table exists, drop it, recreate it, and insert data.
-            'append': If table exists, insert data. Create if does not exist.
-        private_key : str (optional)
+        if_exists : str, default 'fail'
+            Behavior when the destination table exists. Value can be one of:
+
+            ``'fail'``
+                If table exists, do nothing.
+            ``'replace'``
+                If table exists, drop it, recreate it, and insert data.
+            ``'append'``
+                If table exists, insert data. Create if does not exist.
+        private_key : str, optional
             Service account private key in JSON format. Can be file path
             or string contents. This is useful for remote server
-            authentication (eg. Jupyter/IPython notebook on remote host)
-        """
+            authentication (eg. Jupyter/IPython notebook on remote host).
+        auth_local_webserver : bool, default False
+            Use the `local webserver flow`_ instead of the `console flow`_
+            when getting user credentials.
+
+            .. _local webserver flow:
+                http://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_local_server
+            .. _console flow:
+                http://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_console
+
+            *New in version 0.2.0 of pandas-gbq*.
+        table_schema : list of dicts, optional
+            List of BigQuery table fields to which according DataFrame
+            columns conform to, e.g. ``[{'name': 'col1', 'type':
+            'STRING'},...]``. If schema is not provided, it will be
+            generated according to dtypes of DataFrame columns. See
+            BigQuery API documentation on available names of a field.
+
+            *New in version 0.3.1 of pandas-gbq*.
+        verbose : boolean, deprecated
+            *Deprecated in Pandas-GBQ 0.4.0.* Use the `logging module
+            to adjust verbosity instead
+            <https://pandas-gbq.readthedocs.io/en/latest/intro.html#logging>`__.
 
+        See Also
+        --------
+        pandas_gbq.to_gbq : This function in the pandas-gbq library.
+        pandas.read_gbq : Read a DataFrame from Google BigQuery.
+        """
         from pandas.io import gbq
-        return gbq.to_gbq(self, destination_table, project_id=project_id,
-                          chunksize=chunksize, verbose=verbose, reauth=reauth,
-                          if_exists=if_exists, private_key=private_key)
+        return gbq.to_gbq(
+            self, destination_table, project_id, chunksize=chunksize,
+            verbose=verbose, reauth=reauth, if_exists=if_exists,
+            private_key=private_key, auth_local_webserver=auth_local_webserver,
+            table_schema=table_schema)
 
     @classmethod
     def from_records(cls, data, index=None, exclude=None, columns=None,
```
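The new ``table_schema`` argument takes a list of BigQuery field dicts. As a hedged illustration of the shape it expects (the helper below is invented here; when the argument is omitted, pandas-gbq generates the schema itself from the DataFrame's real dtypes):

```python
# Illustrative only: map dtype *strings* to BigQuery field types the way a
# caller might when building an explicit table_schema for to_gbq.
_BQ_TYPES = {
    "int64": "INTEGER",
    "float64": "FLOAT",
    "bool": "BOOLEAN",
    "datetime64[ns]": "TIMESTAMP",
    "object": "STRING",  # fallback for text-like columns
}


def infer_table_schema(dtypes):
    """Build a to_gbq-style table_schema from (column, dtype-string) pairs."""
    return [{"name": name, "type": _BQ_TYPES.get(str(dt), "STRING")}
            for name, dt in dtypes]


schema = infer_table_schema([("col1", "object"), ("n", "int64")])
# schema == [{'name': 'col1', 'type': 'STRING'}, {'name': 'n', 'type': 'INTEGER'}]
# then: df.to_gbq('dataset.tablename', 'my-project', table_schema=schema)
```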

pandas/io/gbq.py

+42 -31

```diff
@@ -22,12 +22,10 @@ def _try_import():
 
 
 def read_gbq(query, project_id=None, index_col=None, col_order=None,
-             reauth=False, verbose=True, private_key=None, dialect='legacy',
+             reauth=False, verbose=None, private_key=None, dialect='legacy',
              **kwargs):
-    r"""Load data from Google BigQuery.
-
-    The main method a user calls to execute a Query in Google BigQuery
-    and read results into a pandas DataFrame.
+    """
+    Load data from Google BigQuery.
 
     This function requires the `pandas-gbq package
     <https://pandas-gbq.readthedocs.io>`__.
@@ -49,32 +47,39 @@ def read_gbq(query, project_id=None, index_col=None, col_order=None,
     Parameters
     ----------
     query : str
-        SQL-Like Query to return data values
+        SQL-Like Query to return data values.
     project_id : str
         Google BigQuery Account project ID.
-    index_col : str (optional)
-        Name of result column to use for index in results DataFrame
-    col_order : list(str) (optional)
+    index_col : str, optional
+        Name of result column to use for index in results DataFrame.
+    col_order : list(str), optional
         List of BigQuery column names in the desired order for results
-        DataFrame
-    reauth : boolean (default False)
+        DataFrame.
+    reauth : boolean, default False
         Force Google BigQuery to reauthenticate the user. This is useful
         if multiple accounts are used.
-    verbose : boolean (default True)
-        Verbose output
-    private_key : str (optional)
+    private_key : str, optional
         Service account private key in JSON format. Can be file path
         or string contents. This is useful for remote server
-        authentication (eg. Jupyter/IPython notebook on remote host)
-
-    dialect : {'legacy', 'standard'}, default 'legacy'
-        'legacy' : Use BigQuery's legacy SQL dialect.
-        'standard' : Use BigQuery's standard SQL, which is
-        compliant with the SQL 2011 standard. For more information
-        see `BigQuery SQL Reference
-        <https://cloud.google.com/bigquery/sql-reference/>`__
-
-    `**kwargs` : Arbitrary keyword arguments
+        authentication (eg. Jupyter/IPython notebook on remote host).
+    dialect : str, default 'legacy'
+        SQL syntax dialect to use. Value can be one of:
+
+        ``'legacy'``
+            Use BigQuery's legacy SQL dialect. For more information see
+            `BigQuery Legacy SQL Reference
+            <https://cloud.google.com/bigquery/docs/reference/legacy-sql>`__.
+        ``'standard'``
+            Use BigQuery's standard SQL, which is
+            compliant with the SQL 2011 standard. For more information
+            see `BigQuery Standard SQL Reference
+            <https://cloud.google.com/bigquery/docs/reference/standard-sql/>`__.
+    verbose : boolean, deprecated
+        *Deprecated in Pandas-GBQ 0.4.0.* Use the `logging module
+        to adjust verbosity instead
+        <https://pandas-gbq.readthedocs.io/en/latest/intro.html#logging>`__.
+    kwargs : dict
+        Arbitrary keyword arguments.
         configuration (dict): query config parameters for job processing.
         For example:
 
@@ -86,8 +91,12 @@ def read_gbq(query, project_id=None, index_col=None, col_order=None,
     Returns
     -------
     df: DataFrame
-        DataFrame representing results of query
+        DataFrame representing results of query.
 
+    See Also
+    --------
+    pandas_gbq.read_gbq : This function in the pandas-gbq library.
+    pandas.DataFrame.to_gbq : Write a DataFrame to Google BigQuery.
     """
     pandas_gbq = _try_import()
     return pandas_gbq.read_gbq(
@@ -99,10 +108,12 @@ def read_gbq(query, project_id=None, index_col=None, col_order=None,
         **kwargs)
 
 
-def to_gbq(dataframe, destination_table, project_id, chunksize=10000,
-           verbose=True, reauth=False, if_exists='fail', private_key=None):
+def to_gbq(dataframe, destination_table, project_id, chunksize=None,
+           verbose=None, reauth=False, if_exists='fail', private_key=None,
+           auth_local_webserver=False, table_schema=None):
     pandas_gbq = _try_import()
-    pandas_gbq.to_gbq(dataframe, destination_table, project_id,
-                      chunksize=chunksize,
-                      verbose=verbose, reauth=reauth,
-                      if_exists=if_exists, private_key=private_key)
+    return pandas_gbq.to_gbq(
+        dataframe, destination_table, project_id, chunksize=chunksize,
+        verbose=verbose, reauth=reauth, if_exists=if_exists,
+        private_key=private_key, auth_local_webserver=auth_local_webserver,
+        table_schema=table_schema)
```
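A hedged usage sketch for the updated ``read_gbq``: the query and configuration dict below follow the shape of the BigQuery jobs API that the docstring's ``configuration`` kwarg describes, but the project ID is a placeholder and the actual call is left commented out because it requires Google credentials and network access.

```python
# Build the pieces read_gbq would take; nothing here touches the network.
query = ("SELECT name FROM `bigquery-public-data.usa_names.usa_1910_current` "
         "LIMIT 10")

# Query-config parameters for job processing, passed via **kwargs
# (shape follows the BigQuery jobs API).
configuration = {
    "query": {
        "useQueryCache": False,
    }
}

# With credentials and a real project ID this would run as:
# import pandas as pd
# df = pd.read_gbq(query, project_id="my-project", dialect="standard",
#                  configuration=configuration)
```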

0 commit comments