ENH: Save BigQuery account credentials in a hidden user folder #83

Merged
2 changes: 1 addition & 1 deletion docs/source/changelog.rst
@@ -6,7 +6,7 @@ Changelog

 - :func:`read_gbq` now raises ``QueryTimeout`` if the request exceeds the ``query.timeoutMs`` value specified in the BigQuery configuration. (:issue:`76`)
 - Environment variable ``PANDAS_GBQ_CREDENTIALS_FILE`` can now be used to override the default location where the BigQuery user account credentials are stored. (:issue:`86`)
-
+- BigQuery user account credentials are now stored in an application-specific hidden user folder on the operating system. (:issue:`41`)

 0.2.0 / 2017-07-24
 ------------------
4 changes: 3 additions & 1 deletion docs/source/intro.rst
@@ -37,7 +37,9 @@ is possible with either user or service account credentials.
 Authentication via user account credentials is as simple as following the prompts in a browser window
 which will automatically open for you. You authenticate to the specified
 ``BigQuery`` account using the product name ``pandas GBQ``.
-The remote authentication is supported via specifying ``auth_local_webserver`` in ``read_gbq``.
+The remote authentication is supported via the ``auth_local_webserver`` in ``read_gbq``. By default,
+account credentials are stored in an application-specific hidden user folder on the operating system. You
+can override the default credentials location via the ``PANDAS_GBQ_CREDENTIALS_FILE`` environment variable.
 Additional information on the authentication mechanism can be found
 `here <https://developers.google.com/identity/protocols/OAuth2#clientside/>`__.
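
The lookup order this documentation change describes (the ``PANDAS_GBQ_CREDENTIALS_FILE`` override first, then an application-specific folder) can be sketched roughly as follows. This is a minimal illustration, not the pandas-gbq implementation: the function name `resolve_credentials_path`, its parameters, and the home-directory fallback when `APPDATA` is unset are assumptions made for the example.

```python
import os


def resolve_credentials_path(environ=None, os_name=None):
    """Return the credentials file path: env override first, then a default.

    Hypothetical helper; both parameters exist only to make the sketch
    easy to exercise without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    os_name = os.name if os_name is None else os_name

    # An explicit override always wins.
    override = environ.get('PANDAS_GBQ_CREDENTIALS_FILE')
    if override:
        return override

    if os_name == 'nt':
        # Assumption: fall back to the home directory if APPDATA is unset.
        config_root = environ.get('APPDATA', os.path.expanduser('~'))
    else:
        config_root = os.path.join(os.path.expanduser('~'), '.config')

    return os.path.join(config_root, 'pandas_gbq', 'bigquery_credentials.dat')
```

Unlike the method added in this PR, the sketch only computes the path and does not create the directory.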

47 changes: 44 additions & 3 deletions pandas_gbq/gbq.py
@@ -208,6 +208,7 @@ def __init__(self, project_id, reauth=False, verbose=False,
         self.private_key = private_key
         self.auth_local_webserver = auth_local_webserver
         self.dialect = dialect
+        self.credentials_path = _get_credentials_file()
         self.credentials = self.get_credentials()
         self.service = self.get_service()

@@ -279,8 +280,21 @@ def load_user_account_credentials(self):
         from google_auth_httplib2 import Request
         from google.oauth2.credentials import Credentials

+        # Use the default credentials location under ~/.config and the
+        # equivalent directory on windows if the user has not specified a
+        # credentials path.
+        if not self.credentials_path:
+            self.credentials_path = self.get_default_credentials_path()
+
+            # Previously, pandas-gbq saved user account credentials in the
+            # current working directory. If the bigquery_credentials.dat file
+            # exists in the current working directory, move the credentials to
+            # the new default location.
+            if os.path.isfile('bigquery_credentials.dat'):
+                os.rename('bigquery_credentials.dat', self.credentials_path)
+
         try:
-            with open(_get_credentials_file()) as credentials_file:
+            with open(self.credentials_path) as credentials_file:
                 credentials_json = json.load(credentials_file)
         except (IOError, ValueError):
             return None
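
A more defensive variant of the migration step in this hunk might look like the sketch below. It is not part of the PR: the helper name `migrate_legacy_credentials` and its two guards are assumptions. The motivation is that `os.rename` can fail when source and destination are on different filesystems, and on Windows it raises an error if the destination already exists.

```python
import os
import shutil


def migrate_legacy_credentials(legacy_path, new_path):
    """Move legacy_path to new_path; return True only if a move happened.

    Hypothetical helper illustrating a guarded one-time migration.
    """
    if not os.path.isfile(legacy_path):
        return False  # nothing to migrate
    if os.path.exists(new_path):
        return False  # keep whatever is already stored at the new location

    # Make sure the destination directory exists before moving.
    parent = os.path.dirname(new_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)

    # shutil.move also works when the paths are on different filesystems.
    shutil.move(legacy_path, new_path)
    return True
```

Whether an existing file at the new location should win over a legacy file is a design choice; the sketch keeps the existing file.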
@@ -301,14 +315,41 @@ def load_user_account_credentials(self):

         return _try_credentials(self.project_id, credentials)

+    def get_default_credentials_path(self):
+        """
+        Gets the default path to the BigQuery credentials
+
+        .. versionadded 0.3.0
+
+        Returns
+        -------
+        Path to the BigQuery credentials
+        """
+
+        import os
+
+        if os.name == 'nt':
+            config_path = os.environ['APPDATA']
+        else:
+            config_path = os.path.join(os.path.expanduser('~'), '.config')
+
+        config_path = os.path.join(config_path, 'pandas_gbq')
Contributor

You know this better than I do, but is there a case for having the default path be the default gcloud path, rather than anything specific to pandas?

Contributor Author

@parthea, Aug 22, 2017

This is a great question! I've given this some thought, and I don't feel comfortable writing default credentials into the gcloud path in case some users don't want their default gcloud credentials set (I'm also worried about colliding with gcloud). Alternatively, users who want to configure default credentials could run `gcloud auth application-default login`, and the default gcloud credentials should then be picked up. In that case, we shouldn't hit this code path: https://github.com/pydata/pandas-gbq/blob/master/pandas_gbq/gbq.py#L222

@tswast Thoughts?

Collaborator

+1 to not clobbering gcloud credentials. The token we save doesn't have the full Google Cloud Platform scope, so writing it to the gcloud path would break other applications that depend on those credentials.

If folks set up application default credentials it won't hit this code anyway.

Contributor

> users could run `gcloud auth application-default login`

> If folks set up application default credentials it won't hit this code anyway.

👍
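
The fallback order the reviewers describe (application default credentials first, the saved user account credentials only when none are found) can be sketched as below. The function and loader names are hypothetical stand-ins for illustration, not pandas-gbq functions.

```python
def first_available_credentials(loaders):
    """Return the first non-None result from an ordered list of loaders."""
    for load in loaders:
        credentials = load()
        if credentials is not None:
            return credentials
    return None


# Hypothetical stand-ins for the real credential loaders.
def load_application_default():
    # Pretend `gcloud auth application-default login` was never run.
    return None


def load_user_account():
    # Pretend a saved user account credentials file was found.
    return 'user-account-credentials'


# The user account loader is only consulted because the first one found nothing.
chosen = first_available_credentials(
    [load_application_default, load_user_account])
```

With this ordering, a user who has configured application default credentials never reaches the code that reads or writes the pandas-gbq credentials file.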


+        # Create a pandas_gbq directory in an application-specific hidden
+        # user folder on the operating system.
+        if not os.path.exists(config_path):
+            os.makedirs(config_path)
+
+        return os.path.join(config_path, 'bigquery_credentials.dat')
+
     def save_user_account_credentials(self, credentials):
         """
         Saves user account credentials to a local file.

         .. versionadded 0.2.0
         """
         try:
-            with open(_get_credentials_file(), 'w') as credentials_file:
+            with open(self.credentials_path, 'w') as credentials_file:
                 credentials_json = {
                     'refresh_token': credentials.refresh_token,
                     'id_token': credentials.id_token,
@@ -793,7 +834,7 @@ def delete_and_recreate_table(self, dataset_id, table_id, table_schema):

 def _get_credentials_file():
     return os.environ.get(
-        'PANDAS_GBQ_CREDENTIALS_FILE', 'bigquery_credentials.dat')
+        'PANDAS_GBQ_CREDENTIALS_FILE')


 def _parse_data(schema, rows):
1 change: 1 addition & 0 deletions pandas_gbq/tests/test_gbq.py
@@ -1388,6 +1388,7 @@ def setup_method(self, method):
         # put here any instruction you want to be run *BEFORE* *EVERY* test
         # is executed.

+        gbq.GbqConnector(_get_project_id(), auth_local_webserver=True)
         self.dataset_prefix = _get_dataset_prefix_random()
         clean_gbq_environment(self.dataset_prefix)
         self.destination_table = "{0}{1}.{2}".format(self.dataset_prefix, "2",