GbqConnector should be able to fetch default credentials on Google Compute Engine #13608

Closed
wants to merge 19 commits into from
Changes from 16 commits
9 changes: 9 additions & 0 deletions doc/source/io.rst
@@ -4475,6 +4475,15 @@ Additional information on service accounts can be found

You will need to install an additional dependency: `oauth2client <https://github.com/google/oauth2client>`__.

.. versionadded:: 0.19.0
Contributor

needs to be below the change


Authentication via ``application default credentials`` is also possible. This is only valid
Contributor

reverse this logic. If private_key......, otherwise authentication via ....

Contributor Author

@jreback I don't completely understand this. Other authentication methods are already mentioned above this. I've just added the information for the new method. Could you please share the text that you want me to put here - if you have time. Thanks!

if the parameter ``private_key`` is not provided. This method also requires that
the credentials can be fetched from the environment the code is running in.
Otherwise, the OAuth2 client-side authentication is used.
Additional information on ``application default credentials`` can be found
Contributor

Additional information on application default credentials <https...........>__

`here <https://developers.google.com/identity/protocols/application-default-credentials>`__.

.. note::

The `'private_key'` parameter can be set to either the file path of the service account key
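
The selection order described in the documentation hunk above can be summarized with a small sketch. `choose_auth_method` and its `adc_available` flag are hypothetical names used only for illustration; they are not part of pandas, and the real check for application default credentials happens against the environment rather than through a flag.

```python
def choose_auth_method(private_key=None, adc_available=False):
    """Return the name of the authentication method that would be tried.

    Hypothetical helper that mirrors the order described in the docs;
    it is not an actual pandas function.
    """
    if private_key:
        # A service account key (file path or JSON contents) takes priority.
        return "service_account"
    if adc_available:
        # No key given: use application default credentials from the environment.
        return "application_default"
    # Otherwise fall back to the interactive OAuth2 client-side flow.
    return "user_account"
```

Under these assumptions, passing a key always selects the service account path, even when default credentials are also available.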
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.19.0.txt
@@ -361,6 +361,8 @@ Google BigQuery Enhancements
Other enhancements
^^^^^^^^^^^^^^^^^^

- The ``.get_credentials()`` method of ``GbqConnector`` can now first try to fetch `the application default credentials <https://developers.google.com/identity/protocols/application-default-credentials>`__. See the :ref:`docs <io.bigquery_authentication>` for more details (:issue:`13577`).

- The ``.tz_localize()`` method of ``DatetimeIndex`` and ``Timestamp`` has gained the ``errors`` keyword, so you can potentially coerce nonexistent timestamps to ``NaT``. The default behaviour remains to raise a ``NonExistentTimeError`` (:issue:`13057`)
- ``pd.to_numeric()`` now accepts a ``downcast`` parameter, which will downcast the data if possible to the smallest specified numerical dtype (:issue:`13352`)

80 changes: 71 additions & 9 deletions pandas/io/gbq.py
@@ -160,7 +160,57 @@ def get_credentials(self):
         if self.private_key:
             return self.get_service_account_credentials()
         else:
-            return self.get_user_account_credentials()
+            # Try to retrieve Application Default Credentials
+            credentials = self.get_application_default_credentials()
+            if not credentials:
+                credentials = self.get_user_account_credentials()
+            return credentials
+
+    def get_application_default_credentials(self):
+        """
+        This method tries to retrieve the "default application credentials".
+        This could be useful for running code on Google Cloud Platform.
+
+        .. versionadded:: 0.19.0
+
+        Parameters
+        ----------
+        None
+
+        Returns
+        -------
+        - GoogleCredentials,
+            If the default application credentials can be retrieved
+            from the environment. The retrieved credentials should also
+            have access to the project (self.project_id) on BigQuery.
+        - OR None,
+            If default application credentials can not be retrieved
+            from the environment. Or, the retrieved credentials do not
+            have access to the project (self.project_id) on BigQuery.
+        """
Contributor

Add a Returns section. Explaining that it will return the credentials or None if not found / error.

Contributor

explain the conditions under which this will get the correct credentials

+        try:
Contributor

Are these libraries a newer / different version than the ones we are currently importing?

+            from googleapiclient.discovery import build
+        except ImportError:
+            from apiclient.discovery import build
+        try:
+            from oauth2client.client import GoogleCredentials
Contributor

you can just put all of the imports in 1 try/except block

+        except ImportError:
+            return None
+
+        try:
+            credentials = GoogleCredentials.get_application_default()
+        except:
+            return None
+
+        # Check if the application has rights to the BigQuery project
Contributor

blank line

+        bigquery_service = build('bigquery', 'v2', credentials=credentials)
Contributor

Needs to be in try for compatibility with google-api-python-client==1.2

Contributor Author

@parthea I've changed the call to the "build" method - should work for both api versions now.

+        jobs = bigquery_service.jobs()
+        job_data = {'configuration': {'query': {'query': 'SELECT 1'}}}
+        try:
+            jobs.insert(projectId=self.project_id, body=job_data).execute()
+            return credentials
+        except:
+            return None

     def get_user_account_credentials(self):
         from oauth2client.client import OAuth2WebServerFlow
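
Taking the review comments on this hunk together (a single import block, the `build` call inside a `try`, and a return of None on any failure), the lookup could be sketched roughly as follows. This is an illustrative standalone function, not the code merged in the PR: `fetch_default_credentials` is a hypothetical name, `except Exception` is swapped in for the bare `except`, and the imports assume `oauth2client` and `google-api-python-client` are installed. When they are missing the function simply returns None.

```python
def fetch_default_credentials(project_id):
    """Return application default credentials usable on project_id, else None.

    Hypothetical sketch of the fallback discussed in review.
    """
    try:
        # All optional imports live in one block, as suggested in review.
        from oauth2client.client import GoogleCredentials
        try:
            from googleapiclient.discovery import build
        except ImportError:
            from apiclient.discovery import build
    except ImportError:
        return None

    try:
        credentials = GoogleCredentials.get_application_default()
        # Building the service and running a trivial query checks that the
        # credentials actually have access to the BigQuery project.
        service = build('bigquery', 'v2', credentials=credentials)
        job_data = {'configuration': {'query': {'query': 'SELECT 1'}}}
        service.jobs().insert(projectId=project_id, body=job_data).execute()
    except Exception:
        # Narrower than a bare except: KeyboardInterrupt still propagates.
        return None
    return credentials
```

A caller would then fall back to the user account flow whenever the result is None, which matches the branch added to `get_credentials` in the diff.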
@@ -578,10 +628,16 @@ def read_gbq(query, project_id=None, index_col=None, col_order=None,
     https://developers.google.com/api-client-library/python/apis/bigquery/v2

     Authentication to the Google BigQuery service is via OAuth 2.0.
-    By default user account credentials are used. You will be asked to
-    grant permissions for product name 'pandas GBQ'. It is also possible
-    to authenticate via service account credentials by using
-    private_key parameter.
+    - If "private_key" is not provided:
+      By default "application default credentials" are used.
+
+      .. versionadded:: 0.19.0
+
+      If default application credentials are not found or are restrictive,
+      user account credentials are used. In this case, you will be asked to
+      grant permissions for product name 'pandas GBQ'.
+    - If "private_key" is provided:
+      Service account credentials will be used to authenticate.

     Parameters
     ----------
@@ -689,10 +745,16 @@ def to_gbq(dataframe, destination_table, project_id, chunksize=10000,
     https://developers.google.com/api-client-library/python/apis/bigquery/v2

     Authentication to the Google BigQuery service is via OAuth 2.0.
-    By default user account credentials are used. You will be asked to
-    grant permissions for product name 'pandas GBQ'. It is also possible
-    to authenticate via service account credentials by using
-    private_key parameter.
+    - If "private_key" is not provided:
+      By default "application default credentials" are used.
+
+      .. versionadded:: 0.19.0
+
+      If default application credentials are not found or are restrictive,
+      user account credentials are used. In this case, you will be asked to
+      grant permissions for product name 'pandas GBQ'.
+    - If "private_key" is provided:
+      Service account credentials will be used to authenticate.

     Parameters
     ----------
36 changes: 36 additions & 0 deletions pandas/io/tests/test_gbq.py
@@ -151,6 +151,27 @@ def test_requirements():
         raise nose.SkipTest(import_exception)


+def _check_if_can_get_correct_default_credentials():
+    # Checks if "Application Default Credentials" can be fetched
+    # from the environment the tests are running in.
+    # See Issue #13577
+    test_requirements()
Contributor

add a 1-liner what this does as a comment (and the issue number)

+    try:
+        from googleapiclient.discovery import build
+    except ImportError:
+        from apiclient.discovery import build
+    try:
+        from oauth2client.client import GoogleCredentials
+        credentials = GoogleCredentials.get_application_default()
+        bigquery_service = build('bigquery', 'v2', credentials=credentials)
+        jobs = bigquery_service.jobs()
+        job_data = {'configuration': {'query': {'query': 'SELECT 1'}}}
+        jobs.insert(projectId=PROJECT_ID, body=job_data).execute()
+        return True
+    except:
+        return False
+
+
 def clean_gbq_environment(private_key=None):
     dataset = gbq._Dataset(PROJECT_ID, private_key=private_key)

@@ -217,6 +238,21 @@ def test_should_be_able_to_get_results_from_query(self):
         schema, pages = self.sut.run_query('SELECT 1')
         self.assertTrue(pages is not None)

+    def test_get_application_default_credentials_does_not_throw_error(self):
+        if _check_if_can_get_correct_default_credentials():
+            raise nose.SkipTest("Can get default_credentials "
Contributor

ok lgtm.

+                               "from the environment!")
+        credentials = self.sut.get_application_default_credentials()
+        self.assertIsNone(credentials)
+
+    def test_get_application_default_credentials_returns_credentials(self):
+        if not _check_if_can_get_correct_default_credentials():
+            raise nose.SkipTest("Cannot get default_credentials "
+                                "from the environment!")
+        from oauth2client.client import GoogleCredentials
Contributor

so you need to do skipping based on resource availability, see the other tests.

+        credentials = self.sut.get_application_default_credentials()
+        self.assertTrue(isinstance(credentials, GoogleCredentials))
+

 class TestGBQConnectorServiceAccountKeyPathIntegration(tm.TestCase):
     def setUp(self):
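
The last review comment asks for skipping based on resource availability. One way that pattern might look is sketched below. It uses the standard library's `unittest.SkipTest` in place of `nose.SkipTest`, and `can_get_default_credentials` and `DefaultCredentialsTest` are hypothetical names. Unlike the helper in the diff, the probe only checks that credentials can be loaded and does not verify BigQuery project access.

```python
import unittest


def can_get_default_credentials():
    """Hypothetical probe: True only if the client library is importable
    and application default credentials can be loaded."""
    try:
        from oauth2client.client import GoogleCredentials
        GoogleCredentials.get_application_default()
        return True
    except Exception:
        return False


class DefaultCredentialsTest(unittest.TestCase):
    def setUp(self):
        # Skip every test in this class when the resource is unavailable,
        # instead of repeating the check inside each test method.
        if not can_get_default_credentials():
            raise unittest.SkipTest("Cannot get default credentials "
                                    "from the environment!")

    def test_credentials_are_returned(self):
        from oauth2client.client import GoogleCredentials
        credentials = GoogleCredentials.get_application_default()
        self.assertIsInstance(credentials, GoogleCredentials)
```

Putting the check in `setUp` keeps the skip condition in one place, so tests that need the credentials are reported as skipped rather than failed on machines without them.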