In gbq, use googleapiclient instead of apiclient #13458

Merged
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.18.2.txt
@@ -528,3 +528,5 @@ Bug Fixes

 - Bug in ``Categorical.remove_unused_categories()`` changes ``.codes`` dtype to platform int (:issue:`13261`)
 - Bug in ``groupby`` with ``as_index=False`` returns all NaN's when grouping on multiple columns including a categorical one (:issue:`13204`)
+
+- Bug where ``pd.read_gbq()`` could throw ``ImportError: No module named discovery`` as a result of a naming conflict with another python package called apiclient (:issue:`13454`)
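To check whether an environment is affected by the naming conflict described in this entry, one can look at which file each top-level module name resolves to. The following is a minimal sketch and not part of this PR: `module_origin` is a hypothetical helper name, and it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module name resolves to, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Print where each candidate name comes from in the current environment.
for candidate in ("googleapiclient", "apiclient"):
    print(candidate, "->", module_origin(candidate))
```

If `apiclient` resolves to a location outside the google-api-python-client installation, an unrelated package with the same name is probably shadowing it.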
38 changes: 30 additions & 8 deletions pandas/io/gbq.py
@@ -46,8 +46,12 @@ def _test_google_api_imports():

     try:
         import httplib2  # noqa
-        from apiclient.discovery import build  # noqa
-        from apiclient.errors import HttpError  # noqa
+        try:
Member

@sinhrks Jun 16, 2016

I don't know the details of GBQ, but does this break user code that uses apiclient now? Can there be any fallback logic?

Member

Ah, I've read your original doc. Please ignore.

Contributor Author

@parthea Jun 16, 2016

I agree it's better to fall back so that older versions don't break. I've pushed a new version.

Member

Thanks. Can you move them under pandas/compat/gbq_compat?

Contributor

I think it's OK here; these are not used anywhere else.

+            from googleapiclient.discovery import build  # noqa
+            from googleapiclient.errors import HttpError  # noqa
+        except:
+            from apiclient.discovery import build  # noqa
+            from apiclient.errors import HttpError  # noqa
         from oauth2client.client import AccessTokenRefreshError  # noqa
         from oauth2client.client import OAuth2WebServerFlow  # noqa
         from oauth2client.file import Storage  # noqa
@@ -266,7 +270,10 @@ def sizeof_fmt(num, suffix='b'):

     def get_service(self):
         import httplib2
-        from apiclient.discovery import build
+        try:
+            from googleapiclient.discovery import build
Contributor

In other words, the CI should test both the original one and the new one (in different builds), so at least import checking should work.

Contributor Author

For Python 2.7, we are using google-api-python-client==1.2. That is already specified in requirements-2.7.pip.

google-api-python-client version 1.2 doesn't provide the googleapiclient module (it predates the deprecation of apiclient).

In CI, we will test the original one when we run Python 2.7, and the new one when we run Python 3.4.

Contributor Author

@parthea Jun 17, 2016

I added a test to confirm that importing googleapiclient raises an ImportError when running under Python 2.7 (google-api-python-client==1.2):

    def test_import_google_api_python_client(self):
        if compat.PY2:
            with tm.assertRaises(ImportError):
                from googleapiclient.discovery import build  # noqa
                from googleapiclient.errors import HttpError  # noqa
            from apiclient.discovery import build  # noqa
            from apiclient.errors import HttpError  # noqa
        else:
            from googleapiclient.discovery import build  # noqa
            from googleapiclient.errors import HttpError  # noqa

I manually downgraded my google-api-python-client to 1.2 and ran all unit tests locally to confirm that the old code still works using apiclient rather than googleapiclient.

tony@tonypc:~/parthea-pandas/pandas/io/tests$ nosetests test_gbq.py -v
test_import_google_api_python_client (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_read_gbq_with_corrupted_private_key_json_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_read_gbq_with_empty_private_key_file_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_read_gbq_with_empty_private_key_json_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_read_gbq_with_invalid_private_key_json_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_read_gbq_with_no_project_id_given_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_read_gbq_with_private_key_json_wrong_types_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_should_return_bigquery_booleans_as_python_booleans (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_should_return_bigquery_floats_as_python_floats (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_should_return_bigquery_integers_as_python_floats (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_should_return_bigquery_strings_as_python_strings (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_should_return_bigquery_timestamps_as_numpy_datetime (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_that_parse_data_works_properly (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_to_gbq_should_fail_if_invalid_table_name_passed (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_to_gbq_with_no_project_id_given_should_fail (pandas.io.tests.test_gbq.GBQUnitTests) ... ok
test_should_be_able_to_get_a_bigquery_service (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_results_from_query (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_schema_from_query (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_valid_credentials (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_make_a_connector (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_a_bigquery_service (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration) ... ok
test_should_be_able_to_get_results_from_query (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration) ... ok
test_should_be_able_to_get_schema_from_query (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration) ... ok
test_should_be_able_to_get_valid_credentials (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration) ... ok
test_should_be_able_to_make_a_connector (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration) ... ok
test_should_be_able_to_get_a_bigquery_service (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration) ... ok
test_should_be_able_to_get_results_from_query (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration) ... ok
test_should_be_able_to_get_schema_from_query (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration) ... ok
test_should_be_able_to_get_valid_credentials (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration) ... ok
test_should_be_able_to_make_a_connector (pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration) ... ok
test_bad_project_id (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_bad_table_name (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_column_order (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_column_order_plus_index (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_download_dataset_larger_than_200k_rows (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_index_column (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_malformed_query (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_arbitrary_timestamp (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_empty_strings (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_false_boolean (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_boolean (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_floats (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_integers (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_strings (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_timestamp (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_timestamp_unix_epoch (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_true_boolean (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_floats (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_integers (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_strings (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_read_as_service_account_with_key_contents (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_read_as_service_account_with_key_path (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_unicode_string_conversion_and_normalization (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_zero_rows (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_create_dataset (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_create_table (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_dataset_does_not_exist (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_dataset_exists (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_delete_dataset (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_delete_table (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_generate_schema (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_google_upload_errors_should_raise_exception (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_list_dataset (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_list_table (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_list_table_zero_results (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_table_does_not_exist (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_upload_data (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_upload_data_if_table_exists_append (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_upload_data_if_table_exists_fail (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_upload_data_if_table_exists_replace (pandas.io.tests.test_gbq.TestToGBQIntegration) ... ok
test_upload_data_as_service_account_with_key_contents (pandas.io.tests.test_gbq.TestToGBQIntegrationServiceAccountKeyContents) ... ok
test_upload_data_as_service_account_with_key_path (pandas.io.tests.test_gbq.TestToGBQIntegrationServiceAccountKeyPath) ... ok
pandas.io.tests.test_gbq.test_requirements ... ok
pandas.io.tests.test_gbq.test_generate_bq_schema_deprecated ... ok

----------------------------------------------------------------------
Ran 74 tests in 396.056s

OK
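
A note on the test quoted in this comment: a `with assertRaises(...)` block exits at the first statement that raises, so under Python 2.7 only the first `googleapiclient` import is actually exercised. The following standalone sketch illustrates that behaviour with the standard library's `unittest`. The class name and the module name `no_such_module_for_gbq_sketch` are hypothetical and chosen only so that the import fails.

```python
import unittest


class AssertRaisesScopeSketch(unittest.TestCase):
    def test_block_stops_at_first_failing_import(self):
        reached = []
        with self.assertRaises(ImportError):
            # Hypothetical module name, chosen so that this import fails.
            import no_such_module_for_gbq_sketch  # noqa
            # Never executed: the import above has already raised.
            reached.append("second statement")
        self.assertEqual(reached, [])


# Run the single test case explicitly and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertRaisesScopeSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```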

+        except:
+            from apiclient.discovery import build

         http = httplib2.Http()
         http = self.credentials.authorize(http)
@@ -315,7 +322,10 @@ def process_insert_errors(self, insert_errors):
         raise StreamingInsertError

     def run_query(self, query):
-        from apiclient.errors import HttpError
+        try:
+            from googleapiclient.errors import HttpError
+        except:
+            from apiclient.errors import HttpError
         from oauth2client.client import AccessTokenRefreshError

         _check_google_client_version()
@@ -420,7 +430,10 @@ def run_query(self, query):
         return schema, result_pages

     def load_data(self, dataframe, dataset_id, table_id, chunksize):
-        from apiclient.errors import HttpError
+        try:
+            from googleapiclient.errors import HttpError
+        except:
+            from apiclient.errors import HttpError

         job_id = uuid.uuid4().hex
         rows = []
@@ -474,7 +487,10 @@ def load_data(self, dataframe, dataset_id, table_id, chunksize):
         self._print("\n")

     def verify_schema(self, dataset_id, table_id, schema):
-        from apiclient.errors import HttpError
+        try:
+            from googleapiclient.errors import HttpError
+        except:
+            from apiclient.errors import HttpError

         try:
             return (self.service.tables().get(
@@ -765,7 +781,10 @@ class _Table(GbqConnector):

     def __init__(self, project_id, dataset_id, reauth=False, verbose=False,
                  private_key=None):
-        from apiclient.errors import HttpError
+        try:
+            from googleapiclient.errors import HttpError
+        except:
+            from apiclient.errors import HttpError
         self.http_error = HttpError
         self.dataset_id = dataset_id
         super(_Table, self).__init__(project_id, reauth, verbose, private_key)
@@ -865,7 +884,10 @@ class _Dataset(GbqConnector):

     def __init__(self, project_id, reauth=False, verbose=False,
                  private_key=None):
-        from apiclient.errors import HttpError
+        try:
+            from googleapiclient.errors import HttpError
+        except:
+            from apiclient.errors import HttpError
         self.http_error = HttpError
         super(_Dataset, self).__init__(project_id, reauth, verbose,
                                        private_key)
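
The same try/except import appears in seven places in `pandas/io/gbq.py`. One reviewer suggested a compat module, and the thread settled on keeping the imports inline. Purely to illustrate the alternative, here is a sketch of a single helper. `import_first_available` is a hypothetical name and not part of pandas. Unlike the bare `except:` in the diff, it catches only `ImportError`, so unrelated errors raised while a module loads are not hidden.

```python
import importlib


def import_first_available(*module_names):
    """Import and return the first module in module_names that can be imported.

    Only ImportError means "try the next name"; any other exception raised
    while a module loads propagates to the caller.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Hypothetical usage mirroring the imports in this diff (left as comments so
# the sketch runs without google-api-python-client installed):
# errors = import_first_available("googleapiclient.errors", "apiclient.errors")
# HttpError = errors.HttpError
```

The order of the arguments encodes the preference: the newer `googleapiclient` name is tried first, and the legacy `apiclient` name is used only when the first import fails.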
19 changes: 17 additions & 2 deletions pandas/io/tests/test_gbq.py
@@ -73,8 +73,12 @@ def _test_imports():

     if _SETUPTOOLS_INSTALLED:
         try:
-            from apiclient.discovery import build  # noqa
-            from apiclient.errors import HttpError  # noqa
+            try:
+                from googleapiclient.discovery import build  # noqa
+                from googleapiclient.errors import HttpError  # noqa
+            except:
+                from apiclient.discovery import build  # noqa
+                from apiclient.errors import HttpError  # noqa

             from oauth2client.client import OAuth2WebServerFlow  # noqa
             from oauth2client.client import AccessTokenRefreshError  # noqa
@@ -280,6 +284,17 @@ class GBQUnitTests(tm.TestCase):
     def setUp(self):
         test_requirements()

+    def test_import_google_api_python_client(self):
+        if compat.PY2:
+            with tm.assertRaises(ImportError):
+                from googleapiclient.discovery import build  # noqa
+                from googleapiclient.errors import HttpError  # noqa
+            from apiclient.discovery import build  # noqa
+            from apiclient.errors import HttpError  # noqa
+        else:
+            from googleapiclient.discovery import build  # noqa
+            from googleapiclient.errors import HttpError  # noqa
+
     def test_should_return_bigquery_integers_as_python_floats(self):
         result = gbq._parse_entry(1, 'INTEGER')
         tm.assert_equal(result, float(1))