BUG: Unhandled ValueError when Bigquery called through io.gbq returns zero rows #10273 #10274


Closed · wants to merge 1 commit
Conversation

ssaumitra

closes #10273

@@ -296,6 +296,12 @@ def test_download_dataset_larger_than_200k_rows(self):
df = gbq.read_gbq("SELECT id FROM [publicdata:samples.wikipedia] GROUP EACH BY id ORDER BY id ASC LIMIT 200005", project_id=PROJECT_ID)
self.assertEqual(len(df.drop_duplicates()), 200005)

def test_zero_rows(self):
df = gbq.read_gbq("SELECT * FROM [publicdata:samples.wikipedia] where timestamp=-9999999", project_id=PROJECT_ID)
Contributor

pls add the issue number here

compare the resultant dataframe with a constructed one and use
assert_frame_equal(result,expected)

@ssaumitra
Author

Issue number in the form of a comment? Also, why is

assert_frame_equal

better for checking an empty dataframe? I am asking just for my understanding.

@jreback
Contributor

jreback commented Jun 4, 2015

oh, because the frame is not empty: it has column names (or should have) and an index (which has 0 length). You are guaranteeing a certain return type and metadata to the user (e.g. a query that returns rows has these things, so an empty one should as well).
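The reviewer's suggestion can be sketched as follows; the column names here are hypothetical placeholders, not the actual `publicdata:samples.wikipedia` schema:

```python
import pandas as pd
from pandas.testing import assert_frame_equal  # pandas.util.testing in 2015-era pandas

# Hypothetical zero-row query result: no rows, but the schema's column
# names (and a zero-length index) should still be present.
result = pd.DataFrame(columns=["id", "title"])

# Construct the expected empty frame explicitly. assert_frame_equal checks
# the column names, dtypes, and index, not just that len(result) == 0.
expected = pd.DataFrame(columns=["id", "title"])
assert_frame_equal(result, expected)
```

Checking only `len(df) == 0` would pass even if the column metadata were lost, which is exactly the guarantee being discussed here.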

@ssaumitra
Author

OK. Here it is.

@jreback jreback added this to the 0.16.2 milestone Jun 4, 2015
@jreback
Contributor

jreback commented Jun 4, 2015

ok, pls add a release note in the whatsnew for 0.16.2.

pls squash as well.

@ssaumitra
Author

Can you check in the base template for doc/source/whatsnew/v0.16.2.txt please? Or should I copy it from 5ebf521?

@jreback
Contributor

jreback commented Jun 4, 2015

rebase on master. It's already there.

@ssaumitra
Author

Added release note. Integrated with master.

@jreback
Contributor

jreback commented Jun 5, 2015

See contributing docs here

pls squash.

@ssaumitra
Author

I am checking in the squashed commit in a few minutes.

What about the docs? I checked in the release note in dadf5c2. Is it missing something?
I read the documentation you mentioned above but could not identify the specific problem.

@ssaumitra
Author

Squashed commit submitted.

@@ -279,7 +279,7 @@ def _parse_data(schema, rows):
field_type)
page_array[row_num][col_num] = field_value

return DataFrame(page_array)
return DataFrame(page_array, columns=col_names)
Contributor
this should not be necessary; page_array is already a record array (this will just reindex it, and copy it).
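For the zero-row case itself, passing `columns=` is what keeps the schema visible; a minimal sketch, where `col_names` is a stand-in for the names parsed from the BigQuery schema:

```python
import pandas as pd

rows = []                      # zero rows came back from the query
col_names = ["id", "title"]    # stand-in for the parsed BigQuery schema

# Without columns=, an empty input yields a frame with no columns at all ...
assert list(pd.DataFrame(rows).columns) == []

# ... while columns= preserves the column names even with zero rows.
df = pd.DataFrame(rows, columns=col_names)
assert list(df.columns) == ["id", "title"]
assert len(df) == 0
```

(When page_array really is a NumPy record array, the dtype already carries the names, which is the reviewer's point above; the plain-list case here is only an illustration of what the empty path loses without `columns=`.)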

@jreback
Contributor

jreback commented Jun 5, 2015

cc @jacobschaer

can you test this out and lmk?

@jreback
Contributor

jreback commented Jun 9, 2015

cc @jacobschaer
cc @sean-schaefer

@jreback
Contributor

jreback commented Jun 10, 2015

@ssaumitra

can you show

nosetests pandas/io/tests/test_gbq.py -v

on your system (as this is not tested with actual credentials on travis).

@ssaumitra
Author

@jreback I am away from work. I will upload the output next week as soon as I can.

@jreback jreback modified the milestones: 0.16.2, 0.17.0 Jun 11, 2015
@ssaumitra
Author

The test output is as follows. I am also updating the documentation file to mark the change in 0.17, not in 0.16.2.

$ nosetests pandas/io/tests/test_gbq.py -v
test_should_be_able_to_get_a_bigquery_service (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_results_from_query (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_schema_from_query (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_valid_credentials (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_make_a_connector (pandas.io.tests.test_gbq.TestGBQConnectorIntegration) ... ok
test_bad_project_id (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_bad_table_name (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_column_order (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_column_order_plus_index (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_download_dataset_larger_than_200k_rows (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_index_column (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_malformed_query (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_arbitrary_timestamp (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_empty_strings (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_false_boolean (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_boolean (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_floats (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_integers (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_strings (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_timestamp (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_timestamp_unix_epoch (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_true_boolean (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_floats (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_integers (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_strings (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_unicode_string_conversion_and_normalization (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_zero_rows (pandas.io.tests.test_gbq.TestReadGBQIntegration) ... ok
test_read_gbq_with_no_project_id_given_should_fail (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_booleans_as_python_booleans (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_floats_as_python_floats (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_integers_as_python_floats (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_strings_as_python_strings (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_timestamps_as_numpy_datetime (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_that_parse_data_works_properly (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_to_gbq_should_fail_if_invalid_table_name_passed (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_to_gbq_with_no_project_id_given_should_fail (pandas.io.tests.test_gbq.TestReadGBQUnitTests) ... ok
test_generate_bq_schema (pandas.io.tests.test_gbq.TestToGBQIntegration) ... SKIP: Cannot run to_gbq tests without bq command line client
test_google_upload_errors_should_raise_exception (pandas.io.tests.test_gbq.TestToGBQIntegration) ... SKIP: Cannot run to_gbq tests without bq command line client
test_upload_data (pandas.io.tests.test_gbq.TestToGBQIntegration) ... SKIP: Cannot run to_gbq tests without bq command line client
pandas.io.tests.test_gbq.test_requirements ... ok

----------------------------------------------------------------------
Ran 40 tests in 51.413s

OK (SKIP=3)

@ssaumitra
Author

@jreback Any news?

@ssaumitra
Author

@jreback I have merged the latest changes. Please let me know whether any changes are needed in this commit. I am available to make changes next week, but I will be away for a few weeks after that.

@jreback
Contributor

jreback commented Aug 15, 2015

cc @jacobschaer
cc @sean-schaefer

@ssaumitra can you rebase.

@ssaumitra
Author

@jreback rebase done.

@@ -669,3 +668,4 @@ Bug Fixes
- Bug in ``PeriodIndex.order`` reset freq (:issue:`10295`)
- Bug in ``iloc`` allowing memory outside bounds of a Series to be accessed with negative integers (:issue:`10779`)
- Bug preventing access to the first index when using ``iloc`` with a list containing the appropriate negative integer (:issue:`10547`, :issue:`10779`)
- Bug where ``io.gbq`` throws ValueError when Bigquery returns zero rows (:issue:`10273`)
Contributor
use double backticks around ValueError. say pd.read_gbq instead of Bigquery

Author
IMO, following would be the better replacement

Bug where ``pd.read_gbq`` throws ``ValueError`` when Bigquery returns zero rows (:issue:`10273`)

because the exception is thrown when the Google Bigquery REST API returns zero rows, not by the pandas function pd.read_gbq.
Does that look good?

Contributor
When they use pandas, they use pd.read_gbq; "Bigquery" in a release note about this is not obvious to the casual reader. But that is fine.

Author
OK, then adding the line as per my last comment.

@ssaumitra
Author

@jreback @sean-schaefer I will be away from work from next week, so I won't be able to respond. It would be great if we could wrap this up this week. Is any more input required from my side?

@jreback
Contributor

jreback commented Aug 20, 2015

this looks fine. Ideally I'd like

cc @jacobschaer
cc @sean-schaefer

to give it a test

@jacobschaer
Contributor

Looked fine, thanks. All tests passed when I ran it:

Successfully installed numpy pandas
Cleaning up...
+ pip freeze
Cython==0.23.1
argparse==1.2.1
bigquery==2.0.17
ez-setup==0.9
google-api-python-client==1.2
google-apputils==0.4.2
httplib2==0.9.1
nose==1.3.7
numpy==1.9.2
oauth2client==1.2
-e git+https://github.com/ssaumitra/pandas.git@cf6025e6ccd2a4bf79fe0b85e852bc3fe0ef50ff#egg=pandas-origin/bugfix-bigquery
pyasn1==0.1.8
pyasn1-modules==0.0.7
python-dateutil==2.4.2
python-gflags==2.0
pytz==2015.4
rsa==3.2
simplejson==3.8.0
six==1.9.0
uritemplate==0.6
wsgiref==0.1.2
+ python pandas/io/tests/test_gbq.py

nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
test_should_be_able_to_get_a_bigquery_service (__main__.TestGBQConnectorIntegration) ... ok

test_should_be_able_to_get_results_from_query (__main__.TestGBQConnectorIntegration) ... ok

test_should_be_able_to_get_schema_from_query (__main__.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_get_valid_credentials (__main__.TestGBQConnectorIntegration) ... ok
test_should_be_able_to_make_a_connector (__main__.TestGBQConnectorIntegration) ... ok

test_bad_project_id (__main__.TestReadGBQIntegration) ... ok
test_bad_table_name (__main__.TestReadGBQIntegration) ... ok

test_column_order (__main__.TestReadGBQIntegration) ... ok

test_column_order_plus_index (__main__.TestReadGBQIntegration) ... ok

test_download_dataset_larger_than_200k_rows (__main__.TestReadGBQIntegration) ... ok

test_index_column (__main__.TestReadGBQIntegration) ... ok
test_malformed_query (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_arbitrary_timestamp (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_empty_strings (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_false_boolean (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_null_boolean (__main__.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_floats (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_null_integers (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_null_strings (__main__.TestReadGBQIntegration) ... ok
test_should_properly_handle_null_timestamp (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_timestamp_unix_epoch (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_true_boolean (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_valid_floats (__main__.TestReadGBQIntegration) ... ok
test_should_properly_handle_valid_integers (__main__.TestReadGBQIntegration) ... ok

test_should_properly_handle_valid_strings (__main__.TestReadGBQIntegration) ... ok

test_unicode_string_conversion_and_normalization (__main__.TestReadGBQIntegration) ... ok

test_zero_rows (__main__.TestReadGBQIntegration) ... ok
test_read_gbq_with_no_project_id_given_should_fail (__main__.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_booleans_as_python_booleans (__main__.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_floats_as_python_floats (__main__.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_integers_as_python_floats (__main__.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_strings_as_python_strings (__main__.TestReadGBQUnitTests) ... ok
test_should_return_bigquery_timestamps_as_numpy_datetime (__main__.TestReadGBQUnitTests) ... ok
test_that_parse_data_works_properly (__main__.TestReadGBQUnitTests) ... ok
test_to_gbq_should_fail_if_invalid_table_name_passed (__main__.TestReadGBQUnitTests) ... ok
test_to_gbq_with_no_project_id_given_should_fail (__main__.TestReadGBQUnitTests) ... ok

Dataset 'serene-epsilon-769:pydata_pandas_bq_testing' successfully created.

Table 'serene-epsilon-769:pydata_pandas_bq_testing.new_test' successfully created.
test_generate_bq_schema (__main__.TestToGBQIntegration) ... 
ok
test_google_upload_errors_should_raise_exception (__main__.TestToGBQIntegration) ...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...
Job not yet complete...



Streaming Insert is 100% Complete
test_upload_data (__main__.TestToGBQIntegration) ... ok
__main__.test_requirements ... ok

----------------------------------------------------------------------
Ran 40 tests in 565.830s

OK
Job not yet complete...
[Output Exception BugFix] $ /bin/sh -xe /tmp/hudson8412146074501557652.sh
+ cp bigquery_credentials.dat /var/lib/jenkins/bigquery_credentials.dat
Finished: SUCCESS

@jreback
Contributor

jreback commented Aug 29, 2015

@jacobschaer gr8 thanks!

@jreback
Contributor

jreback commented Aug 29, 2015

merged via 53a6830

thanks!

@jreback jreback closed this Aug 29, 2015
@ssaumitra
Author

Thanks all :)
