
BUG: Pagination in pandas.io.gbq #5262


Merged 1 commit on Oct 21, 2013


jacobschaer
Contributor

In light of some last-minute API changes in Google BigQuery, I have updated our code to function properly. In particular, this fixes a known bug that limited result sets to 10,000. Hopefully we'll have an entirely new version soon that will make better use of Google's reference code and thus be more future-proof. Note that bigquery v2.0.17 is new as of today... 2.0.16 tested fine, but they fixed a few important backend things, so we went ahead and made the change mandatory in light of the pandas release candidate.

See:
closes #5255
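
For readers unfamiliar with this class of bug, the sketch below shows token-based pagination in general terms: keep requesting pages until the backend stops returning a continuation token, rather than stopping after the first page. It is not the code from this pull request. `fetch_page`, the `rows` and `next_token` keys, and the fake backend are hypothetical names chosen for illustration and do not reflect the bq.py or BigQuery API.

```python
def fetch_all_rows(fetch_page):
    """Collect rows from every page by following tokens until none remain.

    `fetch_page` is a hypothetical callable: given a token (or None for the
    first page) it returns a dict with "rows" and an optional "next_token".
    """
    rows = []
    token = None
    while True:
        page = fetch_page(token)
        rows.extend(page["rows"])
        token = page.get("next_token")
        if token is None:
            # No continuation token means this was the last page.
            break
    return rows


def make_fake_backend(total_rows, page_size):
    """Build an in-memory stand-in for a paged service (illustration only)."""
    def fetch_page(token):
        start = 0 if token is None else token
        end = min(start + page_size, total_rows)
        next_token = end if end < total_rows else None
        return {"rows": list(range(start, end)), "next_token": next_token}
    return fetch_page


if __name__ == "__main__":
    backend = make_fake_backend(total_rows=25000, page_size=10000)
    print(len(fetch_all_rows(backend)))
```

A reader that stopped after the first call to `fetch_page` would see only one page of this fake backend, which is the general shape of a truncated result set.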

@jtratner
Contributor

Looks fine to me - I'll leave it open for a bit so others can comment then merge tomorrow.

@jreback
Contributor

jreback commented Oct 20, 2013

@jacobschaer can you rebase to a single commit? and just confirm that it tests ok locally as well

@jacobschaer
Contributor Author

Alright, I merged them all together and ran the test suite a final time. Below are the results (lengthy job notifications snipped).

nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
Dataset '57288129629:pandas_testing_dataset' successfully created.
test_column_order (__main__.test_gbq) ... ok
test_column_order_plus_index (__main__.test_gbq) ... ok
test_data_small (__main__.test_gbq) ... ok
Waiting on bqjob_r372b1f5a1fd28f12_00000141d7f6d37b_1 ...  [...]
ok
Waiting on bqjob_r4221620eddca03fb_00000141d7f6dc05_2 ... [...] 
ok
test_index_column (__main__.test_gbq) ... ok
test_invalid_column_name_schema (__main__.test_gbq) ... ok
test_invalid_number_of_columns_schema (__main__.test_gbq) ... ok
test_malformed_query (__main__.test_gbq) ... ok
test_table__not_exists (__main__.test_gbq) ... ok
test_table_exists (__main__.test_gbq) ... ok
test_type_conversion (__main__.test_gbq) ... ok
Waiting on bqjob_r5216d5c5efd24d48_00000141d7f7204e_4 ... [...]
ok
Waiting on bqjob_r346f9f6705a780a2_00000141d7f88358_5 ... [...]
ok
Waiting on bqjob_r45989cba2dd985ce_00000141d7fa4336_6 ... [...]
Waiting on bqjob_r2d3f37e62f4db39d_00000141d7fb551a_7 ... [...]
ok
test_upload_new_table_schema_error (__main__.test_gbq) ... ok
test_upload_public_data_error (__main__.test_gbq) ... ok
Waiting on bqjob_r9ae4ef072f7d9df_00000141d7fb6761_9 ... [...]
ok
test_upload_replace_schema_error (__main__.test_gbq) ... ok
test_valid_authentication (__main__.test_gbq) ... ok

----------------------------------------------------------------------
Ran 20 tests in 452.252s

OK

@jreback
Contributor

jreback commented Oct 20, 2013

do u now require a minimum version of big query? if so I would check when u import and raise with a helpful message (to get a min version)

@jacobschaer
Contributor Author

@jreback - We do require bq.py v2.0.17 (though from testing 2.0.16 seems to work ok). Do you have any recommendations on how to check for a minimum version?

@jtratner
Contributor

Use LooseVersion (imported from distutils) - pretty sure
LooseVersion(bq.version) >= '2.0.17'

Should work
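
As a rough illustration of the import-time check being discussed, here is a minimal sketch. It deliberately swaps in a small hand-rolled tuple parser instead of `LooseVersion`, because `distutils` (where `LooseVersion` lives, in `distutils.version`) was later removed from the standard library in Python 3.12. `MIN_BQ_VERSION`, `parse_version`, and `check_min_version` are hypothetical names, and how the installed bq version string is obtained is left out; this is not the check that was merged.

```python
MIN_BQ_VERSION = "2.0.17"  # assumed minimum, taken from the discussion above


def parse_version(version):
    """Turn a plain dotted version string such as '2.0.17' into a tuple of ints.

    Only handles purely numeric components; a real check would need a more
    tolerant parser for suffixes like 'rc1'.
    """
    return tuple(int(part) for part in version.split("."))


def check_min_version(installed, minimum=MIN_BQ_VERSION):
    """Raise ImportError with a helpful message if `installed` is too old."""
    if parse_version(installed) < parse_version(minimum):
        raise ImportError(
            "bq version %s is too old; version %s or newer is required"
            % (installed, minimum)
        )


if __name__ == "__main__":
    # Plain string comparison gets this wrong, which is why versions are parsed:
    print("2.0.9" > "2.0.17")                                # True (lexicographic)
    print(parse_version("2.0.9") > parse_version("2.0.17"))  # False (numeric)
    check_min_version("2.0.17")  # passes silently
```

Comparing parsed components rather than raw strings is the point of using something like `LooseVersion` in the first place: numerically, 2.0.9 is older than 2.0.17 even though it sorts later as text.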

@jreback
Contributor

jreback commented Oct 20, 2013

https://github.com/pydata/pandas/blob/master/pandas/util/print_versions.py

see at the bottom for how to get the bq versions

https://github.com/pydata/pandas/wiki/Tips-&-Tricks

for how to use loose version to check

but only put in a check if it makes a difference in what u r doing - eg you need to do different things depending on versions or certain things are not supported

@jacobschaer
Contributor Author

I'll take a look at it... I'm using a constant from the file that does require at least 2.0.16... 2.0.17 was released so close, we might as well make it the minimum just in case something comes up in the near future.

@jacobschaer
Contributor Author

Version checking now in place.

@jreback
Contributor

jreback commented Oct 21, 2013

gr8....if you want to have travis test on an older version you can as well (not sure if that is useful).....no biggie

@jacobschaer
Contributor Author

I tested it locally by downgrading... not sure if it needs to be in Travis for this revision, but I'll definitely consider it for the next one.

@jreback
Contributor

jreback commented Oct 21, 2013

that's fine....ok...bombs away

jreback added a commit that referenced this pull request Oct 21, 2013
BUG: Pagination in pandas.io.gbq
@jreback jreback merged commit 67146bc into pandas-dev:master Oct 21, 2013
Successfully merging this pull request may close these issues.

BUG: Google BigQuery API Change