pandas.io.gbq.read_gbq() returns incorrect results #5840

Closed
markdregan opened this issue Jan 3, 2014 · 34 comments · Fixed by #6937

@markdregan

When using the read_gbq() function on a BigQuery table, incorrect results are returned.

I compare the output from read_gbq() to that of a CSV export from BigQuery directly. Interestingly, there are the same number of rows in each output - however, there are many duplicates in the read_gbq() output.

I'm using pandas '0.13.0rc1-125-g4952858' on Mac OS 10.9 with Python 2.7 and NumPy '1.8.0'.

The code I execute to load the data in pandas:
churn_data = gbq.read_gbq(train_query, project_id = projectid)

I can't share the underlying data. What additional data/info would be useful for root-causing this?

The output data is ~400k lines.
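
For reference, here is a minimal sketch of the comparison being described (the query, project id, and CSV filename are placeholders, not the real ones):

import pandas as pd
from pandas.io import gbq

train_query = "SELECT domain_name, renewal_date_str FROM dataset.table_name"  # placeholder query
projectid = "my-project-id"  # placeholder project id

churn_data = gbq.read_gbq(train_query, project_id=projectid)
csv_data = pd.read_csv("bq_export.csv")  # the same query exported to CSV from the BQ web UI

# same number of rows in both outputs...
print(len(churn_data), len(csv_data))
# ...but only the read_gbq result contains duplicated rows
print(churn_data.duplicated().sum(), csv_data.duplicated().sum())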

@jreback
Contributor

jreback commented Jan 3, 2014

cc @jacobschaer

@jacobschaer
Contributor

@markdregan Can you possibly share the type of query? For instance, was it something like "SELECT *", or what kind of filters were you using? Also, what do you mean by duplicates?

It would also be useful if you could send your bq.py version. This can be retrieved from the command line using:
bq version

@markdregan
Author

This is BigQuery CLI v2.0.17

The query was similar to below:

SELECT variable_a, var_b, etc
FROM dataset.table_name
WHERE var_x IN ("String A", "String B")
  AND exception_flag = "5. No Flag"
  AND (renewal_date_str BETWEEN "2014-03-31" AND "2014-07-01"
       OR renewal_date_str BETWEEN "2013-01-01" AND "2013-12-31")

Running the exact same query in the BQ Web UI produces different output. By duplicates, I mean there are rows duplicated. In BQ UI, I see 1 row. In the Pandas dataframe, there are some duplicates. Quantifying this, there are ~100k rows that have duplicate key values. Total dataframe is ~404k.

Let me know if there are other tests I can do.

@markdregan markdregan reopened this Jan 3, 2014
@markdregan
Author

On the off chance this is a factor, I am unable to access any gbq functions when I import pandas as
import pandas as pd

I am only able to access read_gbq() when I import the following:
from pandas.io import gbq

Seemed unusual. Even the IPython tab completion couldn't find gbq in pandas.io
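
For anyone hitting the same thing, a quick illustration of the two import paths (the project id is a placeholder):

import pandas as pd
pd.io.gbq  # AttributeError - the gbq module is not pulled in by a plain "import pandas"

from pandas.io import gbq  # the explicit module import works
df = gbq.read_gbq('SELECT word FROM [publicdata:samples.shakespeare] LIMIT 10',
                  project_id='my-project-id')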

@jacobschaer
Contributor

How are you comparing the two? Do you export from the Web UI as CSV and then import into pandas using read_csv()? To clarify, you:

  1. Run your query in the web console: [bigquery.cloud.google.com]
  2. You get 404k rows as a result
  3. You run your query using read_gbq() and also get 404k rows as a result
  4. You somehow compare these 404k row result sets and identify that 100k rows from read_gbq() are duplicates. However, there are no duplicates in the results from the Web Console?

Or perhaps, when you say 'In BQ UI, I see 1 row', does this mean you are getting 403,999 more rows from pandas.io.gbq?

Please try using the command line tool to isolate the problem to our code or Google's code:
bq query --format=csv 'select {YOUR QUERY} from {YOUR DATASET}.{YOUR TABLE}'

As far as the imports go, I had thought we were exposed in the top level of pandas, but I normally just do:
from pandas.io import gbq

@markdregan
Author

Correct, I execute the query in the web console. I get 404k rows. I save the result as a table so I can query it later.

In pandas, I import using read_gbq and the same query as above. I get 404k rows.

I note that the following query in the web console produces no sum greater than 1:

SELECT
  domain_name,
  SUM(1)
FROM dataset.table
GROUP BY 1

I then note that the equivalent pandas function does return many results where sum > 1

churn_data.groupby(by='domain_name').size().order(ascending=False)

I then do some spot checks between pandas and BQ web console. I do this by filtering on specific domain_names. This showed pandas dataframe to have duplicates and the BQ web console to have only one row per domain. The latter is expected as it is a unique key for the data set.
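
For reference, a hedged pandas equivalent of that spot check, counting how many key values occur more than once (churn_data is the dataframe loaded above; 'example.com' is a placeholder domain):

counts = churn_data.groupby('domain_name').size()
print((counts > 1).sum(), 'domain_names appear more than once in the dataframe')

# spot-check a single duplicated key against the BQ web console
print(churn_data[churn_data['domain_name'] == 'example.com'])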

@jreback
Contributor

jreback commented Jan 4, 2014

read_gbq is not in the top-level namespace as it's experimental (though it probably should be, as we have other experimental modules there)... will add an issue to do this, see #5843

@markdregan
Author

I can confirm that running the query via the command line returns the same results as the web console, and that they conflict with the results loaded into pandas using read_gbq().

@markdregan
Author

I limited the query down to 3 fields. One of the fields seems to contain both INTEGERS and FLOATS (e.g. 78 vs 34.12). The BQ web console lists the field as a FLOAT. When I remove this field (only 2 fields remaining), no duplicates are generated in the pandas dataframe.

I then add the field back but I cast it using BQ SQL INTEGER() function. When I do so, pandas imports the data with no duplicates.

I haven't been able to exhaustively test this with other variables. I'm also aware of some fields containing Infinity values, so I wouldn't rule them out as possible culprits either.
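
For anyone else working around this, a hedged sketch of the cast described above (the field names are illustrative placeholders):

query = """
SELECT domain_name,
       renewal_date_str,
       INTEGER(avg_producta_num_30da_users) AS avg_producta_num_30da_users
FROM dataset.table_name
"""
churn_data = gbq.read_gbq(query, project_id=projectid)
# with the cast applied, the key column no longer shows duplicates
print(churn_data['domain_name'].duplicated().sum())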

@azbones

azbones commented Jan 4, 2014

@markdregan Interesting- that must be the problem. While you were doing that, I was trying to duplicate this with the public shakespeare dataset from BigQuery and haven't found any dup problems yet.

The CSV case was SELECT * FROM [publicdata:samples.shakespeare]: run the query in the Google BigQuery UI, save the result as a temp table, export it to Google Cloud Storage, then download and load into pandas via from_csv.

The gbq case was:

from pandas.io import gbq

query='SELECT * FROM [publicdata:samples.shakespeare];'
project_id=xxx
df_gbq=gbq.read_gbq(query,project_id)

The results were:

df_csv.describe()
Out[45]: 
          word_count    corpus_date
count  164656.000000  164656.000000
mean        5.744370    1551.364572
std        25.706592     275.347840
min         1.000000       0.000000
25%         1.000000    1595.000000
50%         1.000000    1599.000000
75%         3.000000    1606.000000
max       995.000000    1612.000000

[8 rows x 2 columns]

df_gbq.describe()
Out[46]: 
          word_count    corpus_date
count  164656.000000  164656.000000
mean        5.744370    1551.364572
std        25.706592     275.347840
min         1.000000       0.000000
25%         1.000000    1595.000000
50%         1.000000    1599.000000
75%         3.000000    1606.000000
max       995.000000    1612.000000

[8 rows x 2 columns]

df_csv['word'].nunique()
Out[47]: 32786

df_gbq['word'].nunique()
Out[48]: 32786
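
One more check that might be worth adding (a hedged suggestion, not something run above): counting fully duplicated rows, which would flag the paging problem even when describe() and nunique() agree:

print(df_csv.duplicated().sum(), df_gbq.duplicated().sum())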

@jacobschaer
Contributor

From your last comment, it makes me think there's a logic issue in our casting. I'm not looking at the code right now, but I do remember there was some discussion on how to handle INTEGERS/FLOATS. While BigQuery does support integer types, they can be NULL, which means that raw ints would have problems (an issue inherited from numpy, if I recall). So, we use an internal pandas method to downcast numeric columns, ultimately resulting in mixed types (which is not such a bad thing).
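
A small illustration of the constraint being described (my own sketch, not taken from the gbq code):

import pandas as pd

# numpy integer arrays cannot represent NULL/NaN, so a nullable INTEGER column
# gets upcast as soon as a NULL appears
print(pd.Series([1, 2, None]).dtype)  # float64
print(pd.Series([1, 2, 3]).dtype)     # int64 when there are no NULLs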

I'll take a peek at the code and see if anything pops out to me. One thing that would be very appreciated is if you can try to replicate this using BigQuery's public sample datasets. If you can give me an example using those public datasets (I'm sure one of them must have FLOATS and INTEGERS), we can make a Unit test out of this and ensure it doesn't happen again.

@azbones

azbones commented Jan 4, 2014

@jacobschaer When we talk to Google next, we need to include this in our discussions of test datasets. Ideally, we can not only solve the problem of testing writes, but also get a public dataset that covers some of these cases. Do you recall Felipe's github handle? Maybe we can copy him on this issue...

@jacobschaer
Contributor

@markdregan Can you post the dtypes of the dataframe from gbq, along with the column types reported by BigQuery UI?
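
(On the pandas side something like the following should capture it, assuming churn_data is the read_gbq result; the BigQuery column types can be read off the table schema in the web UI or via bq show dataset.table_name:)

print(churn_data.dtypes)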

@markdregan
Author

@jacobschaer Here is a comparison between BQ and Pandas dtypes when I import all fields in the dataset. All of the fields in BQ are also NULLABLE.

Field BQ dtype Pandas dtype
avg_br_norm_productc_num_30da_users FLOAT float64
avg_br_norm_productb_num_30da_users FLOAT float64
avg_br_norm_producta_num_30da_users FLOAT float64
avg_br_norm_num_30da_users FLOAT float64
avg_productc_num_30da_users FLOAT int64
avg_d_max_norm_productc_num_30da_users FLOAT float64
avg_d_max_norm_productb_num_30da_users FLOAT float64
avg_d_max_norm_producta_num_30da_users FLOAT float64
avg_d_max_norm_num_30da_users FLOAT float64
avg_d_range_norm_productc_num_30da_users FLOAT float64
avg_d_range_norm_productb_num_30da_users FLOAT float64
avg_d_range_norm_producta_num_30da_users FLOAT float64
avg_d_range_norm_num_30da_users FLOAT float64
avg_productb_num_30da_users FLOAT int64
avg_producta_num_30da_users FLOAT int64
avg_num_30da_users FLOAT int64
br_norm_productc_num_30da_users FLOAT float64
br_norm_productb_num_30da_users FLOAT float64
br_norm_producta_num_30da_users FLOAT float64
br_norm_num_30da_users FLOAT float64
business_unit STRING object
productc_num_30da_users INTEGER int64
country STRING object
customer_type STRING object
d_max_norm_productc_num_30da_users FLOAT float64
d_max_norm_productb_num_30da_users FLOAT float64
d_max_norm_producta_num_30da_users FLOAT float64
d_max_norm_num_30da_users FLOAT float64
d_range_norm_productc_num_30da_users FLOAT int64
d_range_norm_productb_num_30da_users FLOAT int64
d_range_norm_producta_num_30da_users FLOAT int64
d_range_norm_num_30da_users FLOAT int64
days_to_renewal_event INTEGER int64
productb_num_30da_users INTEGER int64
domain_class STRING object
domain_name STRING object
domain_usd_for_renewal FLOAT float64
exception_flag STRING object
producta_num_30da_users INTEGER int64
has_renewed BOOLEAN bool
max_num_allowed_users INTEGER int64
min_max_range_num_allowed_users INTEGER int64
min_num_allowed_users INTEGER int64
num_30da_users INTEGER int64
order_usd_for_renewal FLOAT float64
region STRING object
renewal_classification STRING object
renewal_date_str STRING object
renewal_grace_period_expired BOOLEAN bool
sales_channel_type STRING object
sales_segment STRING object
sales_unit STRING object
snapshot_date_str STRING object
stddev_br_norm_productc_num_30da_users FLOAT int64
stddev_br_norm_productb_num_30da_users FLOAT int64
stddev_br_norm_producta_num_30da_users FLOAT int64
stddev_br_norm_num_30da_users FLOAT int64
stddev_productc_num_30da_users FLOAT int64
stddev_d_max_norm_productc_num_30da_users FLOAT int64
stddev_d_max_norm_productb_num_30da_users FLOAT int64
stddev_d_max_norm_producta_num_30da_users FLOAT int64
stddev_d_max_norm_num_30da_users FLOAT int64
stddev_d_range_norm_productc_num_30da_users FLOAT int64
stddev_d_range_norm_productb_num_30da_users FLOAT int64
stddev_d_range_norm_producta_num_30da_users FLOAT int64
stddev_d_range_norm_num_30da_users FLOAT int64
stddev_productb_num_30da_users FLOAT int64
stddev_producta_num_30da_users FLOAT int64
stddev_num_30da_users FLOAT int64
term_in_days INTEGER int64
users_for_renewal FLOAT int64
zscore_br_norm_productc_num_30da_users FLOAT int64
zscore_br_norm_productb_num_30da_users FLOAT int64
zscore_br_norm_producta_num_30da_users FLOAT int64
zscore_br_norm_num_30da_users FLOAT int64
zscore_productc_num_30da_users FLOAT int64
zscore_d_max_norm_productc_num_30da_users FLOAT int64
zscore_d_max_norm_productb_num_30da_users FLOAT int64
zscore_d_max_norm_producta_num_30da_users FLOAT int64
zscore_d_max_norm_num_30da_users FLOAT int64
zscore_d_range_norm_productc_num_30da_users FLOAT int64
zscore_d_range_norm_productb_num_30da_users FLOAT int64
zscore_d_range_norm_producta_num_30da_users FLOAT int64
zscore_d_range_norm_num_30da_users FLOAT int64
zscore_productb_num_30da_users FLOAT int64
zscore_producta_num_30da_users FLOAT int64
zscore_num_30da_users FLOAT int64

It seems strange that the STRING fields from BQ appeared as object dtypes in pandas. When I import just domain_name and renewal_date_str (recall that importing only these did not cause duplication in the pandas dataframe), the dtypes are still "object" for both.

@jreback
Contributor

jreback commented Jan 4, 2014

@markdregan string dtypes are always object in pandas
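
For example:

import pandas as pd
print(pd.Series(['a', 'b']).dtype)  # object - pandas stores strings as Python objects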

@jacobschaer
Contributor

I did some spot checking, and I was able to replicate the issue with a similarly sized dataset. This doesn't seem to occur for smaller datasets, so I'm guessing that, once again, we are having result paging issues. I'm going to keep looking at it. The gap between duplicates is at least 100k rows, which is why it wasn't an issue with our previous test suite.

@jreback
Contributor

jreback commented Jan 6, 2014

@jacobschaer

As a future enhancement, you might want to offer an iterator/chunksize option for read_gbq when returning results (that may or may not involve different page sizes for results from gbq).

See io.pytables for an example.
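
For illustration, the analogous pattern that already exists for read_csv (a hedged sketch - read_gbq has no such option today, and process() is a placeholder):

import pandas as pd

# iterate over fixed-size chunks instead of materialising the whole result at once
for chunk in pd.read_csv('large_file.csv', chunksize=50000):
    process(chunk)  # placeholder per-chunk work

# a chunked read_gbq could look similar, e.g.
# for chunk in gbq.read_gbq(query, project_id=projectid, chunksize=50000): ...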

@jacobschaer
Contributor

@jreback : That is definitely a feature on the backlog. We had spoken with Google about some API thoughts, so we were waiting on that.

@fhoffa : I'm having trouble seeing what has gone wrong with our codebase. I looked through bq.py and nothing has changed. We are basically just doing what BigqueryClient.ReadSchemaAndRows() does, except we manipulate the data as we go through the pages. However, the bq command line client still works perfectly. The spacing between duplicates is fairly consistent, and it looks like we might be getting the same page repeatedly.

@azbones

azbones commented Jan 7, 2014

I tried another comparison between a CSV downloaded from the UI (loaded with from_csv) and read_gbq. This time I used a table with more columns from the public natality_testing dataset, to which we added a column of unique row numbers (row_number) and from which we selected 500K rows.

Then I plotted the result using a histogram with 250 bins:

# assumes a pylab-style session (e.g. ipython --pylab); gbq_df is the read_gbq
# result and csv_df is the CSV download loaded with from_csv
from matplotlib.pyplot import *

uniques_df = len(gbq_df['row_number'].drop_duplicates())
uniques_csv = len(csv_df['row_number'].drop_duplicates())

subplots_adjust(hspace=.5)
subplot(2, 1, 1)  # subplots are 1-indexed
title('Histogram of Unique Row Value Count From GBQ- {:,} Unique Values'.format(uniques_df))
hist(gbq_df['row_number'], bins=250)
xticks(range(0, 500000, 10000), rotation=90)
xlabel('Unique Row Number')
ylabel('Count')

subplot(2, 1, 2)
title('Histogram of Unique Row Value Count From CSV Download to DF- {:,} Unique Values'.format(uniques_csv))
hist(csv_df['row_number'], bins=250)
xticks(range(0, 500000, 10000), rotation=90)
xlabel('Unique Row Number')
ylabel('Count')

Given 500,000 unique values and 250 bins, each bin should have a count of 2,000 if the distribution is uniform. Looking at the plots below, the read_gbq dataframe shows that some ranges have the correct number, some have none, and some have multiples. The from_csv dataframe correctly has 500,000 unique row numbers, while the read_gbq dataframe has only 336,382, which is incorrect.

So, it seems like the paging is somehow off with these larger datasets. Apparently, the total size of the data returned (rows x columns) is leading to this problem as I didn't see it with similar numbers of rows with fewer columns.

[figure: df_test_hist_v_csv - histograms of row_number counts for the read_gbq and from_csv dataframes]
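
A hedged way to quantify the same thing without plotting (assuming row_number runs from 1 to 500,000):

counts = gbq_df['row_number'].value_counts()
print((counts > 1).sum(), 'row numbers returned more than once')
missing = set(range(1, 500001)) - set(gbq_df['row_number'])
print(len(missing), 'row numbers never returned at all')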

@azbones

azbones commented Jan 7, 2014

Just to confirm, there are whole ranges that have no results at all in the read_gbq dataframe; for example, row_number values between 40,000 and 80,000:

gbq_df[(gbq_df['row_number']>40000)&(gbq_df['row_number']<80000)]
Out[47]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 0 entries
Data columns (total 32 columns):
source_year               0  non-null values
year                      0  non-null values
month                     0  non-null values
day                       0  non-null values
wday                      0  non-null values
state                     0  non-null values
is_male                   0  non-null values
child_race                0  non-null values
weight_pounds             0  non-null values
plurality                 0  non-null values
apgar_1min                0  non-null values
apgar_5min                0  non-null values
mother_residence_state    0  non-null values
mother_race               0  non-null values
mother_age                0  non-null values
gestation_weeks           0  non-null values
lmp                       0  non-null values
mother_married            0  non-null values
mother_birth_state        0  non-null values
cigarette_use             0  non-null values
cigarettes_per_day        0  non-null values
alcohol_use               0  non-null values
drinks_per_week           0  non-null values
weight_gain_pounds        0  non-null values
born_alive_alive          0  non-null values
born_alive_dead           0  non-null values
born_dead                 0  non-null values
ever_born                 0  non-null values
father_race               0  non-null values
father_age                0  non-null values
record_weight             0  non-null values
row_number                0  non-null values
dtypes: bool(2), float64(12), int64(12), object(6)

@jacobschaer
Contributor

We created a StackOverflow question for this:
http://stackoverflow.com/questions/20984592/bigquery-results-not-including-page-token

@azbones

azbones commented Jan 8, 2014

@jacobschaer had me run the native Google BQ client to see if it is having the same issues with page tokens. It appears to have the same problem, but the results are different - there are many more duplicates, but no gaps. That is most likely due to differences in our error handling when the Google API returns JSON with no token.

The first subplot is downloading from the Google Cloud Storage UI, the second is the Google BQ client to csv, and the third is our Pandas gbq method.

[figure: gbq_bquicsv_bq_csv - histograms for the Cloud Storage CSV download, the bq CLI CSV, and the pandas gbq dataframe]

@jacobschaer
Contributor

@markdregan : Sorry this hasn't been updated in a while. As @azbones said, we tested and confirmed that the bq client was having the same issue, and most of the heavy lifting for our module is handled by their code. Google knows about it, and from what we heard today they are still looking into it.

@markdregan
Author

Thanks @jacobschaer. Let me know if I can connect you with folks on the BQ side. I work here in Google.

@jacobschaer
Contributor

@markdregan - We've been in touch with Google for a while - in fact, we'll be dropping by the Mountain View campus next week for a visit! :-) It looks like this bug should be resolved soon per: http://stackoverflow.com/questions/20984592/bigquery-results-not-including-page-token/21009144?noredirect=1#comment32090677_21009144

@jreback
Contributor

jreback commented Jan 26, 2014

@jacobschaer

how's this coming along?

@jacobschaer
Contributor

Surprisingly, no changes have been pushed to the client code. Shall I put this in place of the previous bug documentation in the docstrings? Per: #6096 (comment)

@jreback
Contributor

jreback commented Jan 27, 2014

@jacobschaer certainly - if there is a change you want to make in the docs (for this bug) or other minor changes, go ahead and do a PR.

Going to finish up 0.13.1 shortly - next few days. Did you want the unicode fixes to go in? (Is there a PR for this, or is it still on your branch?)

@jreback
Contributor

jreback commented Jan 29, 2014

@jacobschaer PR on this, or shall we move to 0.14?

@jacobschaer
Contributor

Move to 0.14 - we might not have anything to do in the client code, but it will still need validation. No release deadline was given, but we chatted with them and it's "very soon".

@jreback
Contributor

jreback commented Jan 29, 2014

famous last words!

ok then

@jacobschaer
Contributor

@jreback @azbones @sean-schaefer @markdregan
Famous last words indeed. They said that it's fixed but not yet committed to their public repository. In the meantime, we are considering a rewrite for gbq v2 which will no longer be dependent on bq.py.

@azbones

azbones commented Feb 21, 2014

@jreback @markdregan Google informed me they pushed the fix into production, and I used my same test with 500K rows returned to validate that the API is working correctly with our existing code. @markdregan, if you could also confirm this with your dataset, that would be great!
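
For anyone repeating the validation, a hedged sketch of this kind of check (dataset and table names are placeholders for the 500K-row test table with its unique row_number column):

df = gbq.read_gbq('SELECT row_number FROM test_dataset.natality_500k', project_id=projectid)
assert len(df) == 500000
assert df['row_number'].nunique() == 500000  # no duplicates and no gaps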

As we understand it, this particular bug was an issue with how their API returned page tokens in larger returned datasets (100k+ rows).

Just FYI, as @jacobschaer mentioned, we are working to refactor our current code to reduce some of the dependencies on bq.py for a future release by using their API v2, which seems to be much improved (in documentation and function) from when we were working on this in the summer.

@jreback
Contributor

jreback commented Apr 9, 2014

@jacobschaer still open?
