
DOC: Prepare for 0.5.0 release #188

Merged: 1 commit (Jun 15, 2018)

docs/source/changelog.rst: 16 changes (12 additions & 4 deletions)
@@ -1,23 +1,31 @@
Changelog
=========

-0.5.0 / TBD
------------
+0.5.0 / 2018-06-15
+------------------

- Project ID parameter is optional in ``read_gbq`` and ``to_gbq`` when it can
be inferred from the environment. Note: you must still pass in a project ID when
using user-based authentication. (:issue:`103`)
-- Add location parameter to ``read_gbq`` and ``to_gbq`` so that pandas-gbq
-can work with datasets in the Tokyo region. (:issue:`177`)
- Progress bar added for ``to_gbq``, through an optional library `tqdm` as a
dependency. (:issue:`162`)
+- Add location parameter to ``read_gbq`` and ``to_gbq`` so that pandas-gbq
+can work with datasets in the Tokyo region. (:issue:`177`)
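
Taken together, the entries above might be exercised as in the following
sketch (the dataset, table, and region values are placeholders;
``asia-northeast1`` is the Tokyo region, and the project ID is assumed to be
inferable from the environment):

.. code-block:: python

    import pandas_gbq

    # No project_id argument: it is inferred from the environment
    # (e.g. application default credentials).
    df = pandas_gbq.read_gbq(
        'SELECT * FROM my_dataset.my_table',
        location='asia-northeast1')

    # The optional tqdm-based progress bar is shown during the upload.
    pandas_gbq.to_gbq(df, 'my_dataset.my_table_copy',
                      location='asia-northeast1', progress_bar=True)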

Documentation
~~~~~~~~~~~~~

- Add :doc:`authentication how-to guide <howto/authentication>`. (:issue:`183`)
- Update :doc:`contributing` guide with new paths to tests. (:issue:`154`,
:issue:`164`)

Internal changes
~~~~~~~~~~~~~~~~

- Tests now use `nox` to run in multiple Python environments. (:issue:`52`)
- Renamed internal modules. (:issue:`154`)
- Refactored auth to an internal auth module. (:issue:`176`)
- Add unit tests for ``get_credentials()``. (:issue:`184`)

0.4.1 / 2018-04-05
------------------

docs/source/reading.rst: 14 changes (8 additions & 6 deletions)
@@ -3,8 +3,9 @@
Reading Tables
==============

-Suppose you want to load all data from an existing BigQuery table : `test_dataset.test_table`
-into a DataFrame using the :func:`~read_gbq` function.
+Suppose you want to load all data from an existing BigQuery table
+``test_dataset.test_table`` into a DataFrame using the
+:func:`~pandas_gbq.read_gbq` function.

.. code-block:: python

@@ -25,9 +26,9 @@ destination DataFrame as well as a preferred column order as follows:
col_order=['col1', 'col2', 'col3'], projectid)
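
A sketch of the full call this fragment belongs to (the query, column names,
and project ID are placeholders):

.. code-block:: python

    from pandas_gbq import read_gbq

    projectid = 'my-project-id'  # placeholder project ID

    data_frame = read_gbq(
        'SELECT * FROM test_dataset.test_table',
        project_id=projectid,
        index_col='index_column_name',
        col_order=['col1', 'col2', 'col3'])

Note that the project is passed as ``project_id=`` here, since a positional
argument cannot follow keyword arguments in Python.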


-You can specify the query config as parameter to use additional options of your job.
-For more information about query configuration parameters see
-`here <https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query>`__.
+You can specify the query config as a parameter to use additional options for
+your job. For more information about query configuration parameters see `here
+<https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query>`__.

.. code-block:: python

@@ -42,7 +43,8 @@ For more information about query configuration parameters see
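
A sketch of what such a configuration might look like, using the
``useQueryCache`` field of the ``configuration.query`` resource documented at
the link above (the query and project ID are placeholders):

.. code-block:: python

    from pandas_gbq import read_gbq

    # Disable BigQuery's query cache for this job.
    configuration = {
        'query': {
            'useQueryCache': False,
        }
    }

    data_frame = read_gbq('SELECT * FROM test_dataset.test_table',
                          project_id='my-project-id',
                          configuration=configuration)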

.. note::

-You can find your project id in the `Google developers console <https://console.developers.google.com>`__.
+You can find your project id in the `Google developers console
+<https://console.developers.google.com>`__.


.. note::

docs/source/writing.rst: 34 changes (13 additions & 21 deletions)
@@ -3,7 +3,8 @@
Writing DataFrames
==================

-Assume we want to write a DataFrame ``df`` into a BigQuery table using :func:`~to_gbq`.
+Assume we want to write a DataFrame ``df`` into a BigQuery table using
+:func:`~pandas_gbq.to_gbq`.

.. ipython:: python
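
    # A minimal sketch of the setup this section assumes; the table name and
    # project ID are placeholders, and any small DataFrame with named columns
    # can stand in for ``df``.
    import pandas as pd
    import pandas_gbq

    df = pd.DataFrame({'my_string': list('abc'), 'my_int64': [1, 2, 3]})
    pandas_gbq.to_gbq(df, 'my_dataset.my_table', project_id='my-project-id')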

@@ -38,21 +39,10 @@ a ``TableCreationError`` if the destination table already exists.

.. note::

-If the ``if_exists`` argument is set to ``'append'``, the destination dataframe will
-be written to the table using the defined table schema and column types. The
-dataframe must contain fields (matching name and type) currently in the destination table.
-If the ``if_exists`` argument is set to ``'replace'``, and the existing table has a
-different schema, a delay of 2 minutes will be forced to ensure that the new schema
-has propagated in the Google environment. See
-`Google BigQuery issue 191 <https://code.google.com/p/google-bigquery/issues/detail?id=191>`__.
-
-Writing large DataFrames can result in errors due to size limitations being exceeded.
-This can be avoided by setting the ``chunksize`` argument when calling :func:`~to_gbq`.
-For example, the following writes ``df`` to a BigQuery table in batches of 10000 rows at a time:
-
-.. code-block:: python
-
-    to_gbq(df, 'my_dataset.my_table', projectid, chunksize=10000)
+If the ``if_exists`` argument is set to ``'append'``, the destination
+dataframe will be written to the table using the defined table schema and
+column types. The dataframe must contain fields (matching name and type)
+currently in the destination table.
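
For example, appending ``df`` to an existing table could look like the
following sketch (the table name and project ID are placeholders):

.. code-block:: python

    from pandas_gbq import to_gbq

    # When appending, the DataFrame's columns must match the destination
    # table's schema (names and types).
    to_gbq(df, 'my_dataset.my_table', project_id='my-project-id',
           if_exists='append')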

.. note::

@@ -66,8 +56,10 @@ For example, the following writes ``df`` to a BigQuery table in batches of 10000

.. note::

-While BigQuery uses SQL-like syntax, it has some important differences from traditional
-databases both in functionality, API limitations (size and quantity of queries or uploads),
-and how Google charges for use of the service. You should refer to `Google BigQuery documentation <https://cloud.google.com/bigquery/what-is-bigquery>`__
-often as the service seems to be changing and evolving. BiqQuery is best for analyzing large
-sets of data quickly, but it is not a direct replacement for a transactional database.
+While BigQuery uses SQL-like syntax, it has some important differences
+from traditional databases both in functionality, API limitations (size
+and quantity of queries or uploads), and how Google charges for use of the
+service. You should refer to `Google BigQuery documentation
+<https://cloud.google.com/bigquery/docs>`__ often as the service is always
+evolving. BigQuery is best for analyzing large sets of data quickly, but
+it is not a direct replacement for a transactional database.

pandas_gbq/gbq.py: 14 changes (8 additions & 6 deletions)
@@ -524,9 +524,10 @@ def read_gbq(query, project_id=None, index_col=None, col_order=None,
<https://cloud.google.com/bigquery/sql-reference/>`__
location : str (optional)
Location where the query job should run. See the `BigQuery locations
-<https://cloud.google.com/bigquery/docs/dataset-locations>
-documentation`__ for a list of available locations. The location must
-match that of any datasets used in the query.
+documentation
+<https://cloud.google.com/bigquery/docs/dataset-locations>`__ for a
+list of available locations. The location must match that of any
+datasets used in the query.
.. versionadded:: 0.5.0
configuration : dict (optional)
Query config parameters for job processing.
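
# A usage sketch for the ``location`` and ``configuration`` arguments
# described above. The query, project ID, and region value are placeholders;
# 'asia-northeast1' is the Tokyo region mentioned in the changelog.
import pandas_gbq

df = pandas_gbq.read_gbq(
    'SELECT * FROM my_dataset.my_table',
    project_id='my-project-id',
    location='asia-northeast1',
    configuration={'query': {'useQueryCache': False}})
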
@@ -659,9 +660,10 @@ def to_gbq(dataframe, destination_table, project_id=None, chunksize=None,
.. versionadded:: 0.3.1
location : str (optional)
Location where the load job should run. See the `BigQuery locations
-<https://cloud.google.com/bigquery/docs/dataset-locations>
-documentation`__ for a list of available locations. The location must
-match that of the target dataset.
+documentation
+<https://cloud.google.com/bigquery/docs/dataset-locations>`__ for a
+list of available locations. The location must match that of the
+target dataset.
.. versionadded:: 0.5.0
progress_bar : boolean, True by default. It uses the library `tqdm` to show
the progress bar for the upload, chunk by chunk.
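
# A usage sketch for the ``location`` and ``progress_bar`` arguments described
# above. The table name, project ID, and region value are placeholders.
import pandas as pd
import pandas_gbq

df = pd.DataFrame({'col1': [1, 2, 3]})
pandas_gbq.to_gbq(
    df, 'my_dataset.my_table', project_id='my-project-id',
    location='asia-northeast1',  # must match the target dataset's location
    progress_bar=False)          # disable the per-chunk tqdm progress bar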