
Commit 9dd675b

benoitpointet authored and jreback committed
DOC GH3508 (bis) added basic documentation of google analytics in remote_data
1 parent 9e45acd commit 9dd675b

2 files changed: +63 -7 lines changed

doc/source/remote_data.rst

+62 -7
@@ -27,14 +27,14 @@ Remote Data Access
 
 .. _remote_data.data_reader:
 
-Functions from :mod:`pandas.io.data` extract data from various Internet
-sources into a DataFrame. Currently the following sources are supported:
+Functions from :mod:`pandas.io.data` and :mod:`pandas.io.ga` extract data from various Internet sources into a DataFrame. Currently the following sources are supported:
 
-- Yahoo! Finance
-- Google Finance
-- St. Louis FED (FRED)
-- Kenneth French's data library
-- World Bank
+- :ref:`Yahoo! Finance<remote_data.yahoo>`
+- :ref:`Google Finance<remote_data.google>`
+- :ref:`St. Louis FED (FRED)<remote_data.fred>`
+- :ref:`Kenneth French's data library<remote_data.ff>`
+- :ref:`World Bank<remote_data.wb>`
+- :ref:`Google Analytics<remote_data.ga>`
 
 It should be noted that various sources support different kinds of data, so not all sources implement the same methods and the data elements returned might also differ.

@@ -330,7 +330,62 @@ indicators, or a single "bad" (#4 above) country code).
 
 See docstrings for more info.
 
+.. _remote_data.ga:
 
+Google Analytics
+----------------
 
+The :mod:`~pandas.io.ga` module provides a wrapper for the
+`Google Analytics API <https://developers.google.com/analytics/devguides>`__
+to simplify retrieving traffic data.
+Result sets are parsed into a pandas DataFrame with a shape and data types
+derived from the source table.
 
+Configuring Access to Google Analytics
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The first thing you need to do is to set up access to the Google Analytics API. Follow the steps below:
+
+#. In the `Google Developers Console <https://console.developers.google.com>`__
+   #. enable the Analytics API
+   #. create a new project
+   #. create a new Client ID for an "Installed Application" (in the "APIs & auth / Credentials" section of the newly created project)
+   #. download it (JSON file)
+#. On your machine
+   #. rename it to ``client_secrets.json``
+   #. move it to the ``pandas/io`` module directory
+
+The first time you use the :func:`read_ga` function, a browser window will open and ask you to authenticate with the Google API. Proceed with the authentication.
+
+Using the Google Analytics API
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following will fetch users and pageviews (metrics) data per day of the week, between 2014-01-01 and 2014-08-01, from a particular property.
+
+.. code-block:: python
+
+    import pandas.io.ga as ga
+    ga.read_ga(
+        account_id="2360420",
+        profile_id="19462946",
+        property_id="UA-2360420-5",
+        metrics=['users', 'pageviews'],
+        dimensions=['dayOfWeek'],
+        start_date="2014-01-01",
+        end_date="2014-08-01",
+        index_col=0,
+        filters="pagePath=~aboutus;ga:country==France",
+    )
+
+The only mandatory arguments are ``metrics``, ``dimensions`` and ``start_date``. We strongly recommend always specifying the ``account_id``, ``profile_id`` and ``property_id`` to avoid accessing the wrong data bucket in Google Analytics.
+
+The ``index_col`` argument specifies which dimension(s) to use as the index.
+
+The ``filters`` argument specifies the filtering to apply to the query. In the above example, the page URL has to contain ``aboutus`` AND the visitor's country has to be France.
+
+For more detailed information, see the following:
+
+* `pandas & google analytics, by yhat <http://blog.yhathq.com/posts/pandas-google-analytics.html>`__
+* `Google Analytics integration in pandas, by Chang She <http://quantabee.wordpress.com/2012/12/17/google-analytics-pandas/>`__
+* `Google Analytics Dimensions and Metrics Reference <https://developers.google.com/analytics/devguides/reporting/core/dimsmets>`__
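The ``filters`` string in the added documentation joins two conditions with ``;``, which the Google Analytics Core Reporting API treats as a logical AND (a comma means OR). The sketch below is not part of this commit or of pandas: ``combine_filters`` is a hypothetical helper, shown only to illustrate how such a string could be assembled before it is passed as the ``filters`` argument.

```python
# Hypothetical helper, not part of pandas or of this commit.
# In Google Analytics filter expressions, ';' means AND and ',' means OR.
def combine_filters(conditions, operator="and"):
    """Join individual filter expressions into one filters string."""
    separators = {"and": ";", "or": ","}
    if operator not in separators:
        raise ValueError("operator must be 'and' or 'or'")
    return separators[operator].join(conditions)


# Rebuilds the same filter string used in the read_ga example.
filters = combine_filters(["pagePath=~aboutus", "ga:country==France"])
print(filters)  # pagePath=~aboutus;ga:country==France
```

Keeping the individual conditions in a list makes it easier to add or drop a condition without hand-editing the separator characters.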

doc/source/whatsnew/v0.15.2.txt

+1
@@ -44,6 +44,7 @@ Enhancements
 - Added ability to export Categorical data to Stata (:issue:`8633`).
 - Added ability to export Categorical data to/from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
 - Added support for ``utcfromtimestamp()``, ``fromtimestamp()``, and ``combine()`` on `Timestamp` class (:issue:`5351`).
+- Added Google Analytics (`pandas.io.ga`) basic documentation (:issue:`8835`). See :ref:`here<remote_data.ga>`.
 
 .. _whatsnew_0152.performance:
