Commit e600131

Jonathan Chambers authored and jreback committed
ENH #4163 Added tests and documentation
Initial draft of doc updates
minor doc updates
Added tests and reduced code repetition. Updated Docs.
Added test coverage for legacy names
Documentation updates, more tests
Added deprecation warnings for legacy names.
Updated docs and test doc build
ENH #4163 - finalized tests and docs, ready for wider use…
TST added sqlalchemy to TravisCI build dep for py 2.7 and 3.3
TST Import sqlalchemy on Travis.
DOC add docstrings to read sql
ENH read_sql connects via Connection, Engine, file path, or :memory: string
CLN Separate legacy code into new file, and fallback so that all old tests pass.
ENH #4163 added version added comment
ENH #4163 added deprecation warning for tquery and uquery
ENH #4163 Documentation and tests
1 parent 74d091f commit e600131
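
The commit message mentions deprecation warnings for legacy names such as ``tquery`` and ``uquery``. As a rough illustration of that general pattern (this is not the code from the commit), a legacy alias can forward to its replacement while emitting a warning. The function names ``read_rows`` and ``read_rows_legacy`` are hypothetical, and the use of ``FutureWarning`` is an assumption.

```python
import warnings


def read_rows(rows):
    """Hypothetical new-style function: return the rows as a list."""
    return list(rows)


def read_rows_legacy(rows):
    """Hypothetical legacy alias: warn, then delegate to the new function."""
    warnings.warn(
        "read_rows_legacy is deprecated, use read_rows instead",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return read_rows(rows)


# Record warnings so the behaviour can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_rows_legacy([(26, "X"), (42, "Y")])
```

Keeping the old name as a thin wrapper lets existing callers keep working while the warning points them at the replacement.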

6 files changed: +398 -1209 lines changed

ci/requirements-2.6.txt

+1

@@ -6,3 +6,4 @@ http://www.crummy.com/software/BeautifulSoup/bs4/download/4.2/beautifulsoup4-4.2
 html5lib==1.0b2
 bigquery==2.0.17
 numexpr==1.4.2
+sqlalchemy==0.8.1

doc/source/io.rst

+112 -51

@@ -1823,7 +1823,7 @@ class. The following two command are equivalent:
    read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
 
 The class based approach can be used to read multiple sheets or to introspect
-the sheet names using the ``sheet_names`` attribute.
+the sheet names using the ``sheet_names`` attribute.
 
 .. note::
 
@@ -3068,13 +3068,48 @@ SQL Queries
 -----------
 
 The :mod:`pandas.io.sql` module provides a collection of query wrappers to both
-facilitate data retrieval and to reduce dependency on DB-specific API. These
-wrappers only support the Python database adapters which respect the `Python
-DB-API <http://www.python.org/dev/peps/pep-0249/>`__. See some
-:ref:`cookbook examples <cookbook.sql>` for some advanced strategies
+facilitate data retrieval and to reduce dependency on DB-specific API. Database abstraction
+is provided by SQLAlchemy if installed, in addition you will need a driver library for
+your database.
 
-For example, suppose you want to query some data with different types from a
-table such as:
+.. versionadded:: 0.14.0
+
+
+If SQLAlchemy is not installed a legacy fallback is provided for sqlite and mysql.
+These legacy modes require Python database adapters which respect the `Python
+DB-API <http://www.python.org/dev/peps/pep-0249/>`__.
+
+See also some :ref:`cookbook examples <cookbook.sql>` for some advanced strategies.
+
+The key functions are:
+:func:`~pandas.io.sql.to_sql`
+:func:`~pandas.io.sql.read_sql`
+:func:`~pandas.io.sql.read_table`
+
+
+
+
+In the following example, we use the `SQlite <http://www.sqlite.org/>`__ SQL database
+engine. You can use a temporary SQLite database where data are stored in
+"memory".
+
+To connect with SQLAlchemy you use the :func:`create_engine` function to create an engine
+object from database URI. You only need to create the engine once per database you are
+connecting to.
+
+For more information on :func:`create_engine` and the URI formatting, see the examples
+below and the SQLAlchemy `documentation <http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html>`__
+
+.. code-block:: python
+
+   from sqlalchemy import create_engine
+   from pandas.io import sql
+   # Create your connection.
+   engine = create_engine('sqlite:///:memory:')
+
+
+Assuming the following data is in a DataFrame "data", we can insert it into
+the database using :func:`~pandas.io.sql.to_sql`.
 
 
 +-----+------------+-------+-------+-------+
@@ -3088,81 +3123,107 @@ table such as:
 +-----+------------+-------+-------+-------+
 
 
-Functions from :mod:`pandas.io.sql` can extract some data into a DataFrame. In
-the following example, we use the `SQlite <http://www.sqlite.org/>`__ SQL database
-engine. You can use a temporary SQLite database where data are stored in
-"memory". Just do:
-
-.. code-block:: python
-
-   import sqlite3
-   from pandas.io import sql
-   # Create your connection.
-   cnx = sqlite3.connect(':memory:')
-
 .. ipython:: python
    :suppress:
 
-   import sqlite3
+   from sqlalchemy import create_engine
    from pandas.io import sql
-   cnx = sqlite3.connect(':memory:')
+   engine = create_engine('sqlite:///:memory:')
 
 .. ipython:: python
    :suppress:
 
-   cu = cnx.cursor()
-   # Create a table named 'data'.
-   cu.execute("""CREATE TABLE data(id integer,
-                                   date date,
-                                   Col_1 string,
-                                   Col_2 float,
-                                   Col_3 bool);""")
-   cu.executemany('INSERT INTO data VALUES (?,?,?,?,?)',
-                  [(26, datetime.datetime(2010,10,18), 'X', 27.5, True),
-                   (42, datetime.datetime(2010,10,19), 'Y', -12.5, False),
-                   (63, datetime.datetime(2010,10,20), 'Z', 5.73, True)])
+   c = ['id', 'Date', 'Col_1', 'Col_2', 'Col_3']
+   d = [(26, datetime.datetime(2010,10,18), 'X', 27.5, True),
+        (42, datetime.datetime(2010,10,19), 'Y', -12.5, False),
+        (63, datetime.datetime(2010,10,20), 'Z', 5.73, True)]
+
+   data = DataFrame(d, columns=c)
+
+.. ipython:: python
 
+   sql.to_sql(data, 'data', engine)
 
-Let ``data`` be the name of your SQL table. With a query and your database
-connection, just use the :func:`~pandas.io.sql.read_sql` function to get the
-query results into a DataFrame:
+You can read from the database simply by
+specifying a table name using the :func:`~pandas.io.sql.read_table` function.
 
 .. ipython:: python
 
-   sql.read_sql("SELECT * FROM data;", cnx)
+   sql.read_table('data', engine)
 
 You can also specify the name of the column as the DataFrame index:
 
 .. ipython:: python
 
-   sql.read_sql("SELECT * FROM data;", cnx, index_col='id')
-   sql.read_sql("SELECT * FROM data;", cnx, index_col='date')
+   sql.read_table('data', engine, index_col='id')
 
-Of course, you can specify a more "complex" query.
+You can also query using raw SQL in the :func:`~pandas.io.sql.read_sql` function.
 
 .. ipython:: python
 
-   sql.read_sql("SELECT id, Col_1, Col_2 FROM data WHERE id = 42;", cnx)
+   sql.read_sql('SELECT * FROM data', engine)
+
+Of course, you can specify a more "complex" query.
 
 .. ipython:: python
-   :suppress:
 
-   cu.close()
-   cnx.close()
+   sql.read_frame("SELECT id, Col_1, Col_2 FROM data WHERE id = 42;", engine)
 
 
 There are a few other available functions:
 
-- ``tquery`` returns a list of tuples corresponding to each row.
-- ``uquery`` does the same thing as tquery, but instead of returning results
-  it returns the number of related rows.
-- ``write_frame`` writes records stored in a DataFrame into the SQL table.
-- ``has_table`` checks if a given SQLite table exists.
+:func:`~pandas.io.sql.has_table` checks if a given table exists.
 
-.. note::
+:func:`~pandas.io.sql.tquery` returns a list of tuples corresponding to each row.
+
+:func:`~pandas.io.sql.uquery` does the same thing as tquery, but instead of
+returning results it returns the number of related rows.
+
+In addition, the class :class:`~pandas.io.sql.PandasSQLWithEngine` can be
+instantiated directly for more manual control over the SQL interaction.
+
+Engine connection examples
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: python
+
+   from sqlalchemy import create_engine
+
+   engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')
+
+   engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')
+
+   engine = create_engine('oracle://scott:tiger@127.0.0.1:1521/sidname')
+
+   engine = create_engine('mssql+pyodbc://mydsn')
+
+   # sqlite://<nohostname>/<path>
+   # where <path> is relative:
+   engine = create_engine('sqlite:///foo.db')
+
+   # or absolute, starting with a slash:
+   engine = create_engine('sqlite:////absolute/path/to/foo.db')
+
+
+Legacy
+~~~~~~
+To use the sqlite support without SQLAlchemy, you can create connections like so:
+
+.. code-block:: python
+
+   import sqlite3
+   from pandas.io import sql
+   cnx = sqlite3.connect(':memory:')
+
+And then issue the following queries, remembering to also specify the flavor of SQL
+you are using.
+
+.. code-block:: python
+
+   sql.to_sql(data, 'data', cnx, flavor='sqlite')
+
+   sql.read_sql("SELECT * FROM data", cnx, flavor='sqlite')
 
-   For now, writing your DataFrame into a database works only with
-   **SQLite**. Moreover, the **index** will currently be **dropped**.
 
 .. _io.bigquery:

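
The legacy path described in the diff relies on DB-API (PEP 249) connections such as those from ``sqlite3``. As a minimal sketch of that underlying layer, using only the standard library and no pandas or SQLAlchemy, the sample rows from the removed example can be written and queried as follows. This is an illustration and not code from the commit: the column types in ``CREATE TABLE`` and the storage of dates as ISO strings are assumptions.

```python
import datetime
import sqlite3

# Temporary in-memory database, as in the legacy example in the diff.
cnx = sqlite3.connect(":memory:")
cur = cnx.cursor()

# Assumed schema: SQLite has no native date or bool type, so dates are
# stored as ISO-8601 text and booleans as integers.
cur.execute(
    "CREATE TABLE data (id INTEGER, Date TEXT, Col_1 TEXT, Col_2 REAL, Col_3 INTEGER)"
)

rows = [
    (26, datetime.date(2010, 10, 18).isoformat(), "X", 27.5, True),
    (42, datetime.date(2010, 10, 19).isoformat(), "Y", -12.5, False),
    (63, datetime.date(2010, 10, 20).isoformat(), "Z", 5.73, True),
]
# Parameter placeholders keep values out of the SQL string itself.
cur.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", rows)
cnx.commit()

# The "more complex" query from the docs, with the id passed as a parameter.
cur.execute("SELECT id, Col_1, Col_2 FROM data WHERE id = ?", (42,))
columns = [col[0] for col in cur.description]  # column names of the result
result = cur.fetchall()                        # list of row tuples

cur.close()
cnx.close()
```

The list of tuples in ``result`` together with ``columns`` is the raw material that a wrapper such as ``read_sql`` would turn into a DataFrame.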