pandas-dev · jreback · Aug 14, 2015 · Mar 23, 2015 · Aug 14, 2015 · jreback
diff --git a/doc/source/api.rst b/doc/source/api.rst
@@ -82,6 +82,15 @@ HDFStore: PyTables (HDF5)
    HDFStore.get
    HDFStore.select
 
+SAS
+~~~
+
+.. autosummary::
+   :toctree: generated/
+
+   read_sas
+   XportReader
+
 SQL
 ~~~
 

diff --git a/doc/source/io.rst b/doc/source/io.rst
@@ -41,6 +41,7 @@ object.
     * :ref:`read_html<io.read_html>`
     * :ref:`read_gbq<io.bigquery>` (experimental)
     * :ref:`read_stata<io.stata_reader>`
+    * :ref:`read_sas<io.sas>`
     * :ref:`read_clipboard<io.clipboard>`
     * :ref:`read_pickle<io.pickle>`
 
@@ -4120,6 +4121,46 @@ easy conversion to and from pandas.
 
 .. _xray: http://xray.readthedocs.org/
 
+.. _io.sas:
+
+SAS Format
+----------
+
+.. versionadded:: 0.17.0
+
+The top-level function :func:`read_sas` can currently read (but
+not write) SAS xport (.XPT) format files.  Pandas cannot currently
+handle SAS7BDAT files.
+
+XPORT files only contain two value types: ASCII text and double
+precision numeric values.  There is no automatic type conversion to
+integers, dates, or categoricals.  By default the whole file is read
+and returned as a ``DataFrame``.
+
+Specify a ``chunksize`` or use ``iterator=True`` to obtain an
+``XportReader`` object for incrementally reading the file.  The
+``XportReader`` object also has attributes that contain additional
+information about the file and its variables.
+
+Read a SAS XPORT file:
+
+.. code-block:: python
+
+    df = pd.read_sas('sas_xport.xpt')
+
+Obtain an iterator and read an XPORT file 100,000 rows at a time:
+
+.. code-block:: python
+
+    rdr = pd.read_sas('sas_xport.xpt', chunksize=100000)
+    for chunk in rdr:
+        do_something(chunk)
+
+The specification_ for the xport file format is available from the SAS
+web site.
+
+.. _specification: https://support.sas.com/techsup/technote/ts140.pdf
+
 .. _io.perf:
 
 Performance Considerations

diff --git a/doc/source/whatsnew/v0.17.0.txt b/doc/source/whatsnew/v0.17.0.txt
@@ -20,6 +20,7 @@ Highlights include:
   if they are all ``NaN``, see :ref:`here <whatsnew_0170.api_breaking.hdf_dropna>`
 - Support for ``Series.dt.strftime`` to generate formatted strings for datetime-likes, see :ref:`here <whatsnew_0170.strftime>`
 - Development installed versions of pandas will now have ``PEP440`` compliant version strings (:issue:`9518`)
+- Support for reading SAS xport files, see :func:`~pandas.read_sas`.
 
 Check the :ref:`API Changes <whatsnew_0170.api>` and :ref:`deprecations <whatsnew_0170.deprecations>` before updating.
 
@@ -37,7 +38,6 @@ New features
 - Enable writing complex values to HDF stores when using table format (:issue:`10447`)
 - Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
 
-
 .. _whatsnew_0170.gil:
 
 Releasing the GIL
@@ -94,6 +94,13 @@ Other enhancements
 
 - Enable `read_hdf` to be used without specifying a key when the HDF file contains a single dataset (:issue:`10443`)
 
+- :func:`~pandas.read_sas` provides support for reading SAS XPORT format files::
+
+      df = pd.read_sas('sas_xport.xpt')
+
+  It is also possible to obtain an iterator and read an XPORT file
+  incrementally.
+
 - ``DatetimeIndex`` can be instantiated using strings contains ``NaT`` (:issue:`7599`)
 - The string parsing of ``to_datetime``, ``Timestamp`` and ``DatetimeIndex`` has been made consistent. (:issue:`7599`)
 

diff --git a/pandas/io/api.py b/pandas/io/api.py
@@ -9,6 +9,7 @@
 from pandas.io.json import read_json
 from pandas.io.html import read_html
 from pandas.io.sql import read_sql, read_sql_table, read_sql_query
+from pandas.io.sas import read_sas
 from pandas.io.stata import read_stata
 from pandas.io.pickle import read_pickle, to_pickle
 from pandas.io.packers import read_msgpack, to_msgpack
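
A note on the new io.rst text: it says there is no automatic type conversion to integers or dates, so callers have to convert the double-precision values themselves after reading. The following is a minimal standard-library sketch of what that might look like for individual values. It assumes a column holds SAS date values, which count days from 1960-01-01. The helper names `sas_date_to_date` and `to_int_if_whole` are hypothetical and are not part of pandas or of this PR.

```python
import math
from datetime import date, timedelta

# Assumption: the column stores SAS date values, i.e. days since 1960-01-01.
SAS_EPOCH = date(1960, 1, 1)


def sas_date_to_date(value):
    """Convert a SAS date value (a float count of days) to a datetime.date.

    Missing values (None or NaN) are returned as None.
    """
    if value is None or math.isnan(value):
        return None
    # floor() keeps negative fractional values on the correct calendar day
    return SAS_EPOCH + timedelta(days=math.floor(value))


def to_int_if_whole(value):
    """Return an int when the float has no fractional part, else the value unchanged."""
    if value is None or math.isnan(value) or not float(value).is_integer():
        return value
    return int(value)


# Hypothetical raw values as they might come out of a numeric XPORT column
raw_dates = [0.0, 366.0, float("nan")]
converted = [sas_date_to_date(v) for v in raw_dates]
```

With a `DataFrame` returned by `read_sas`, helpers like these could be applied per column, for example through `Series.map`. Whether a given numeric column actually represents dates is not recorded in the values themselves, so the caller has to know which columns to convert.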