Commit 6b586d0

DOC: add documentation for read_spss (pandas-dev#27476)
1 parent d7eb306 commit 6b586d0

2 files changed, +44 -0 lines changed

doc/source/reference/io.rst

+7
@@ -105,6 +105,13 @@ SAS
 
    read_sas
 
+SPSS
+~~~~
+.. autosummary::
+   :toctree: api/
+
+   read_spss
+
 SQL
 ~~~
 .. autosummary::

doc/source/user_guide/io.rst

+37
@@ -39,6 +39,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
     binary;`Msgpack <https://msgpack.org/index.html>`__;:ref:`read_msgpack<io.msgpack>`;:ref:`to_msgpack<io.msgpack>`
     binary;`Stata <https://en.wikipedia.org/wiki/Stata>`__;:ref:`read_stata<io.stata_reader>`;:ref:`to_stata<io.stata_writer>`
     binary;`SAS <https://en.wikipedia.org/wiki/SAS_(software)>`__;:ref:`read_sas<io.sas_reader>`;
+    binary;`SPSS <https://en.wikipedia.org/wiki/SPSS>`__;:ref:`read_spss<io.spss_reader>`;
     binary;`Python Pickle Format <https://docs.python.org/3/library/pickle.html>`__;:ref:`read_pickle<io.pickle>`;:ref:`to_pickle<io.pickle>`
     SQL;`SQL <https://en.wikipedia.org/wiki/SQL>`__;:ref:`read_sql<io.sql>`;:ref:`to_sql<io.sql>`
     SQL;`Google Big Query <https://en.wikipedia.org/wiki/BigQuery>`__;:ref:`read_gbq<io.bigquery>`;:ref:`to_gbq<io.bigquery>`
@@ -5477,6 +5478,42 @@ web site.
 
 No official documentation is available for the SAS7BDAT format.
 
+.. _io.spss:
+
+.. _io.spss_reader:
+
+SPSS formats
+------------
+
+The top-level function :func:`read_spss` can read (but not write) SPSS
+``sav`` (.sav) and ``zsav`` (.zsav) format files (since *v0.25.0*).
+
+SPSS files contain column names. By default the
+whole file is read, categorical columns are converted into ``pd.Categorical``,
+and a ``DataFrame`` with all columns is returned.
+
+Specify ``usecols`` to obtain a subset of columns. Specify ``convert_categoricals=False``
+to avoid converting categorical columns into ``pd.Categorical``.
+
+Read an SPSS file:
+
+.. code-block:: python
+
+    df = pd.read_spss('spss_data.zsav')
+
+Extract the subset of columns listed in ``usecols`` from an SPSS file and
+avoid converting categorical columns into ``pd.Categorical``:
+
+.. code-block:: python
+
+    df = pd.read_spss('spss_data.zsav', usecols=['foo', 'bar'],
+                      convert_categoricals=False)
+
+More info_ about the sav and zsav file formats is available from the IBM
+web site.
+
+.. _info: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_22.0.0/com.ibm.spss.statistics.help/spss/base/savedatatypes.htm
+
 .. _io.other:
 
 Other file formats
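
The added section describes what ``convert_categoricals`` changes but needs an actual SPSS file to try out. The sketch below imitates the two outcomes in plain pandas. It is only an approximation under stated assumptions: the column names ``foo`` and ``bar`` are borrowed from the example in the diff, while the codes, the measurements, and the ``value_labels`` mapping are made up for illustration. It does not show how ``read_spss`` is implemented.

```python
import pandas as pd

# Hypothetical stand-in for the raw contents of an SPSS file:
# "foo" holds numeric codes, "bar" holds plain measurements.
raw = pd.DataFrame({"foo": [1.0, 2.0, 1.0], "bar": [10.5, 11.0, 9.75]})

# Hypothetical value labels of the kind an SPSS file can attach to "foo".
value_labels = {1.0: "low", 2.0: "high"}

# Roughly what a caller sees with convert_categoricals=False:
# the coded column stays numeric.
unlabelled = raw.copy()

# Roughly what a caller sees by default:
# the labelled column comes back as a pd.Categorical.
labelled = raw.copy()
labelled["foo"] = pd.Categorical(raw["foo"].map(value_labels))

print(unlabelled.dtypes)
print(labelled.dtypes)
```

With a real file, the two frames would correspond to ``pd.read_spss(path, convert_categoricals=False)`` and ``pd.read_spss(path)``; which form is more convenient likely depends on whether downstream code expects codes or labels.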
