Fix several stata doc issues #9601

Merged
merged 2 commits into from
Mar 6, 2015
33 changes: 17 additions & 16 deletions doc/source/io.rst
@@ -3779,15 +3779,15 @@ into a .dta file. The format version of this file is always 115 (Stata 12).
df = DataFrame(randn(10, 2), columns=list('AB'))
df.to_stata('stata.dta')

-*Stata* data files have limited data type support; only strings with 244 or
-fewer characters, ``int8``, ``int16``, ``int32``, ``float32` and ``float64``
-can be stored
-in ``.dta`` files. Additionally, *Stata* reserves certain values to represent
-missing data. Exporting a non-missing value that is outside of the
-permitted range in Stata for a particular data type will retype the variable
-to the next larger size. For example, ``int8`` values are restricted to lie
-between -127 and 100 in Stata, and so variables with values above 100 will
-trigger a conversion to ``int16``. ``nan`` values in floating points data
+*Stata* data files have limited data type support; only strings with
+244 or fewer characters, ``int8``, ``int16``, ``int32``, ``float32``
+and ``float64`` can be stored in ``.dta`` files. Additionally,
+*Stata* reserves certain values to represent missing data. Exporting a
+non-missing value that is outside of the permitted range in Stata for
+a particular data type will retype the variable to the next larger
+size. For example, ``int8`` values are restricted to lie between -127
+and 100 in Stata, and so variables with values above 100 will trigger
+a conversion to ``int16``. ``nan`` values in floating points data
types are stored as the basic missing data type (``.`` in *Stata*).
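
The retyping rule in the paragraph above can be sketched in plain Python. This is only an illustration of the rule, not the code pandas uses: the names ``STATA_INT_RANGES`` and ``smallest_stata_int_type`` are hypothetical, the ``int8`` bounds come from the text, and the ``int16`` and ``int32`` bounds are assumed from Stata's documented limits for its ``int`` and ``long`` storage types.

```python
# Valid (non-missing) ranges for Stata's integer storage types.
# The int8 bounds are quoted in the text above; the int16 and int32
# bounds are assumptions taken from Stata's documented limits.
STATA_INT_RANGES = [
    ("int8", -127, 100),
    ("int16", -32767, 32740),
    ("int32", -2147483647, 2147483620),
]


def smallest_stata_int_type(values):
    """Return the smallest integer type whose valid range holds all values."""
    lo, hi = min(values), max(values)
    for name, low, high in STATA_INT_RANGES:
        if low <= lo and hi <= high:
            return name
    # Simplification: behaviour beyond int32 is not modelled in this sketch.
    raise ValueError("values do not fit any Stata integer type")


print(smallest_stata_int_type([1, 50, 100]))  # fits in int8
print(smallest_stata_int_type([1, 50, 101]))  # 101 > 100, so int16
```

The upper bounds sit below the usual machine limits (100 rather than 127 for ``int8``) because *Stata* reserves the top of each range for its missing-value codes.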

.. note::
@@ -3810,7 +3810,7 @@ outside of this range, the variable is cast to ``int16``.

.. warning::

-:class:`~pandas.io.stata.StataWriter`` and
+:class:`~pandas.io.stata.StataWriter` and
:func:`~pandas.core.frame.DataFrame.to_stata` only support fixed width
strings containing up to 244 characters, a limitation imposed by the version
115 dta file format. Attempting to write *Stata* dta files with strings
@@ -3836,9 +3836,10 @@ Specifying a ``chunksize`` yields a
read ``chunksize`` lines from the file at a time. The ``StataReader``
object can be used as an iterator.

-reader = pd.read_stata('stata.dta', chunksize=1000)
-for df in reader:
-    do_something(df)
+.. ipython:: python
+reader = pd.read_stata('stata.dta', chunksize=3)
Review comment: There needs to be a blank line between the ``.. ipython:: python`` directive and the actual code content.

+for df in reader:
+    print(df.shape)

For more fine-grained control, use ``iterator=True`` and specify
``chunksize`` with each call to
@@ -3847,8 +3848,8 @@ For more fine-grained control, use ``iterator=True`` and specify
.. ipython:: python

reader = pd.read_stata('stata.dta', iterator=True)
-chunk1 = reader.read(10)
-chunk2 = reader.read(20)
+chunk1 = reader.read(5)
+chunk2 = reader.read(5)

Currently the ``index`` is retrieved as a column.
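
As a rough, self-contained illustration of the two reading modes described above, the sketch below writes a small frame to a temporary file and reads it back both ways. It assumes a recent pandas release in which the reader returned by ``read_stata`` can be used as a context manager; the temporary path and the variable names are chosen only for this example.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Write a 10-row frame to a temporary .dta file (path is illustrative).
df = pd.DataFrame(np.random.randn(10, 2), columns=list("AB"))
path = os.path.join(tempfile.mkdtemp(), "stata.dta")
df.to_stata(path, write_index=False)

# chunksize: iterating over the reader yields fixed-size pieces.
with pd.read_stata(path, chunksize=3) as reader:
    chunk_shapes = [chunk.shape for chunk in reader]

# iterator=True: ask for an explicit number of rows on each call.
with pd.read_stata(path, iterator=True) as reader:
    chunk1 = reader.read(5)
    chunk2 = reader.read(5)

print(chunk_shapes)
print(len(chunk1), len(chunk2))
```

Passing ``write_index=False`` keeps the index out of the file, so only the ``A`` and ``B`` columns come back when reading.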

@@ -3869,7 +3870,7 @@ formats 104, 105, 108, 113-115 (Stata 10-12) and 117 (Stata 13+).
.. note::

Setting ``preserve_dtypes=False`` will upcast to the standard pandas data types:
-``int64`` for all integer types and ``float64`` for floating poitn data. By default,
+``int64`` for all integer types and ``float64`` for floating point data. By default,
the Stata data types are preserved when importing.

.. ipython:: python