Commit 57e2d99

Fix several stata doc issues
1 parent 26c3a9b commit 57e2d99

1 file changed: +17 -16 lines changed

doc/source/io.rst

@@ -3779,15 +3779,15 @@ into a .dta file. The format version of this file is always 115 (Stata 12).
    df = DataFrame(randn(10, 2), columns=list('AB'))
    df.to_stata('stata.dta')

-*Stata* data files have limited data type support; only strings with 244 or
-fewer characters, ``int8``, ``int16``, ``int32``, ``float32` and ``float64``
-can be stored
-in ``.dta`` files. Additionally, *Stata* reserves certain values to represent
-missing data. Exporting a non-missing value that is outside of the
-permitted range in Stata for a particular data type will retype the variable
-to the next larger size. For example, ``int8`` values are restricted to lie
-between -127 and 100 in Stata, and so variables with values above 100 will
-trigger a conversion to ``int16``. ``nan`` values in floating points data
+*Stata* data files have limited data type support; only strings with
+244 or fewer characters, ``int8``, ``int16``, ``int32``, ``float32``
+and ``float64`` can be stored in ``.dta`` files. Additionally,
+*Stata* reserves certain values to represent missing data. Exporting a
+non-missing value that is outside of the permitted range in Stata for
+a particular data type will retype the variable to the next larger
+size. For example, ``int8`` values are restricted to lie between -127
+and 100 in Stata, and so variables with values above 100 will trigger
+a conversion to ``int16``. ``nan`` values in floating points data
 types are stored as the basic missing data type (``.`` in *Stata*).

 .. note::
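
Note on the hunk above: the retyping rule it describes can be sketched in plain Python. The names ``smallest_stata_int`` and ``STATA_INT_RANGES`` are hypothetical and introduced only for illustration; this is not how pandas implements the conversion. The ``int8`` bounds come from the changed text, and the ``int16`` and ``int32`` bounds are assumed from Stata's documented limits for those types.

```python
# Hypothetical sketch, not pandas' implementation: pick the smallest Stata
# integer type whose non-missing range can hold every value.
STATA_INT_RANGES = [
    ("int8", -127, 100),
    ("int16", -32767, 32740),
    ("int32", -2147483647, 2147483620),
]


def smallest_stata_int(values, start="int8"):
    """Return the first type at or after `start` that fits all values."""
    names = [name for name, _, _ in STATA_INT_RANGES]
    lowest, highest = min(values), max(values)
    for name, low, high in STATA_INT_RANGES[names.index(start):]:
        if low <= lowest and highest <= high:
            return name
    return None  # the values do not fit any Stata integer type


print(smallest_stata_int([1, 50, 100]))  # stays within the int8 range
print(smallest_stata_int([1, 50, 101]))  # 101 exceeds 100, so the next size is needed
```

The upper bounds sit below the usual two's-complement maxima because Stata reserves the top values of each type for missing-data codes.
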
@@ -3810,7 +3810,7 @@ outside of this range, the variable is cast to ``int16``.

 .. warning::

-   :class:`~pandas.io.stata.StataWriter`` and
+   :class:`~pandas.io.stata.StataWriter` and
    :func:`~pandas.core.frame.DataFrame.to_stata` only support fixed width
    strings containing up to 244 characters, a limitation imposed by the version
    115 dta file format. Attempting to write *Stata* dta files with strings
@@ -3836,9 +3836,10 @@ Specifying a ``chunksize`` yields a
 read ``chunksize`` lines from the file at a time. The ``StataReader``
 object can be used as an iterator.

-   reader = pd.read_stata('stata.dta', chunksize=1000)
-   for df in reader:
-       do_something(df)
+.. ipython:: python
+   reader = pd.read_stata('stata.dta', chunksize=3)
+   for df in reader:
+       print(df.shape)

 For more fine-grained control, use ``iterator=True`` and specify
 ``chunksize`` with each call to
@@ -3847,8 +3848,8 @@ For more fine-grained control, use ``iterator=True`` and specify
 .. ipython:: python

    reader = pd.read_stata('stata.dta', iterator=True)
-   chunk1 = reader.read(10)
-   chunk2 = reader.read(20)
+   chunk1 = reader.read(5)
+   chunk2 = reader.read(5)

 Currently the ``index`` is retrieved as a column.
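
Note on the two reader hunks: the changed examples read back the 10-row ``stata.dta`` written earlier in the file, which is why the chunk sizes shrink to 3 and 5. A self-contained sketch of both reading styles follows, assuming a recent pandas in which the reader returned by ``read_stata`` can be used as a context manager. The temporary path and ``write_index=False`` are choices made for this sketch, not part of the commit.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Write a small 10-row file, mirroring the frame used earlier in the docs.
df = pd.DataFrame(np.random.randn(10, 2), columns=list("AB"))
path = os.path.join(tempfile.mkdtemp(), "stata.dta")
df.to_stata(path, write_index=False)

# Fixed-size chunks: iterating the reader yields DataFrames of up to 3 rows.
with pd.read_stata(path, chunksize=3) as reader:
    shapes = [chunk.shape for chunk in reader]

# Caller-controlled chunks: ask for an explicit number of rows per call.
with pd.read_stata(path, iterator=True) as reader:
    chunk1 = reader.read(5)
    chunk2 = reader.read(5)

print(shapes)
print(len(chunk1), len(chunk2))
```

With ``chunksize=3`` the final chunk holds the single leftover row, so a chunk size that does not divide the row count is handled without extra code.
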

@@ -3869,7 +3870,7 @@ formats 104, 105, 108, 113-115 (Stata 10-12) and 117 (Stata 13+).
 .. note::

    Setting ``preserve_dtypes=False`` will upcast to the standard pandas data types:
-   ``int64`` for all integer types and ``float64`` for floating poitn data. By default,
+   ``int64`` for all integer types and ``float64`` for floating point data. By default,
    the Stata data types are preserved when importing.

 .. ipython:: python
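
Note on the ``preserve_dtypes`` hunk: the behavior described in the note can be checked with a short round trip. This is a sketch assuming a recent pandas; the column names ``x`` and ``y`` and the temporary path are illustrative choices, not taken from the commit.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# One narrow integer column and one single-precision float column.
df = pd.DataFrame(
    {
        "x": np.array([1, 2, 3], dtype="int8"),
        "y": np.array([0.5, 1.5, 2.5], dtype="float32"),
    }
)
path = os.path.join(tempfile.mkdtemp(), "dtypes.dta")
df.to_stata(path, write_index=False)

# Default: the Stata storage types are kept on import.
preserved = pd.read_stata(path)

# preserve_dtypes=False: upcast to the standard pandas types.
upcast = pd.read_stata(path, preserve_dtypes=False)

print(preserved.dtypes)
print(upcast.dtypes)
```
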
