Commit c0181d1

Merge branch 'master' of https://github.com/PKEuS/pandas into PKEuS-master
Conflicts: RELEASE.rst
2 parents 2c6f4c6 + 4f60da9 commit c0181d1

12 files changed: +1383 -0 lines changed


RELEASE.rst

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,7 @@ pandas 0.11.1
 - pd.read_html() can now parse HTML string, files or urls and return dataframes
   courtesy of @cpcloud. (GH3477_)
 - Support for reading Amazon S3 files. (GH3504_)
+- Added module for reading and writing Stata files: pandas.io.stata (GH1512_)
 
 **Improvements to existing features**
 
@@ -169,6 +170,7 @@ pandas 0.11.1
 .. _GH3596: https://github.com/pydata/pandas/issues/3596
 .. _GH3435: https://github.com/pydata/pandas/issues/3435
 .. _GH3611: https://github.com/pydata/pandas/issues/3611
+.. _GH1512: https://github.com/pydata/pandas/issues/1512
 
 
 pandas 0.11.0

doc/source/io.rst

Lines changed: 41 additions & 0 deletions
@@ -1831,3 +1831,44 @@ There are a few other available functions:
 
 For now, writing your DataFrame into a database works only with
 **SQLite**. Moreover, the **index** will currently be **dropped**.
+
+
+Reading from STATA format
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. _io.StataReader:
+
+.. versionadded:: 0.11.1
+
+The class StataReader will read the header of the given dta file at
+initialization. Its function :func:`~pandas.io.stata.StataReader.data` will
+read the observations, converting them to a DataFrame which is returned:
+
+.. ipython:: python
+   reader = StataReader(dta_filepath)
+   dataframe = reader.data()
+
+The parameter convert_categoricals indicates whether value labels should be
+read and used to create a Categorical variable from them. Value labels can
+also be retrieved by the function value_labels, which requires data to be
+called first.
+The StataReader supports .dta formats 104, 105, 108 and 113-115.
+
+Alternatively, the function :func:`~pandas.io.stata.read_stata` can be used:
+
+.. ipython:: python
+   dataframe = read_stata(dta_filepath)
+
+
+Writing to STATA format
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. _io.StataWriter:
+
+The function :func:`~pandas.io.stata.StataWriter.write_file` will write a
+DataFrame into a .dta file. The format version of this file is always the
+latest one, 115.
+
+.. ipython:: python
+   writer = StataWriter(filename, dataframe)
+   writer.write_file()
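
Review note: the documentation above says that StataReader reads the file header at initialization and accepts only certain format versions. The sketch below illustrates that idea with the standard library only. It is not the pandas implementation. `peek_dta_header` and `SUPPORTED_FORMATS` are hypothetical names, and the ten-byte layout (format byte, byte-order byte, filetype, padding, 2-byte variable count, 4-byte observation count) is an assumption about the legacy .dta header that this diff does not show.

```python
import io
import struct

# Format versions listed in the documentation hunk above.
SUPPORTED_FORMATS = {104, 105, 108, 113, 114, 115}


def peek_dta_header(buf):
    """Return (format_version, nvar, nobs) from the start of a legacy .dta stream.

    Hypothetical helper, assuming the ten-byte layout described in the lead-in.
    """
    head = buf.read(10)
    if len(head) < 10:
        raise ValueError("stream too short to hold a .dta header")

    version, byteorder = head[0], head[1]
    if version not in SUPPORTED_FORMATS:
        raise ValueError("unsupported .dta format version: %d" % version)

    # Assumption: 0x01 marks big-endian data, 0x02 marks little-endian data.
    if byteorder not in (1, 2):
        raise ValueError("unknown byte-order marker: %d" % byteorder)
    endian = ">" if byteorder == 1 else "<"

    # Bytes 2-3 hold the filetype and a padding byte; skip them.
    nvar, nobs = struct.unpack(endian + "HI", head[4:10])
    return version, nvar, nobs


# Build a synthetic little-endian header: format 114, 3 variables, 250 rows.
sample = struct.pack("<BBBBHI", 114, 2, 1, 0, 3, 250)
print(peek_dta_header(io.BytesIO(sample)))  # (114, 3, 250)
```

Rejecting an unknown version before reading anything else keeps the failure close to its cause, which is presumably why the header is parsed when the reader is constructed and not when the data is requested.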

doc/source/v0.11.1.txt

Lines changed: 2 additions & 0 deletions
@@ -54,6 +54,7 @@ Enhancements
 - support datelike columns with a timezone as data_columns (GH2852_)
 - ``fillna`` methods now raise a ``TypeError`` if the ``value`` parameter is
   a list or tuple.
+- Added module for reading and writing Stata files: pandas.io.stata (GH1512_)
 
 See the `full release notes
 <https://github.com/pydata/pandas/blob/master/RELEASE.rst>`__ or issue tracker
@@ -68,3 +69,4 @@ on GitHub for a complete list.
 .. _GH3596: https://github.com/pydata/pandas/issues/3596
 .. _GH3590: https://github.com/pydata/pandas/issues/3590
 .. _GH3435: https://github.com/pydata/pandas/issues/3435
+.. _GH1512: https://github.com/pydata/pandas/issues/1512

pandas/core/frame.py

Lines changed: 29 additions & 0 deletions
@@ -1280,6 +1280,35 @@ def from_csv(cls, path, header=0, sep=',', index_col=0,
                           parse_dates=parse_dates, index_col=index_col,
                           encoding=encoding)
 
+    @classmethod
+    def from_dta(cls, path, parse_dates=True, convert_categoricals=True, encoding=None, index_col=None):
+        """
+        Read Stata file into DataFrame
+
+        Parameters
+        ----------
+        path : string file path or file handle / StringIO
+        parse_dates : boolean, default True
+            Convert date variables to DataFrame time values
+        convert_categoricals : boolean, default True
+            Read value labels and convert columns to Categorical/Factor variables
+        encoding : string, None or encoding, default None
+            Encoding used to parse the files. Note that Stata doesn't
+            support unicode. None defaults to cp1252.
+        index_col : int or sequence, default None
+            Column to use for index. If a sequence is given, a MultiIndex
+            is used. Different default from read_table.
+
+        Notes
+        -----
+
+        Returns
+        -------
+        y : DataFrame
+        """
+        from pandas.io.stata import read_stata
+        return read_stata(path, parse_dates=parse_dates, convert_categoricals=convert_categoricals, encoding=encoding, index=index_col)
+
     def to_sparse(self, fill_value=None, kind='block'):
         """
         Convert to SparseDataFrame
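
Review note: `from_dta` is an alternate constructor that forwards its arguments to a module-level reader, renaming `index_col` to `index` on the way. The sketch below shows that delegation pattern in isolation. `read_records`, `Frame`, `LabeledFrame` and `from_file` are hypothetical stand-ins, not pandas names. Unlike the hunk above, the sketch wraps the result in `cls` to show what the first argument of a classmethod is for. Whether `read_stata` accepts a `parse_dates` keyword is not visible in this diff, so the keyword names in the sketch are assumptions chosen only to demonstrate the renaming.

```python
def read_records(path, convert_dates=True, index=None):
    """Hypothetical module-level reader; echoes its arguments for inspection."""
    return {"path": path, "convert_dates": convert_dates, "index": index}


class Frame(dict):
    """Hypothetical container standing in for DataFrame."""

    @classmethod
    def from_file(cls, path, parse_dates=True, index_col=None):
        # A classmethod receives the class itself as its first argument,
        # conventionally named cls.
        # Each keyword is translated to the name the reader's signature uses;
        # a name the reader does not define would raise TypeError at call time.
        raw = read_records(path, convert_dates=parse_dates, index=index_col)
        # Building the result through cls lets subclasses get their own type back.
        return cls(raw)


class LabeledFrame(Frame):
    """Subclass used to show that cls follows the calling class."""


frame = LabeledFrame.from_file("survey.dta", parse_dates=False, index_col=0)
print(type(frame).__name__)  # LabeledFrame
print(frame["index"])        # 0
```

Keeping the reader as a plain function and the classmethod as a thin wrapper means the keyword mapping lives in one place and can be checked against the reader's signature in a single test.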
