9
9
import csv
10
10
from StringIO import StringIO
11
11
import pandas as pd
12
+ ExcelWriter = pd.ExcelWriter
12
13
13
14
import numpy as np
14
15
np.random.seed(123456)
27
28
IO Tools (Text, CSV, HDF5, ...)
28
29
*******************************
29
30
31
+ The pandas I/O API is a set of top-level ``reader`` functions, accessed like ``pd.read_csv()``, that generally return a ``pandas``
32
+ object. The corresponding ``writer`` functions are object methods, accessed like ``df.to_csv()``.
33
+
34
+ .. csv-table::
35
+     :widths: 12, 15, 15, 15, 15
36
+     :delim: ;
37
+
38
+     Reader; ``read_csv``; ``read_excel``; ``read_hdf``; ``read_sql``
39
+     Writer; ``to_csv``; ``to_excel``; ``to_hdf``; ``to_sql``
40
+     Reader; ``read_html``; ``read_stata``; ``read_clipboard``;
41
+     Writer; ``to_html``; ``to_stata``; ``to_clipboard``;
42
+
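The reader/writer pairing in the table above can be illustrated with a CSV round trip. This is a minimal sketch, assuming Python 3 (so ``io.StringIO`` stands in for the ``StringIO`` module imported at the top of this file); the helper name ``csv_round_trip`` and the sample frame are hypothetical and not part of pandas.

```python
import io

import pandas as pd


def csv_round_trip(df):
    """Write ``df`` to CSV text, then parse that text back into a DataFrame."""
    buffer = io.StringIO()
    df.to_csv(buffer)  # writer: a method on the DataFrame
    buffer.seek(0)     # rewind so the reader starts at the beginning
    # reader: a top-level function; column 0 holds the index that was written
    return pd.read_csv(buffer, index_col=0)


frame = pd.DataFrame({"A": [1, 2, 3], "B": [0.5, 1.5, 2.5]})
restored = csv_round_trip(frame)
```

The other pairs in the table follow the same shape: a ``to_*`` method on the object and a matching ``pd.read_*`` function, though some of them need optional dependencies.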
30
43
.. _io.read_csv_table:
31
44
32
45
CSV & Text files
@@ -971,44 +984,35 @@ And then import the data directly to a DataFrame by calling:
971
984
Excel files
972
985
-----------
973
986
974
- The ``ExcelFile `` class can read an Excel 2003 file using the ``xlrd `` Python
987
+ The ``read_excel`` function can read an Excel 2003 file using the ``xlrd`` Python
975
988
module and use the same parsing code as the above to convert tabular data into
976
989
a DataFrame. See the :ref:`cookbook<cookbook.excel>` for some
977
990
advanced strategies
978
991
979
- To use it, create the ``ExcelFile `` object:
980
-
981
- .. code-block :: python
982
-
983
- xls = ExcelFile(' path_to_file.xls' )
984
-
985
- Then use the ``parse `` instance method with a sheetname, then use the same
986
- additional arguments as the parsers above:
987
-
988
992
.. code-block:: python
989
993
990
- xls.parse( ' Sheet1' , index_col = None , na_values = [' NA' ])
994
+    read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
991
995
992
996
To read sheets from an Excel 2007 file, you can pass a filename with a ``.xlsx``
993
997
extension, in which case the ``openpyxl`` module will be used to read the file.
994
998
995
999
It is often the case that users will insert columns to do temporary computations
996
- in Excel and you may not want to read in those columns. `ExcelFile.parse ` takes
1000
+ in Excel and you may not want to read in those columns. `read_excel` takes
997
1001
a `parse_cols` keyword to allow you to specify a subset of columns to parse.
998
1002
999
1003
If `parse_cols` is an integer, then it is assumed to indicate the last column
1000
1004
to be parsed.
1001
1005
1002
1006
.. code-block:: python
1003
1007
1004
- xls.parse( ' Sheet1' , parse_cols = 2 , index_col = None , na_values = [' NA' ])
1008
+    read_excel('path_to_file.xls', 'Sheet1', parse_cols=2, index_col=None, na_values=['NA'])
1005
1009
1006
1010
If `parse_cols` is a list of integers, then it is assumed to be the file column
1007
1011
indices to be parsed.
1008
1012
1009
1013
.. code-block:: python
1010
1014
1011
- xls.parse( ' Sheet1' , parse_cols = [0 , 2 , 3 ], index_col = None , na_values = [' NA' ])
1015
+    read_excel('path_to_file.xls', 'Sheet1', parse_cols=[0, 2, 3], index_col=None, na_values=['NA'])
1012
1016
1013
1017
To write a DataFrame object to a sheet of an Excel file, you can use the
1014
1018
``to_excel`` instance method. The arguments are largely the same as ``to_csv``
@@ -1883,16 +1887,13 @@ Writing to STATA format
1883
1887
1884
1888
.. _io.StataWriter:
1885
1889
1886
- The function :func: '~pandas.io.StataWriter.write_file' will write a DataFrame
1887
- into a .dta file. The format version of this file is always the latest one,
1888
- 115.
1890
+ The method ``to_stata`` will write a DataFrame into a .dta file.
1891
+ The format version of this file is always the latest one, 115.
1889
1892
1890
1893
.. ipython:: python
1891
1894
1892
- from pandas.io.stata import StataWriter
1893
1895
   df = DataFrame(randn(10, 2), columns=list('AB'))
1894
- writer = StataWriter(' stata.dta' ,df)
1895
- writer.write_file()
1896
+    df.to_stata('stata.dta')
1896
1897
1897
1898
Reading from STATA format
1898
1899
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1901,24 +1902,21 @@ Reading from STATA format
1901
1902
1902
1903
.. versionadded:: 0.11.1
1903
1904
1904
- The class StataReader will read the header of the given dta file at
1905
- initialization. Its function :func: '~pandas.io.StataReader.data' will
1906
- read the observations, converting them to a DataFrame which is returned:
1905
+ The top-level function ``read_stata`` will read a dta format file
1906
+ and return a DataFrame:
1907
1907
1908
1908
.. ipython:: python
1909
1909
1910
- from pandas.io.stata import StataReader
1911
- reader = StataReader(' stata.dta' )
1912
- reader.data()
1910
+    pd.read_stata('stata.dta')
1913
1911
1914
- The parameter convert_categoricals indicates wheter value labels should be
1915
- read and used to create a Categorical variable from them. Value labels can
1916
- also be retrieved by the function variable_labels, which requires data to be
1917
- called before.
1912
+ Currently the ``index`` is retrieved as a column on read back.
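The round trip described above, including the index coming back as an ordinary column, can be sketched as follows. This is an illustration under assumptions rather than the documented behaviour of any particular release: it assumes a recent pandas in which ``to_stata`` writes the index by default and names an unnamed index ``index``. The helper ``stata_round_trip`` and the temporary file location are hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def stata_round_trip(df):
    """Write ``df`` to a temporary .dta file and read it straight back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stata.dta")
        df.to_stata(path)            # writer: DataFrame method
        return pd.read_stata(path)   # reader: top-level function


rng = np.random.RandomState(123456)
df = pd.DataFrame(rng.randn(10, 2), columns=list("AB"))

back = stata_round_trip(df)
# Assumption: the original index arrives as a regular column named "index",
# so it has to be set as the index again explicitly.
restored = back.set_index("index")
```

If the index is not needed in the file at all, recent pandas versions also accept ``write_index=False`` in ``to_stata``.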
1918
1913
1919
- The StataReader supports .dta Formats 104, 105, 108, 113-115.
1914
+ The parameter ``convert_categoricals`` indicates whether value labels should be
1915
+ read and used to create a ``Categorical`` variable from them. Value labels can
1916
+ also be retrieved by the function ``variable_labels``, which requires ``data`` to
1917
+ have been called first (see ``pandas.io.stata.StataReader``).
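The effect of ``convert_categoricals`` can be seen by reading the same file twice. This is a hedged sketch, assuming a recent pandas in which ``to_stata`` stores a categorical column as integer codes plus value labels; the column names, the helper ``read_both_ways``, and the temporary path are all made up for illustration.

```python
import os
import tempfile

import pandas as pd


def read_both_ways(df):
    """Return (labels applied, raw codes) readings of the same .dta file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "labels.dta")
        df.to_stata(path, write_index=False)
        # Default behaviour: value labels are turned into a Categorical.
        labeled = pd.read_stata(path, convert_categoricals=True)
        # Labels ignored: the column keeps the stored integer codes.
        raw = pd.read_stata(path, convert_categoricals=False)
    return labeled, raw


source = pd.DataFrame(
    {"grade": pd.Categorical(["a", "b", "a"]), "score": [1.0, 2.0, 3.0]}
)
labeled, raw = read_both_ways(source)
```

Here ``labeled["grade"]`` should come back with a categorical dtype, while ``raw["grade"]`` should hold plain integers.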
1920
1918
1921
- Alternatively, the function :func: '~pandas.io.read_stata' can be used
1919
+ The StataReader supports .dta formats 104, 105, 108, 113-115.
1922
1920
1923
1921
.. ipython:: python
1924
1922
:suppress: