@@ -979,6 +979,12 @@ class ExcelWriter(Generic[_WorkbookT]):
979
979
980
980
.. versionadded:: 1.3.0
981
981
982
+ See Also
983
+ --------
984
+ read_excel : Read an Excel file into a pandas DataFrame.
985
+ read_csv : Read a comma-separated values (csv) file into DataFrame.
986
+ read_fwf : Read a table of fixed-width formatted lines into DataFrame.
987
+
982
988
Notes
983
989
-----
984
990
For compatibility with CSV writers, ExcelWriter serializes lists
@@ -1434,6 +1440,7 @@ def inspect_excel_format(
1434
1440
return "zip"
1435
1441
1436
1442
1443
+ @doc(storage_options=_shared_docs["storage_options"])
1437
1444
class ExcelFile:
1438
1445
"""
1439
1446
Class for parsing tabular Excel sheets into DataFrame objects.
@@ -1472,19 +1479,27 @@ class ExcelFile:
1472
1479
- Otherwise if ``path_or_buffer`` is in xlsb format,
1473
1480
`pyxlsb <https://pypi.org/project/pyxlsb/>`_ will be used.
1474
1481
1475
- .. versionadded:: 1.3.0
1482
+ .. versionadded:: 1.3.0
1476
1483
1477
1484
- Otherwise if `openpyxl <https://pypi.org/project/openpyxl/>`_ is installed,
1478
1485
then ``openpyxl`` will be used.
1479
1486
- Otherwise if ``xlrd >= 2.0`` is installed, a ``ValueError`` will be raised.
1480
1487
1481
- .. warning::
1488
+ .. warning::
1482
1489
1483
- Please do not report issues when using ``xlrd`` to read ``.xlsx`` files.
1484
- This is not supported, switch to using ``openpyxl`` instead.
1490
+ Please do not report issues when using ``xlrd`` to read ``.xlsx`` files.
1491
+ This is not supported, switch to using ``openpyxl`` instead.
1492
+ {storage_options}
1485
1493
engine_kwargs : dict, optional
1486
1494
Arbitrary keyword arguments passed to excel engine.
1487
1495
1496
+ See Also
1497
+ --------
1498
+ DataFrame.to_excel : Write DataFrame to an Excel file.
1499
+ DataFrame.to_csv : Write DataFrame to a comma-separated values (csv) file.
1500
+ read_csv : Read a comma-separated values (csv) file into DataFrame.
1501
+ read_fwf : Read a table of fixed-width formatted lines into DataFrame.
1502
+
1488
1503
Examples
1489
1504
--------
1490
1505
>>> file = pd.ExcelFile("myfile.xlsx") # doctest: +SKIP
@@ -1595,11 +1610,134 @@ def parse(
1595
1610
Equivalent to read_excel(ExcelFile, ...) See the read_excel
1596
1611
docstring for more info on accepted parameters.
1597
1612
1613
+ Parameters
1614
+ ----------
1615
+ sheet_name : str, int, list, or None, default 0
1616
+ Strings are used for sheet names. Integers are used for zero-indexed
1617
+ sheet positions (chart sheets do not count as a sheet position).
1618
+ Lists of strings/integers are used to request multiple sheets.
1619
+ Specify ``None`` to get all worksheets.
1620
+ header : int, list of int, default 0
1621
+ Row (0-indexed) to use for the column labels of the parsed
1622
+ DataFrame. If a list of integers is passed those row positions will
1623
+ be combined into a ``MultiIndex``. Use None if there is no header.
1624
+ names : array-like, default None
1625
+ List of column names to use. If file contains no header row,
1626
+ then you should explicitly pass header=None.
1627
+ index_col : int, str, list of int, default None
1628
+ Column (0-indexed) to use as the row labels of the DataFrame.
1629
+ Pass None if there is no such column. If a list is passed,
1630
+ those columns will be combined into a ``MultiIndex``. If a
1631
+ subset of data is selected with ``usecols``, index_col
1632
+ is based on the subset.
1633
+
1634
+ Missing values will be forward filled to allow roundtripping with
1635
+ ``to_excel`` for ``merged_cells=True``. To avoid forward filling the
1636
+ missing values use ``set_index`` after reading the data instead of
1637
+ ``index_col``.
1638
+ usecols : str, list-like, or callable, default None
1639
+ * If None, then parse all columns.
1640
+ * If str, then indicates comma separated list of Excel column letters
1641
+ and column ranges (e.g. "A:E" or "A,C,E:F"). Ranges are inclusive of
1642
+ both sides.
1643
+ * If list of int, then indicates list of column numbers to be parsed
1644
+ (0-indexed).
1645
+ * If list of string, then indicates list of column names to be parsed.
1646
+ * If callable, then evaluate each column name against it and parse the
1647
+ column if the callable returns ``True``.
1648
+
1649
+ Returns a subset of the columns according to behavior above.
1650
+ converters : dict, default None
1651
+ Dict of functions for converting values in certain columns. Keys can
1652
+ either be integers or column labels, values are functions that take one
1653
+ input argument, the Excel cell content, and return the transformed
1654
+ content.
1655
+ true_values : list, default None
1656
+ Values to consider as True.
1657
+ false_values : list, default None
1658
+ Values to consider as False.
1659
+ skiprows : list-like, int, or callable, optional
1660
+ Line numbers to skip (0-indexed) or number of lines to skip (int) at the
1661
+ start of the file. If callable, the callable function will be evaluated
1662
+ against the row indices, returning True if the row should be skipped and
1663
+ False otherwise. An example of a valid callable argument would be ``lambda
1664
+ x: x in [0, 2]``.
1665
+ nrows : int, default None
1666
+ Number of rows to parse.
1667
+ na_values : scalar, str, list-like, or dict, default None
1668
+ Additional strings to recognize as NA/NaN. If dict passed, specific
1669
+ per-column NA values.
1670
+ parse_dates : bool, list-like, or dict, default False
1671
+ The behavior is as follows:
1672
+
1673
+ * ``bool``. If True -> try parsing the index.
1674
+ * ``list`` of int or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3
1675
+ each as a separate date column.
1676
+ * ``list`` of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and
1677
+ parse as a single date column.
1678
+ * ``dict``, e.g. {{'foo' : [1, 3]}} -> parse columns 1, 3 as date and call
1679
+ result 'foo'
1680
+
1681
+ If a column or index contains an unparsable date, the entire column or
1682
+ index will be returned unaltered as an object data type. If you
1683
+ don't want to parse some cells as dates, just change their type
1684
+ in Excel to "Text". For non-standard datetime parsing, use
1685
+ ``pd.to_datetime`` after ``pd.read_excel``.
1686
+
1687
+ Note: A fast-path exists for iso8601-formatted dates.
1688
+ date_parser : function, optional
1689
+ Function to use for converting a sequence of string columns to an array of
1690
+ datetime instances. The default uses ``dateutil.parser.parser`` to do the
1691
+ conversion. Pandas will try to call `date_parser` in three different ways,
1692
+ advancing to the next if an exception occurs: 1) Pass one or more arrays
1693
+ (as defined by `parse_dates`) as arguments; 2) concatenate (row-wise) the
1694
+ string values from the columns defined by `parse_dates` into a single array
1695
+ and pass that; and 3) call `date_parser` once for each row using one or
1696
+ more strings (corresponding to the columns defined by `parse_dates`) as
1697
+ arguments.
1698
+
1699
+ .. deprecated:: 2.0.0
1700
+ Use ``date_format`` instead, or read in as ``object`` and then apply
1701
+ :func:`to_datetime` as-needed.
1702
+ date_format : str or dict of column -> format, default ``None``
1703
+ If used in conjunction with ``parse_dates``, will parse dates
1704
+ according to this format. For anything more complex,
1705
+ please read in as ``object`` and then apply :func:`to_datetime` as-needed.
1706
+ thousands : str, default None
1707
+ Thousands separator for parsing string columns to numeric. Note that
1708
+ this parameter is only necessary for columns stored as TEXT in Excel,
1709
+ any numeric columns will automatically be parsed, regardless of display
1710
+ format.
1711
+ comment : str, default None
1712
+ Comments out remainder of line. Pass a character or characters to this
1713
+ argument to indicate comments in the input file. Any data between the
1714
+ comment string and the end of the current line is ignored.
1715
+ skipfooter : int, default 0
1716
+ Number of rows at the end to skip.
1717
+ dtype_backend : {{'numpy_nullable', 'pyarrow'}}, default 'numpy_nullable'
1718
+ Back-end data type applied to the resultant :class:`DataFrame`
1719
+ (still experimental). Behaviour is as follows:
1720
+
1721
+ * ``"numpy_nullable"``: returns nullable-dtype-backed :class:`DataFrame`
1722
+ (default).
1723
+ * ``"pyarrow"``: returns pyarrow-backed nullable :class:`ArrowDtype`
1724
+ DataFrame.
1725
+
1726
+ .. versionadded:: 2.0
1727
+ **kwds : dict, optional
1728
+ Arbitrary keyword arguments passed to excel engine.
1729
+
1598
1730
Returns
1599
1731
-------
1600
1732
DataFrame or dict of DataFrames
1601
1733
DataFrame from the passed in Excel file.
1602
1734
1735
+ See Also
1736
+ --------
1737
+ read_excel : Read an Excel file into a pandas DataFrame.
1738
+ read_csv : Read a comma-separated values (csv) file into DataFrame.
1739
+ read_fwf : Read a table of fixed-width formatted lines into DataFrame.
1740
+
1603
1741
Examples
1604
1742
--------
1605
1743
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["A", "B", "C"])
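A note on the `usecols` and `skiprows` wording added to `parse`: the string form of `usecols` ("A:E" or "A,C,E:F", ranges inclusive on both sides) and the callable form of `skiprows` (``lambda x: x in [0, 2]``) may be easier to follow with a small worked illustration. The sketch below is plain Python and does not call pandas; the helper names `excel_col_to_index`, `expand_usecols`, and `rows_kept` are hypothetical and are not pandas internals. It only mirrors the semantics described in the docstring, under the assumption that column letters map to 0-based positions in the usual Excel order.

```python
from typing import Callable, List


def excel_col_to_index(letters: str) -> int:
    """Convert an Excel column label such as "A" or "AB" to a 0-based index."""
    label = letters.strip().upper()
    if not label:
        raise ValueError("empty column label")
    index = 0
    for ch in label:
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        # Column labels behave like a base-26 number with digits A=1 .. Z=26.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def expand_usecols(spec: str) -> List[int]:
    """Expand a spec like "A,C,E:F" into 0-based positions (ranges inclusive)."""
    cols: List[int] = []
    for part in spec.split(","):
        if ":" in part:
            start, end = part.split(":")
            first = excel_col_to_index(start)
            last = excel_col_to_index(end)
            cols.extend(range(first, last + 1))  # +1 keeps the end column
        else:
            cols.append(excel_col_to_index(part))
    return cols


def rows_kept(n_rows: int, skiprows: Callable[[int], bool]) -> List[int]:
    """Return the 0-based row indices for which the skiprows callable is False."""
    return [i for i in range(n_rows) if not skiprows(i)]


print(expand_usecols("A,C,E:F"))            # [0, 2, 4, 5]
print(rows_kept(5, lambda x: x in [0, 2]))  # [1, 3, 4]
```

In this sketch "E:F" contributes both E and F, matching the "inclusive of both sides" wording, and the `skiprows` callable returns True for the rows to drop, so rows 0 and 2 are excluded.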