|
79 | 79 | )
|
80 | 80 | _read_excel_doc = (
|
81 | 81 | """
|
82 |
| -Read an Excel file into a ``pandas`` ``DataFrame``. |
| 82 | +Read an Excel file into a ``pandas`` DataFrame. |
83 | 83 |
|
84 | 84 | Supports `xls`, `xlsx`, `xlsm`, `xlsb`, `odf`, `ods` and `odt` file extensions
|
85 | 85 | read from a local filesystem or URL. Supports an option to read
|
|
105 | 105 |
|
106 | 106 | Available cases:
|
107 | 107 |
|
108 |
| - * Defaults to ``0``: 1st sheet as a ``DataFrame`` |
109 |
| - * ``1``: 2nd sheet as a ``DataFrame`` |
| 108 | + * Defaults to ``0``: 1st sheet as a DataFrame |
| 109 | + * ``1``: 2nd sheet as a DataFrame |
110 | 110 | * ``"Sheet1"``: Load sheet with name "Sheet1"
|
111 | 111 | * ``[0, 1, "Sheet5"]``: Load first, second and sheet named "Sheet5"
|
112 |
| - as a dict of ``DataFrame`` |
| 112 | + as a dict of DataFrame |
113 | 113 | * ``None``: All worksheets.
|
114 | 114 |
|
115 | 115 | header : int, list of int, default 0
|
116 | 116 | Row (0-indexed) to use for the column labels of the parsed
|
117 |
| - ``DataFrame``. If a list of integers is passed those row positions will |
| 117 | + DataFrame. If a list of integers is passed those row positions will |
118 | 118 | be combined into a ``MultiIndex``. Use ``None`` if there is no header.
|
119 | 119 | names : array-like, default None
|
120 | 120 | List of column names to use. If file contains no header row,
|
121 | 121 | then you should explicitly pass ``header=None``.
|
122 | 122 | index_col : int, str, list of int, default None
|
123 |
| - Column (0-indexed) to use as the row labels of the ``DataFrame``. |
| 123 | + Column (0-indexed) to use as the row labels of the DataFrame. |
124 | 124 | Pass None if there is no such column. If a list is passed,
|
125 | 125 | those columns will be combined into a ``MultiIndex``. If a
|
126 | 126 | subset of data is selected with ``usecols``, ``index_col``
|
|
143 | 143 |
|
144 | 144 | Returns a subset of the columns according to behavior above.
|
145 | 145 | dtype : Type name or dict of column -> type, default None
|
146 |
| - Data type for data or columns. E.g. ``{'a': np.float64, 'b': np.int32}`` |
| 146 | + Data type for data or columns. E.g. ``{{'a': np.float64, 'b': np.int32}}`` |
147 | 147 | Use `object` to preserve data as stored in Excel and not interpret dtype.
|
148 | 148 | If converters are specified, they will be applied INSTEAD
|
149 | 149 | of ``dtype`` conversion.
|
|
152 | 152 | Supported engines: ``"xlrd"``, ``"openpyxl"``, ``"odf"``, ``"pyxlsb"``.
|
153 | 153 | Engine compatibility :
|
154 | 154 |
|
155 |
| - - ``"xlrd"`` supports old-style Excel files (.xls). |
156 |
| - - ``"openpyxl"`` supports newer Excel file formats. |
157 |
| - - ``"odf"`` supports OpenDocument file formats (.odf, .ods, .odt). |
158 |
| - - ``"pyxlsb"`` supports Binary Excel files. |
| 155 | + - ``xlrd`` supports old-style Excel files (.xls). |
| 156 | + - ``openpyxl`` supports newer Excel file formats. |
| 157 | + - ``odf`` supports OpenDocument file formats (.odf, .ods, .odt). |
| 158 | + - ``pyxlsb`` supports Binary Excel files. |
159 | 159 |
|
160 | 160 | .. versionchanged:: 1.2.0
|
161 | 161 | The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
|
|
215 | 215 | ``na_values`` parameters will be ignored.
|
216 | 216 | na_filter : bool, default True
|
217 | 217 | Detect missing value markers (empty strings and the value of ``na_values``). In
|
218 |
| - data without any NAs, ``passing na_filter=False`` can improve the performance |
| 218 | + data without any NAs, passing ``na_filter=False`` can improve the performance |
219 | 219 | of reading a large file.
|
220 | 220 | verbose : bool, default False
|
221 | 221 | Indicate number of NA values placed in non-numeric columns.
|
|
233 | 233 | If a column or index contains an unparsable date, the entire column or
|
234 | 234 | index will be returned unaltered as an object data type. If you don`t want to
|
235 | 235 | parse some cells as date, just change their type in Excel to "Text".
|
236 |
| - For non-standard ``datetime`` parsing, use ``pd.to_datetime`` after ``pd.read_excel``. |
| 236 | + For non-standard datetime parsing, use ``pd.to_datetime`` after ``pd.read_excel``. |
237 | 237 |
|
238 | 238 | Note: A fast-path exists for iso8601-formatted dates.
|
239 | 239 | date_parser : function, optional
|
|
279 | 279 |
|
280 | 280 | .. versionadded:: 1.2.0
|
281 | 281 |
|
282 |
| -dtype_backend : {{"numpy_nullable", "pyarrow"}}, defaults to ``numpy`` backed ``DataFrames`` |
283 |
| - Which ``dtype_backend`` to use, e.g. whether a ``DataFrame`` should have ``numpy`` |
| 282 | +dtype_backend : {{"numpy_nullable", "pyarrow"}}, defaults to NumPy backed ``DataFrames`` |
| 283 | + Which ``dtype_backend`` to use, e.g. whether a DataFrame should have NumPy |
284 | 284 | arrays, nullable ``dtypes`` are used for all ``dtypes`` that have a nullable
|
285 | 285 | implementation when ``"numpy_nullable"`` is set, ``pyarrow`` is used for all
|
286 | 286 | dtypes if ``"pyarrow"`` is set.
|
|
295 | 295 | Returns
|
296 | 296 | -------
|
297 | 297 | DataFrame or dict of DataFrames
|
298 |
| - ``DataFrame`` from the passed in Excel file. See notes in ``sheet_name`` |
| 298 | + DataFrame from the passed in Excel file. See notes in ``sheet_name`` |
299 | 299 | argument for more information on when a ``dict`` of ``DataFrames`` is returned.
|
300 | 300 |
|
301 | 301 | See Also
|
302 | 302 | --------
|
303 |
| -DataFrame.to_excel : Write ``DataFrame`` to an Excel file. |
304 |
| -DataFrame.to_csv : Write ``DataFrame`` to a comma-separated values (csv) file. |
305 |
| -read_csv : Read a comma-separated values (csv) file into ``DataFrame``. |
306 |
| -read_fwf : Read a table of fixed-width formatted lines into ``DataFrame``. |
| 303 | +DataFrame.to_excel : Write DataFrame to an Excel file. |
| 304 | +DataFrame.to_csv : Write DataFrame to a comma-separated values (csv) file. |
| 305 | +read_csv : Read a comma-separated values (csv) file into DataFrame. |
| 306 | +read_fwf : Read a table of fixed-width formatted lines into DataFrame. |
307 | 307 |
|
308 | 308 | Notes
|
309 | 309 | -----
|
|
345 | 345 | 1 string2 2.0
|
346 | 346 | 2 #Comment 3.0
|
347 | 347 |
|
348 |
| -``True``, ``False``, ``NaN`` values, and thousands of separators have defaults, |
| 348 | +True, False, NA values, and thousands of separators have defaults, |
349 | 349 | but can be explicitly specified, too. Supply the values you would like
|
350 | 350 | as strings or lists of strings!
|
351 | 351 |
|
|
356 | 356 | 1 NaN 2
|
357 | 357 | 2 #Comment 3
|
358 | 358 |
|
359 |
| -Comment lines in the excel input file can be skipped using the ``comment`` ``kwarg`` |
| 359 | +Comment lines in the excel input file can be skipped using the ``comment`` keyword argument |
360 | 360 |
|
361 | 361 | >>> pd.read_excel('tmp.xlsx', index_col=0, comment='#') # doctest: +SKIP
|
362 | 362 | Name Value
|
|
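Note on the ``dtype`` hunk (line 146): the doubled braces suggest the docstring template is later run through ``str.format``-style substitution, which is also why ``dtype_backend`` is written with ``{{...}}``. The sketch below is a minimal illustration under that assumption. The shortened template and the ``storage_options`` placeholder are hypothetical stand-ins, not taken from this diff.

```python
# Hypothetical, shortened stand-in for a docstring template that is
# filled in with str.format(). The placeholder name is an assumption.
template = (
    "dtype : Type name or dict of column -> type, default None\n"
    "    E.g. ``{{'a': np.float64, 'b': np.int32}}``\n"
    "storage_options : {storage_options}\n"
)

# Doubled braces come out as literal single braces after formatting.
rendered = template.format(storage_options="dict, optional")

# With single braces, str.format() treats 'a' as a replacement field
# name and fails because no such keyword argument is supplied.
unescaped = "E.g. ``{'a': np.float64, 'b': np.int32}``"
try:
    unescaped.format(storage_options="dict, optional")
    failed = False
except KeyError:
    failed = True

print(rendered)
print("unescaped template failed:", failed)
```

If the template is not formatted this way, the doubled braces would instead show up literally in the rendered documentation, so the escaping only makes sense together with the formatting step.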