Commit d8068e5

DOC: Improve userguide for index_col and usecols in read_csv (#44643)
1 parent 845d164 commit d8068e5

1 file changed, 14 insertions(+), 1 deletion(-)

doc/source/user_guide/io.rst

@@ -116,6 +116,13 @@ index_col : int, str, sequence of int / str, or False, optional, default ``None`
  of the data file, then a default index is used. If it is larger, then
  the first columns are used as index so that the remaining number of fields in
  the body are equal to the number of fields in the header.
+
+ The first row after the header is used to determine the number of columns,
+ which will go into the index. If the subsequent rows contain fewer columns
+ than the first row, they are filled with ``NaN``.
+
+ This can be avoided through ``usecols``. This ensures that the columns are
+ taken as is and the trailing data are ignored.
  usecols : list-like or callable, default ``None``
  Return a subset of the columns. If list-like, all elements must either
  be positional (i.e. integer indices into the document columns) or strings
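
The behaviour described in the added ``index_col`` paragraphs can be sketched as follows. This is a minimal illustration, not part of the commit: the sample data is hypothetical, and it assumes the default C engine and that listing every header column in ``usecols`` is what "taken as is" refers to. Exact results may differ between pandas versions.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: the header names 3 columns, the first data row has
# 4 fields (note the trailing comma), the second data row has only 3.
data = "a,b,c\n4,apple,bat,\n8,orange,cow\n"

# Without usecols, the extra field in the first data row makes pandas use
# the first column as the index; the shorter second row is padded with NaN.
implicit = pd.read_csv(StringIO(data))
print(implicit)

# Naming every header column in usecols keeps the columns as the header
# declares them; the trailing field of the first row is ignored.
explicit = pd.read_csv(StringIO(data), usecols=["a", "b", "c"])
print(explicit)
```

In the first call the values ``4`` and ``8`` end up in the index and column ``c`` holds only ``NaN``; in the second call they stay in column ``a``.
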
@@ -143,9 +150,15 @@ usecols : list-like or callable, default ``None``
  pd.read_csv(StringIO(data))
  pd.read_csv(StringIO(data), usecols=lambda x: x.upper() in ["COL1", "COL3"])
 
- Using this parameter results in much faster parsing time and lower memory usage.
+ Using this parameter results in much faster parsing time and lower memory usage
+ when using the C engine. The Python engine loads the data first and then decides
+ which columns to drop.
  squeeze : boolean, default ``False``
  If the parsed data only contains one column then return a ``Series``.
+
+ .. deprecated:: 1.4.0
+ Append ``.squeeze("columns")`` to the call to ``{func_name}`` to squeeze
+ the data.
  prefix : str, default ``None``
  Prefix to add to column numbers when no header, e.g. 'X' for X0, X1, ...
  mangle_dupe_cols : boolean, default ``True``
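
As a rough illustration of the ``usecols`` note in the second hunk, the sketch below selects the same columns with both engines. The sample ``data`` string and the helper name ``wanted`` are assumptions made for this example; the hunk shows only the ``read_csv`` calls. Both engines should return the same frame, and the performance difference the note describes is not measured here.

```python
from io import StringIO

import pandas as pd

# Assumed sample data; the diff shows the read_csv calls but not this string.
data = "col1,col2,col3\na,b,1\na,b,2\nc,d,3"


def wanted(name):
    """Hypothetical column filter, equivalent to the lambda in the hunk."""
    return name.upper() in ["COL1", "COL3"]


# Same selection with each engine; only how the columns are dropped differs.
c_result = pd.read_csv(StringIO(data), usecols=wanted, engine="c")
py_result = pd.read_csv(StringIO(data), usecols=wanted, engine="python")

print(c_result)
print(py_result)
```
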

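The replacement suggested in the ``deprecated`` note might look like the following sketch. It assumes ``{func_name}`` stands for the reader being called (``read_csv`` here) and uses a hypothetical single-column input.

```python
from io import StringIO

import pandas as pd

# Hypothetical single-column input.
data = "level\n1\n2\n3\n"

# Instead of read_csv(..., squeeze=True), squeeze the resulting frame.
frame = pd.read_csv(StringIO(data))
series = frame.squeeze("columns")

print(type(series).__name__)
```

``DataFrame.squeeze("columns")`` returns a ``Series`` only when the frame has exactly one column; a wider frame is returned unchanged.
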