Commit e7d3b09

initial docstring fix at parsers.py
1 parent 24ab22f commit e7d3b09

1 file changed, +50 -47 lines changed


pandas/io/parsers.py

@@ -1,6 +1,7 @@
 """
 Module contains tools for processing files into DataFrames or other objects
 """
+
 from __future__ import print_function
 
 from collections import defaultdict
@@ -71,7 +72,7 @@
 By file-like object, we refer to objects with a ``read()`` method, such as
 a file handler (e.g. via builtin ``open`` function) or ``StringIO``.
 %s
-delim_whitespace : boolean, default False
+delim_whitespace : bool, default False
 Specifies whether or not whitespace (e.g. ``' '`` or ``'\t'``) will be
 used as the sep. Equivalent to setting ``sep='\s+'``. If this option
 is set to True, nothing should be passed in for the ``delimiter``
@@ -101,7 +102,7 @@
 Column to use as the row labels of the DataFrame. If a sequence is given, a
 MultiIndex is used. If you have a malformed file with delimiters at the end
 of each line, you might consider index_col=False to force pandas to _not_
-use the first column as the index (row names)
+use the first column as the index (row names).
 usecols : list-like or callable, default None
 Return a subset of the columns. If list-like, all elements must either
 be positional (i.e. integer indices into the document columns) or strings
@@ -120,11 +121,11 @@
 example of a valid callable argument would be ``lambda x: x.upper() in
 ['AAA', 'BBB', 'DDD']``. Using this parameter results in much faster
 parsing time and lower memory usage.
-squeeze : boolean, default False
-If the parsed data only contains one column then return a Series
+squeeze : bool, default False
+If the parsed data only contains one column then return a Series.
 prefix : str, default None
 Prefix to add to column numbers when no header, e.g. 'X' for X0, X1, ...
-mangle_dupe_cols : boolean, default True
+mangle_dupe_cols : bool, default True
 Duplicate columns will be specified as 'X', 'X.1', ...'X.N', rather than
 'X'...'X'. Passing in False will cause data to be overwritten if there
 are duplicate names in the columns.
@@ -137,24 +138,24 @@
 %s
 converters : dict, default None
 Dict of functions for converting values in certain columns. Keys can either
-be integers or column labels
+be integers or column labels.
 true_values : list, default None
-Values to consider as True
+Values to consider as True.
 false_values : list, default None
-Values to consider as False
-skipinitialspace : boolean, default False
+Values to consider as False.
+skipinitialspace : bool, default False
 Skip spaces after delimiter.
-skiprows : list-like or integer or callable, default None
+skiprows : list-like or int or callable, default None
 Line numbers to skip (0-indexed) or number of lines to skip (int)
 at the start of the file.
 
 If callable, the callable function will be evaluated against the row
 indices, returning True if the row should be skipped and False otherwise.
 An example of a valid callable argument would be ``lambda x: x in [0, 2]``.
 skipfooter : int, default 0
-Number of lines at bottom of file to skip (Unsupported with engine='c')
+Number of lines at bottom of file to skip (Unsupported with engine='c').
 nrows : int, default None
-Number of rows of file to read. Useful for reading pieces of large files
+Number of rows of file to read. Useful for reading pieces of large files.
 na_values : scalar, str, list-like, or dict, default None
 Additional strings to recognize as NA/NaN. If dict passed, specific
 per-column NA values. By default the following values are interpreted as
@@ -175,16 +176,17 @@
 
 Note that if `na_filter` is passed in as False, the `keep_default_na` and
 `na_values` parameters will be ignored.
-na_filter : boolean, default True
+na_filter : bool, default True
 Detect missing value markers (empty strings and the value of na_values). In
 data without any NAs, passing na_filter=False can improve the performance
-of reading a large file
-verbose : boolean, default False
-Indicate number of NA values placed in non-numeric columns
-skip_blank_lines : boolean, default True
-If True, skip over blank lines rather than interpreting as NaN values
-parse_dates : boolean or list of ints or names or list of lists or dict, \
+of reading a large file.
+verbose : bool, default False
+Indicate number of NA values placed in non-numeric columns.
+skip_blank_lines : bool, default True
+If True, skip over blank lines rather than interpreting as NaN values.
+parse_dates : bool or list of ints or names or list of lists or dict, \
 default False
+The behavior is as follows:
 
 * boolean. If True -> try parsing the index.
 * list of ints or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3
@@ -199,12 +201,12 @@
 datetime parsing, use ``pd.to_datetime`` after ``pd.read_csv``
 
 Note: A fast-path exists for iso8601-formatted dates.
-infer_datetime_format : boolean, default False
+infer_datetime_format : bool, default False
 If True and `parse_dates` is enabled, pandas will attempt to infer the
 format of the datetime strings in the columns, and if it can be inferred,
 switch to a faster method of parsing them. In some cases this can increase
 the parsing speed by 5-10x.
-keep_date_col : boolean, default False
+keep_date_col : bool, default False
 If True and `parse_dates` specifies combining multiple columns then
 keep the original columns.
 date_parser : function, default None
date_parser : function, default None
@@ -217,9 +219,9 @@
217219
and pass that; and 3) call `date_parser` once for each row using one or
218220
more strings (corresponding to the columns defined by `parse_dates`) as
219221
arguments.
220-
dayfirst : boolean, default False
221-
DD/MM format dates, international and European format
222-
iterator : boolean, default False
222+
dayfirst : bool, default False
223+
DD/MM format dates, international and European format.
224+
iterator : bool, default False
223225
Return TextFileReader object for iteration or getting chunks with
224226
``get_chunk()``.
225227
chunksize : int, default None
@@ -237,10 +239,10 @@
 .. versionadded:: 0.18.1 support for 'zip' and 'xz' compression.
 
 thousands : str, default None
-Thousands separator
+Thousands separator.
 decimal : str, default '.'
 Character to recognize as decimal point (e.g. use ',' for European data).
-float_precision : string, default None
+float_precision : str, default None
 Specifies which converter the C engine should use for floating-point
 values. The options are `None` for the ordinary converter,
 `high` for the high-precision converter, and `round_trip` for the
@@ -253,7 +255,7 @@
 quoting : int or csv.QUOTE_* instance, default 0
 Control field quoting behavior per ``csv.QUOTE_*`` constants. Use one of
 QUOTE_MINIMAL (0), QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or QUOTE_NONE (3).
-doublequote : boolean, default ``True``
+doublequote : bool, default ``True``
 When quotechar is specified and quoting is not ``QUOTE_NONE``, indicate
 whether or not to interpret two consecutive quotechar elements INSIDE a
 field as a single ``quotechar`` element.
@@ -270,35 +272,35 @@
 encoding : str, default None
 Encoding to use for UTF when reading/writing (ex. 'utf-8'). `List of Python
 standard encodings
-<https://docs.python.org/3/library/codecs.html#standard-encodings>`_
+<https://docs.python.org/3/library/codecs.html#standard-encodings>`_ .
 dialect : str or csv.Dialect instance, default None
 If provided, this parameter will override values (default or not) for the
 following parameters: `delimiter`, `doublequote`, `escapechar`,
 `skipinitialspace`, `quotechar`, and `quoting`. If it is necessary to
 override values, a ParserWarning will be issued. See csv.Dialect
 documentation for more details.
-tupleize_cols : boolean, default False
+tupleize_cols : bool, default False
 .. deprecated:: 0.21.0
 This argument will be removed and will always convert to MultiIndex
 
 Leave a list of tuples on columns as is (default is to convert to
-a MultiIndex on the columns)
-error_bad_lines : boolean, default True
+a MultiIndex on the columns).
+error_bad_lines : bool, default True
 Lines with too many fields (e.g. a csv line with too many commas) will by
 default cause an exception to be raised, and no DataFrame will be returned.
 If False, then these "bad lines" will dropped from the DataFrame that is
 returned.
-warn_bad_lines : boolean, default True
+warn_bad_lines : bool, default True
 If error_bad_lines is False, and warn_bad_lines is True, a warning for each
 "bad line" will be output.
-low_memory : boolean, default True
+low_memory : bool, default True
 Internally process the file in chunks, resulting in lower memory use
 while parsing, but possibly mixed type inference. To ensure no mixed
 types either set False, or specify the type with the `dtype` parameter.
 Note that the entire file is read into a single DataFrame regardless,
 use the `chunksize` or `iterator` parameter to return the data in chunks.
-(Only valid with C parser)
-memory_map : boolean, default False
+(Only valid with C parser).
+memory_map : bool, default False
 If a filepath is provided for `filepath_or_buffer`, map the file object
 directly onto memory and access the data directly from there. Using this
 option can improve performance because there is no longer any I/O overhead.
@@ -320,12 +322,13 @@
 tool, ``csv.Sniffer``. In addition, separators longer than 1 character and
 different from ``'\s+'`` will be interpreted as regular expressions and
 will also force the use of the Python parsing engine. Note that regex
-delimiters are prone to ignoring quoted data. Regex example: ``'\r\t'``
+delimiters are prone to ignoring quoted data. Regex example: ``'\r\t'``.
 delimiter : str, default ``None``
-Alternative argument name for sep."""
+Alternative argument name for sep.
+"""
 
 _read_csv_doc = """
-Read CSV (comma-separated) file into DataFrame
+Read CSV (comma-separated) file into DataFrame.
 
 %s
 """ % (_parser_params % (_sep_doc.format(default="','"), _engine_doc))
@@ -1994,18 +1997,18 @@ def TextParser(*args, **kwds):
 rows will be discarded
 index_col : int or list, default None
 Column or columns to use as the (possibly hierarchical) index
-has_index_names: boolean, default False
+has_index_names: bool, default False
 True if the cols defined in index_col have an index name and are
-not in the header
+not in the header.
 na_values : scalar, str, list-like, or dict, default None
 Additional strings to recognize as NA/NaN.
 keep_default_na : bool, default True
 thousands : str, default None
 Thousands separator
 comment : str, default None
 Comment out remainder of line
-parse_dates : boolean, default False
-keep_date_col : boolean, default False
+parse_dates : bool, default False
+keep_date_col : bool, default False
 date_parser : function, default None
 skiprows : list of integers
 Row numbers to skip
@@ -2016,15 +2019,15 @@ def TextParser(*args, **kwds):
 either be integers or column labels, values are functions that take one
 input argument, the cell (not column) content, and return the
 transformed content.
-encoding : string, default None
+encoding : str, default None
 Encoding to use for UTF when reading/writing (ex. 'utf-8')
-squeeze : boolean, default False
-returns Series if only one column
-infer_datetime_format: boolean, default False
+squeeze : bool, default False
+returns Series if only one column.
+infer_datetime_format: bool, default False
 If True and `parse_dates` is True for a column, try to infer the
 datetime format based on the first datetime string. If the format
 can be inferred, there often will be a large parsing speed-up.
-float_precision : string, default None
+float_precision : str, default None
 Specifies which converter the C engine should use for floating-point
 values. The options are None for the ordinary converter,
 'high' for the high-precision converter, and 'round_trip' for the
