Commit 7e159ae

Licht-T authored and TomAugspurger committed
TST: Add the default separator test for PythonParser (pandas-dev#17822)
* TST: Add the default separator test for PythonParser
* DOC: Add comment of Python CSV default separator test
* DOC: Add the document about how PythonParser sniffing the separator
1 parent 9772c22 commit 7e159ae

3 files changed: +14 -2 lines changed

doc/source/io.rst

+2 -1
@@ -84,7 +84,8 @@ filepath_or_buffer : various
 sep : str, defaults to ``','`` for :func:`read_csv`, ``\t`` for :func:`read_table`
   Delimiter to use. If sep is ``None``, the C engine cannot automatically detect
   the separator, but the Python parsing engine can, meaning the latter will be
-  used automatically. In addition, separators longer than 1 character and
+  used and automatically detect the separator by Python's builtin sniffer tool,
+  :class:`python:csv.Sniffer`. In addition, separators longer than 1 character and
   different from ``'\s+'`` will be interpreted as regular expressions and
   will also force the use of the Python parsing engine. Note that regex
   delimiters are prone to ignoring quoted data. Regex example: ``'\\r\\t'``.
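
The documentation change above names `csv.Sniffer` as the tool behind separator detection when `sep` is `None`. As a minimal sketch of what that standard-library class does on its own (outside pandas), the snippet below infers a delimiter from a small sample. The sample text and the helper name `sniff_delimiter` are made up for illustration and are not part of this commit.

```python
import csv


def sniff_delimiter(sample: str) -> str:
    """Return the delimiter that csv.Sniffer infers from a text sample."""
    dialect = csv.Sniffer().sniff(sample)
    return dialect.delimiter


# Hypothetical sample: three rows, consistently separated by ';'.
sample = "a;b;c\n1;2;3\n4;5;6\n"

detected = sniff_delimiter(sample)

# Re-read the sample with the detected delimiter to confirm the split.
rows = list(csv.reader(sample.splitlines(), delimiter=detected))

print(detected)
print(rows)
```

The sniffer works from character frequencies, so on short or ambiguous input it can pick a surprising character, which is the behaviour the new test in this commit pins down.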

pandas/io/parsers.py

+2 -1
@@ -316,7 +316,8 @@
 _sep_doc = r"""sep : str, default {default}
     Delimiter to use. If sep is None, the C engine cannot automatically detect
     the separator, but the Python parsing engine can, meaning the latter will
-    be used automatically. In addition, separators longer than 1 character and
+    be used and automatically detect the separator by Python's builtin sniffer
+    tool, ``csv.Sniffer``. In addition, separators longer than 1 character and
     different from ``'\s+'`` will be interpreted as regular expressions and
     will also force the use of the Python parsing engine. Note that regex
     delimiters are prone to ignoring quoted data. Regex example: ``'\r\t'``
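
The docstring warns that regex delimiters are prone to ignoring quoted data. The following sketch illustrates the general idea with the standard library only, not with the pandas parser itself: a regex split knows nothing about quotes, while a quote-aware CSV reader keeps the quoted field intact. The input line and the pattern are invented for the example.

```python
import csv
import re

# Hypothetical line: the first field is quoted and contains the separator.
line = '"x; y";z'

# A regex split treats every match as a field boundary, quotes or not.
regex_fields = re.split(r";\s*", line)

# A quote-aware reader with a single-character delimiter honours the quotes.
csv_fields = next(csv.reader([line], delimiter=";"))

print(regex_fields)  # quoted field is broken apart
print(csv_fields)    # quoted field survives as one value
```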

pandas/tests/io/parser/python_parser_only.py

+10
@@ -19,6 +19,16 @@
 
 class PythonParserTests(object):
 
+    def test_default_separator(self):
+        # GH17333
+        # csv.Sniffer in Python treats 'o' as separator.
+        text = 'aob\n1o2\n3o4'
+        expected = DataFrame({'a': [1, 3], 'b': [2, 4]})
+
+        result = self.read_csv(StringIO(text), sep=None)
+
+        tm.assert_frame_equal(result, expected)
+
     def test_invalid_skipfooter(self):
         text = "a\n1\n2"
 
