Commit 63fb381

roberthdevries authored and SeeminSyed committed
BUG: Fix read_csv IndexError crash for c engine with header=None and 2 (or more) extra columns (pandas-dev#32839)
1 parent b7107a9 commit 63fb381

3 files changed: +13 -2 lines changed

doc/source/whatsnew/v1.1.0.rst

+1
@@ -349,6 +349,7 @@ I/O
 - Bug in :meth:`read_csv` was causing a file descriptor leak on an empty file (:issue:`31488`)
 - Bug in :meth:`read_csv` was causing a segfault when there were blank lines between the header and data rows (:issue:`28071`)
 - Bug in :meth:`read_csv` was raising a misleading exception on a permissions issue (:issue:`23784`)
+- Bug in :meth:`read_csv` was raising an ``IndexError`` when header=None and 2 extra data columns
 
 
 Plotting

pandas/_libs/parsers.pyx

+2 -2
@@ -1316,8 +1316,8 @@ cdef class TextReader:
         else:
             if self.header is not None:
                 j = i - self.leading_cols
-                # hack for #2442
-                if j == len(self.header[0]):
+                # generate extra (bogus) headers if there are more columns than headers
+                if j >= len(self.header[0]):
                     return j
                 else:
                     return self.header[0][j]
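
The effect of changing `==` to `>=` can be illustrated with a minimal pure-Python sketch of the lookup. This is not the actual Cython code: `column_name_old`, `column_name_new`, and their `header_row` and `leading_cols` parameters are hypothetical stand-ins for the method body and for `self.header[0]` and `self.leading_cols`.

```python
def column_name_old(header_row, i, leading_cols=0):
    """Pre-fix logic: only the first extra column gets a placeholder name."""
    j = i - leading_cols
    if j == len(header_row):
        return j
    # For j > len(header_row) this indexing raises IndexError.
    return header_row[j]


def column_name_new(header_row, i, leading_cols=0):
    """Post-fix logic: every column past the header gets a placeholder name."""
    j = i - leading_cols
    if j >= len(header_row):
        return j
    return header_row[j]


# Three supplied names, five data columns, as in the new test case.
header_row = ["one", "two", "three"]
names = [column_name_new(header_row, i) for i in range(5)]
```

With three names and five data columns, the old comparison handles index 3 but fails at index 4, which matches the "2 (or more) extra columns" condition in the commit title.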

pandas/tests/io/parser/test_common.py

+10
@@ -2116,3 +2116,13 @@ def test_blank_lines_between_header_and_data_rows(all_parsers, nrows):
     parser = all_parsers
     df = parser.read_csv(StringIO(csv), header=3, nrows=nrows, skip_blank_lines=False)
     tm.assert_frame_equal(df, ref[:nrows])
+
+
+def test_no_header_two_extra_columns(all_parsers):
+    # GH 26218
+    column_names = ["one", "two", "three"]
+    ref = DataFrame([["foo", "bar", "baz"]], columns=column_names)
+    stream = StringIO("foo,bar,baz,bam,blah")
+    parser = all_parsers
+    df = parser.read_csv(stream, header=None, names=column_names, index_col=False)
+    tm.assert_frame_equal(df, ref)
