Core dumped in read_csv (C engine) when reading multiple corrupted gzip files #12098
Comments
Please show an example with data as minimal as possible. Why does looping matter here?
Sorry, I forgot to attach the file. Here it is. The loop is to reproduce the problem without having to attach multiple files.

INSTALLED VERSIONS
commit: None
pandas: 0.17.1
@alessiodore can you see if you can narrow it down a bit more, please? E.g. keep chopping until you don't get the error, then back up.
I am not sure this is what you mean, but I changed my script to:
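(The script itself did not survive in this copy of the thread; the sketch below is an assumed reconstruction of the narrowing step. The 9657/9656 cutoff comes from the next sentence, while the file names and the loop length are made up for illustration.)

```python
import pandas as pd

# Read the raw bytes of the corrupted gzip log, write a truncated copy,
# and replay read_csv on it to find the smallest slice that still crashes.
with open('corrupted.log.gz', 'rb') as f:      # file name is an assumption
    log = f.read()

n = 9657                                       # [0:9657] crashes, [0:9656] does not
with open('sliced.log.gz', 'wb') as f:
    f.write(log[:n])

for i in range(1000):
    try:
        pd.read_csv('sliced.log.gz', compression='gzip',
                    delim_whitespace=True, header=None)
    except Exception as exc:
        print(i, exc)
```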
When I slice the file to 9656 I don't get the segmentation fault; at 9657 it segfaults.
Great. So extract the slice that is causing the error. We need a simple, copy-pastable example in order to pinpoint the problem; I don't want a file, rather a string of characters that reproduces it. Also try with and without the gzip to see if that is the problem. The more you can narrow it down the better.
I tried slicing the left part of the file (log[n:9657]) but I got the segmentation fault only for n=0. I also tried log[1:len(log)] and didn't get the segmentation fault.
Yes, ideally what you can do is something like:
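(The snippet that originally followed is missing here; the block below is a guess at the kind of self-contained repro being asked for, with io.BytesIO standing in for the attached file so the corrupted bytes can be pasted straight into the report. The byte string shown is a placeholder, not the real content.)

```python
import io
import pandas as pd

# Placeholder for the ~9657 corrupted gzip bytes, pasted inline as a literal
# so no attachment is needed; the real repro would embed the actual bytes.
data = b'\x1f\x8b\x08\x00...'

for i in range(1000):
    try:
        pd.read_csv(io.BytesIO(data), compression='gzip',
                    delim_whitespace=True, header=None)
    except Exception as exc:
        print(i, exc)
```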
i.e. a complete copy-pastable example that repros. Then we can use this to debug and as a test. I know narrowing down is not so fun :< but in order to fix these issues it's much better to have a simple example. Thanks!
I understand. I just wasn't entirely sure if it was okay to post a 10K-character string.
Can you cut this down?
I am not sure how I can give you a simpler example. I can reproduce the segmentation fault only with the file sliced from 0 to 9657. If I drop even just the first character ([1:len(log)]) I don't get the segfault, and there is also no segmentation fault if I take the file from 0 to 9656. The parser seems to detect that the file is corrupted, but when I try to read a certain number of corrupted files, at some point I get a segmentation fault. Unfortunately, this is all the information I have and this is the only way I could recreate the problem.
OK, this is reproducible. Thanks for the example.
If anyone is interested, cc @mcwitt.
The segfault with Python 2 is caused by the `Py_XDECREF(RDS(rds)->buffer);` line in the `del_rd_source` function in the io.c source file. The reference count for `rds->obj` is explicitly incremented in `new_rd_source()`, but I haven't found where the reference count for `rds->buffer` is incremented. Removing the `Py_XDECREF(RDS(rds)->buffer);` line in io.c allows the example code to run without a segfault. Does anyone know of a good reason to keep the call to `Py_XDECREF(RDS(rds)->buffer)` in the `del_rd_source` function?
@selasley I just looked this over. In the line which creates `result`:

```c
result = PyObject_CallObject(func, args);
```

This returns a new reference, so the reference count of this object should be 1. The problematic thing I'm seeing is actually this block:

```c
if (result == NULL) {
    PyGILState_Release(state);
    *bytes_read = 0;
    *status = CALLING_READ_FAILED;
    return NULL;
}
```

From first principles: If
To be on the safe side it would be better to always set
I put the call to `Py_XDECREF(RDS(rds)->buffer);` back in `del_rd_source()` and added the lines you suggested. The problem code runs without segfaulting and all tests pass in
Cool, I think just that one line
Will do. |
I am using read_csv to read some gzip-compressed log files. Some of these files are corrupted and cannot be decompressed.
At different iterations of the loop that reads these files, my script crashes with a core-dumped message:
*** Error in `/usr/bin/python': corrupted double-linked list: 0x0000000003836790 ***
or just:
Segmentation fault (core dumped)
This is a stripped-down version (just looping over one of the corrupted files) of the code where this error occurs:
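(The code block was dropped from this copy of the report; what follows is a minimal sketch reconstructed from the description, so the file name, iteration count, and header handling are assumptions, while delim_whitespace, the gzip compression, and the try/except around read_csv come from the surrounding text.)

```python
import pandas as pd

# Loop over the same corrupted gzip log many times; each call normally raises
# CParserError, but after some number of iterations the interpreter dies with
# "corrupted double-linked list" or a plain segmentation fault.
for i in range(1000):
    try:
        df = pd.read_csv('corrupted.log.gz',          # one of the corrupted logs
                         compression='gzip',
                         delim_whitespace=True,
                         header=None)
    except Exception as exc:
        print(i, exc)
```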
The traceback of the caught exception is:
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 498, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 285, in _read
return parser.read()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 747, in read
ret = self._engine.read(nrows)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1197, in read
data = self._reader.read(nrows)
File "pandas/parser.pyx", line 766, in pandas.parser.TextReader.read (pandas/parser.c:7988)
File "pandas/parser.pyx", line 788, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)
File "pandas/parser.pyx", line 842, in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)
File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)
File "pandas/parser.pyx", line 1833, in pandas.parser.raise_parser_error (pandas/parser.c:22649)
CParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.
If I remove the delim_whitespace argument, the loop completes without a segmentation fault. I tried adding low_memory=False, but the program still crashes.
I am using pandas version 0.17.1 on Ubuntu 14.04.
It looks similar to issue #5664, but that problem should have been resolved in v0.16.1.