BUG/API: can't pass parameters to csv module via df.to_csv #4528

brechea · 2013-08-09T21:05:54Z

Trying to print a data frame as plain, strict TSV (i.e., no quoting and no escaping, because I know none of the fields will contain tabs), I wanted to use the "quoting" option, which is documented in pandas and is passed through to csv, as well as the "quotechar" option, which is not documented in pandas but is also a csv option. But it doesn't work:

In [1]: import sys, csv

In [2]: from pandas import DataFrame

In [3]: data = {'col1': ['contents of col1 row1', 'contents " of col1 row2'], 'col2': ['contents of col2 row1', 'contents " of col2 row2'] }

In [4]: df = DataFrame(data)

In [5]: df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)
        col1    col2
0       contents of col1 row1   contents of col2 row1
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
<ipython-input-5-a30d32266fb4> in <module>()
----> 1 df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, nanRep, encoding, quoting, line_terminator, chunksize, tupleize_cols, **kwds)
   1409                                      tupleize_cols=tupleize_cols,
   1410                                      )
-> 1411         formatter.save()
   1412
   1413     def to_excel(self, excel_writer, sheet_name='sheet1', na_rep='',

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in save(self)
    974
    975             else:
--> 976                 self._save()
    977
    978

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save(self)
   1080                 break
   1081
-> 1082             self._save_chunk(start_i, end_i)
   1083
   1084     def _save_chunk(self, start_i, end_i):

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/core/format.pyc in _save_chunk(self, start_i, end_i)
   1098         ix = data_index.to_native_types(slicer=slicer, na_rep=self.na_rep, float_format=self.float_format)
   1099
-> 1100         lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
   1101
   1102 # from collections import namedtuple

/home/brechea/.local/lib/python2.6/site-packages/pandas-0.12.0-py2.6-linux-x86_64.egg/pandas/lib.so in pandas.lib.write_csv_rows (pandas/lib.c:13871)()

Error: need to escape, but no escapechar set

Adding the parameter

quotechar=kwds.get("quotechar")

to the

formatter = fmt.CSVFormatter(...

call in to_csv(), and making the corresponding changes to format.CSVFormatter's __init__() and save(), produces the expected output:

In [1]: import sys, csv

In [2]: from pandas import DataFrame

In [3]: data = {'col1': ['contents of col1 row1', 'contents " of col1 row2'], 'col2': ['contents of col2 row1', 'contents " of col2 row2'] }

In [4]: df = DataFrame(data)

In [5]: df.to_csv(sys.stdout, sep='\t', quoting=csv.QUOTE_NONE, quotechar=None)
        col1    col2
0       contents of col1 row1   contents of col2 row1
1       contents " of col1 row2 contents " of col2 row2

i.e., unescaped, unquoted tsv.
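The same behavior can be reproduced with the csv module alone, which suggests the error comes from the writer still treating the default quotechar as special. The following is a minimal sketch in Python 3 (the session above used Python 2.6); `write_row` is a hypothetical helper introduced only for this illustration, and the exact error text may vary between Python versions.

```python
import csv
import io

# One row containing the default quote character ('"') but no tabs.
row = ['contents " of col1 row2', 'contents " of col2 row2']


def write_row(**fmtparams):
    """Write `row` as one tab-separated line and return the text produced.

    Hypothetical helper for illustration; extra keyword arguments are
    forwarded to csv.writer as formatting parameters.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter="\t",
        lineterminator="\n",
        quoting=csv.QUOTE_NONE,
        **fmtparams
    )
    writer.writerow(row)
    return buf.getvalue()


# With the default quotechar, the writer wants to escape the '"' in the
# data and has no escapechar to do it with, so it is expected to raise.
try:
    write_row()
except csv.Error as exc:
    print("csv.Error:", exc)

# With quotechar=None, '"' is an ordinary character and is written as-is.
print(write_row(quotechar=None), end="")
```

This is why QUOTE_NONE on its own is not enough for the data above: quotechar has to reach the csv writer as well.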

More generally, there could be many reasons to want more control over the underlying csv writer, so a generic mechanism (as opposed to adding each parameter one by one) might be called for, e.g., allowing for a csv dialect object or at least a dictionary holding dialect attributes.
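As a rough sketch of what such a generic mechanism could look like, the example below forwards either a dialect object or a dictionary of dialect attributes straight to csv.writer. It uses only the standard library; `StrictTSV` and `write_table` are hypothetical names made up for this illustration and are not part of pandas.

```python
import csv
import io


class StrictTSV(csv.Dialect):
    """Hypothetical dialect: tab-separated, no quoting, no escaping."""
    delimiter = "\t"
    quotechar = None
    escapechar = None
    doublequote = False
    skipinitialspace = False
    lineterminator = "\n"
    quoting = csv.QUOTE_NONE


def write_table(buf, header, rows, dialect="excel", **fmtparams):
    """Write a header and rows to `buf`.

    Hypothetical helper: `dialect` and any extra keyword arguments are
    handed to csv.writer unchanged, so no csv option needs its own
    dedicated parameter here.
    """
    writer = csv.writer(buf, dialect=dialect, **fmtparams)
    writer.writerow(header)
    writer.writerows(rows)


header = ["col1", "col2"]
rows = [
    ["contents of col1 row1", "contents of col2 row1"],
    ['contents " of col1 row2', 'contents " of col2 row2'],
]

# Variant 1: pass a dialect object.
out_dialect = io.StringIO()
write_table(out_dialect, header, rows, dialect=StrictTSV)

# Variant 2: pass a dictionary holding dialect attributes.
tsv_params = {
    "delimiter": "\t",
    "quoting": csv.QUOTE_NONE,
    "quotechar": None,
    "lineterminator": "\n",
}
out_params = io.StringIO()
write_table(out_params, header, rows, **tsv_params)

print(out_dialect.getvalue(), end="")
```

Both variants should produce the same unquoted, unescaped output; explicit keyword arguments override the corresponding attributes of the dialect when both are given.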

jreback · 2013-08-10T12:37:59Z

yep...would be nice to add this parameter (and you are right, dialect would also be nice to pass, which, if not None, could control the values of the other params). Can you do a PR to add those (with tests!)?

also I believe the doc string needs to be updated in to_csv and the docs in io.rst.

thanks!

brechea · 2013-08-13T18:08:10Z

I'm not sure what a PR is, but I assume a pull request, given this is github? I confess I don't have a git repo of pandas, and did the above very quickly. I simply hacked the above mentioned modules directly so at least I could demonstrate the before and after behavior. I'll look into doing things properly, but it may take a while.

jreback · 2013-08-13T18:12:39Z

have a look at this: http://pandas.pydata.org/developers.html

jreback · 2013-09-20T17:22:27Z

@brechea giving this a shot?

jreback · 2013-09-30T12:39:06Z

@brechea ping!

jreback · 2014-02-18T19:46:30Z

this is closed by #5414

@brechea pls try this again on master...if you are still experiencing issues...let us know

jreback closed this as completed Sep 21, 2013

jreback reopened this Sep 21, 2013

patricktokeeffe mentioned this issue Jan 31, 2014

ENH/BUG: pass formatting params thru to to_csv #5414

Merged

jreback closed this as completed Feb 18, 2014
