always use UnicodeWriter for csv, default to utf-8 #2006

ghost · 2012-10-02T02:02:58Z

See the long commit message for the rationale.

Maybe closes #1966.

If the input is NOT pure ASCII and no encoding is specified, the Python stdlib csv module will die. If the input IS pure ASCII, then using UnicodeWriter with utf-8 as the encoding will produce the same end result as a pure ASCII writer. This change will "just work" for more cases.

Also, presumably, internal representations of all text in pandas will eventually be unicode, so this meshes with that program too.

There might be a performance issue for large files (is the Python csv module native?). If so, I think this is still the way to go, with the stdlib csv module becoming the optional path.

A lot of issues have touched on csv and unicode; see #206, #300, #680, #705, #1966, probably more.
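To illustrate the ASCII/UTF-8 claim above, here is a minimal sketch. It assumes Python 3 and uses only the stdlib, so it is not the UnicodeWriter from this patch; `write_csv_bytes` is a hypothetical helper made up for the example. It checks that pure-ASCII rows come out byte-for-byte identical under both encodings, and that only the ASCII path fails on non-ASCII rows.

```python
import csv
import io


def write_csv_bytes(rows, encoding="utf-8"):
    """Hypothetical helper: serialize rows to CSV text, then encode to bytes."""
    buf = io.StringIO(newline="")  # newline="" leaves the csv module's line endings untouched
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode(encoding)


ascii_rows = [["name", "city"], ["Ann", "Oslo"]]
mixed_rows = [["name", "city"], ["Zoë", "Kraków"]]

# Pure-ASCII input: the UTF-8 and ASCII outputs are the same bytes.
same_for_ascii = (
    write_csv_bytes(ascii_rows, "utf-8") == write_csv_bytes(ascii_rows, "ascii")
)

# Non-ASCII input: UTF-8 succeeds, while an ASCII-only encoder raises.
utf8_bytes = write_csv_bytes(mixed_rows, "utf-8")
try:
    write_csv_bytes(mixed_rows, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

The sketch says nothing about the performance question; it only shows why defaulting to utf-8 should not change output for callers whose data is already ASCII.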

ghost · 2012-10-02T12:43:00Z

Should be OK now. The patch dovetails with the series in #2005.
Is this OK, or should there be a keyword to force the native csv writer?

ghost · 2012-10-11T15:55:49Z

withdrawn. bad idea.

ghost mentioned this pull request Oct 9, 2012

More Unicode, factor out pprinting of labels and names #2005

Merged

ghost closed this Oct 11, 2012

This pull request was closed.
