to_csv() fail on 0.11.dev #3163

Closed
gdraps opened this issue Mar 25, 2013 · 14 comments · Fixed by #3166
Labels
Bug IO Data IO issues that don't fit into a more specific label

Comments

@gdraps
Contributor

gdraps commented Mar 25, 2013

Hit this after updating to '0.11.0.dev-da54321' from master. Haven't had a chance to dig any deeper, other than isolate frame length as a factor.

import pandas.util.testing  # makes pandas.util.testing.makeTimeDataFrame available

df = pandas.util.testing.makeTimeDataFrame(25000)
df.to_csv("save.csv")  # works
df = pandas.util.testing.makeTimeDataFrame(25001)
df.to_csv("save.csv")  # throws exception below

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-83-12cc25e3eafd> in <module>()
----> 1 df.to_csv("save.csv")

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0.dev_da54321-py2.7-linux-i686.egg/pandas/core/frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, nanRep, encoding, quoting, line_terminator, chunksize, **kwds)
   1348                                          index_label=index_label,
   1349                                          chunksize=chunksize,legacy=kwds.get("legacy",False) )
-> 1350             formatter.save()
   1351 
   1352     def to_excel(self, excel_writer, sheet_name='sheet1', na_rep='',

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0.dev_da54321-py2.7-linux-i686.egg/pandas/core/format.pyc in save(self)
    936 
    937             else:
--> 938                 self._save()
    939 
    940 

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0.dev_da54321-py2.7-linux-i686.egg/pandas/core/format.pyc in _save(self)
   1008                 break
   1009 
-> 1010             self._save_chunk(start_i, end_i)
   1011 
   1012     def _save_chunk(self, start_i, end_i):

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0.dev_da54321-py2.7-linux-i686.egg/pandas/core/format.pyc in _save_chunk(self, start_i, end_i)
   1029         ix = data_index.to_native_types(slicer=slicer, na_rep=self.na_rep, float_format=self.float_format)
   1030 
-> 1031         lib.write_csv_rows(self.data, ix, self.nlevels, self.cols, self.writer)
   1032 
   1033 # from collections import namedtuple

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0.dev_da54321-py2.7-linux-i686.egg/pandas/lib.so in pandas.lib.write_csv_rows (pandas/lib.c:13152)()

IndexError: list index out of range

Current workaround:

df.to_csv("save.csv", legacy=True)
@jreback
Contributor

jreback commented Mar 25, 2013

platform and numpy version?

@gdraps
Contributor Author

gdraps commented Mar 25, 2013

i386 GNU/Linux and numpy 1.6.2

@ghost

ghost commented Mar 25, 2013

22f258f fixed a very similar boundary+1 issue for MultiIndex, caused by the slicer arg being ignored
in MultiIndex.to_native_types. Maybe DatetimeIndex has the same issue?

@jreback
Contributor

jreback commented Mar 25, 2013

it's 1 chunk plus 1 row

will take a look
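
For readers following along, the "1 chunk plus 1 row" observation can be illustrated with a small sketch. It is not the pandas implementation: `chunk_bounds` is a hypothetical helper, and the chunk size of 25000 is an assumption inferred from the report above (25000 rows worked, 25001 failed).

```python
def chunk_bounds(nrows, chunksize):
    """Yield (start, end) row ranges that cover nrows in pieces of at most chunksize."""
    for start in range(0, nrows, chunksize):
        yield start, min(start + chunksize, nrows)


# Assumed chunk size; the real value used by to_csv is not stated in this thread.
CHUNKSIZE = 25000

# A frame that fits in one chunk is written in a single pass.
print(list(chunk_bounds(25000, CHUNKSIZE)))

# One extra row forces a second, one-row chunk, so the slicer now matters.
print(list(chunk_bounds(25001, CHUNKSIZE)))
```

Under that assumption, the failing frame is the smallest one for which a chunk's slice no longer covers the whole index.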

@ghost

ghost commented Mar 25, 2013

I got it.

core/index:to_native_types
        if self.is_all_dates:
            return _date_formatter(self)
        else:
            values[mask] = na_rep

should be

        if self.is_all_dates:
            return _date_formatter(self[slicer])
        else:
            values[mask] = na_rep
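
The effect of ignoring the slicer can be reproduced with a simplified, pure-Python model. Everything here is a hypothetical stand-in (the function names, the tiny chunk size, the list-based "index"), not pandas code; it only sketches why formatting the whole index instead of the current chunk would plausibly lead to an out-of-range lookup in the row writer.

```python
def to_native_types_buggy(index, slicer=None):
    # Stand-in for the reported behaviour: formats every label, ignores `slicer`.
    return [str(v) for v in index]


def to_native_types_fixed(index, slicer=None):
    # Stand-in for the proposed fix: format only the labels in the current chunk.
    values = index if slicer is None else index[slicer]
    return [str(v) for v in values]


def write_rows(labels, column):
    # Simplified writer: emits one row per label and looks up the matching value.
    return [(labels[j], column[j]) for j in range(len(labels))]


def save(index, column, chunksize, to_native_types):
    rows = []
    for start in range(0, len(index), chunksize):
        slicer = slice(start, min(start + chunksize, len(index)))
        labels = to_native_types(index, slicer)
        rows.extend(write_rows(labels, column[slicer]))
    return rows


index = ["2000-01-0%d" % d for d in range(1, 6)]  # 5 labels
column = [10, 20, 30, 40, 50]                     # 5 values
chunksize = 4                                     # 1 chunk plus 1 row

try:
    save(index, column, chunksize, to_native_types_buggy)
except IndexError as exc:
    print("buggy version failed:", exc)

print(save(index, column, chunksize, to_native_types_fixed))
```

In this model the buggy version hands the writer more labels than the chunk has data rows, which is consistent with the IndexError in the traceback above, although the actual code path inside lib.write_csv_rows is not shown in this thread.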

@jreback
Contributor

jreback commented Mar 25, 2013

by visual inspection

to_native_types in core/index

the is_all_dates branch returns the formatted dates for self; it should format values instead

will add a test and fix tomorrow

@gdraps thanks for the report

@ghost

ghost commented Mar 25, 2013

take it away jeff.... :)

@jreback
Contributor

jreback commented Mar 25, 2013

faster than me!

use values instead of self[slicer]
already computed

@ghost

ghost commented Mar 25, 2013

it's a view, isn't it? you do it. I'll beef up the torture test; I'm not sure I tested DatetimeIndex,
maybe just Timestamp objects.

@jreback
Contributor

jreback commented Mar 25, 2013

it's a view

@ghost

ghost commented Mar 25, 2013

test at 8386da9

@jreback
Contributor

jreback commented Mar 25, 2013

closed by #3166

@y-p I put in a separate test for this (marked slow), but pls merge yours as well

@jreback jreback closed this as completed Mar 25, 2013
@ghost

ghost commented Mar 25, 2013

Thanks, will do.

EDIT: 886c3c7

@ghost

ghost commented Mar 26, 2013

fyi, legacy=True has been replaced by engine='python', to be consistent with the c_parser convention.
