API/DOC: status of low_memory kwarg of read_csv/table #5888

cancan101 · 2014-01-09T05:55:57Z

I am getting the following warning:

/usr/local/lib/python2.7/dist-packages/pandas-0.13.0rc1_78_g142ca62-py2.7-linux-x86_64.egg/pandas/io/parsers.py:1050: DtypeWarning: Columns (6) have mixed types. Specify dtype option on import or set low_memory=False.
  data = self._reader.read(nrows)

but I can find no documentation for low_memory.
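For readers landing here from the warning, the two remedies it names can be sketched as below. This is a minimal illustration and not taken from the report: the in-memory CSV, the column names `id` and `code`, and the variable names are all made up, and a file this small will not trigger the DtypeWarning itself. It only shows how each option is passed to read_csv.

```python
import io

import pandas as pd

# Hypothetical data: the "code" column mixes digit-only and text values.
csv_text = "id,code\n1,007\n2,abc\n3,42\n"

# Remedy 1: state the column type up front, so the parser never has to guess.
df_typed = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

# Remedy 2: pass low_memory=False, as the warning text suggests.
df_whole = pd.read_csv(io.StringIO(csv_text), low_memory=False)

print(df_typed["code"].tolist())
print(df_whole.dtypes)
```

Specifying dtype is usually the more explicit choice, since it also documents what the column is expected to contain.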


jreback · 2014-01-09T15:48:48Z

It's a kind of deprecated option (but it still works).

cancan101 · 2014-01-09T22:46:18Z

If the low_memory parameter is deprecated, it should be marked as such. Also the warning message should be removed or re-worded.

jreback · 2014-01-09T22:48:10Z

I said "kind of deprecated" in that I don't think it's necessary anymore.

randyzwitch · 2014-03-25T17:29:50Z

As another data point, I got this warning about mixed types. Setting to low_memory=False as suggested actually crashed Python (Win7 64-bit, through IPython Notebook). I'm nowhere near my memory limits, so removing the argument and keeping the warning was no big deal for me.

jorisvandenbossche · 2014-07-07T08:20:36Z

There is also an example in the book "Python for Data Analysis" that leads to this warning (p278):

In [13]: fec = pd.read_csv('ch09/P00000001-ALL.csv')

C:\Anaconda\lib\site-packages\pandas\io\parsers.py:1130: DtypeWarning: Columns (6) have mixed types. Specify dtype option on import or set low_memory=False.
  data = self._reader.read(nrows)

So I agree with the above:

Either the option is deprecated, and then the mention in the warning should be removed,
or the option is not deprecated, and then it should be documented (and should not crash your Python).

From the code, it does not seem to be deprecated (it is still used: https://github.com/pydata/pandas/blob/master/pandas/parser.pyx#L727), but it seems that it is given a default value of True in read_csv regardless of what you specify (https://github.com/pydata/pandas/blob/master/pandas/io/parsers.py#L354). There are also still tests specifically for high memory: https://github.com/pydata/pandas/blob/master/pandas/io/tests/test_parsers.py#L2897
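One way to check from the outside whether the keyword is still accepted is to read the same input with both settings and compare the results. The sketch below assumes a recent pandas where `pandas.testing.assert_frame_equal` is available; the CSV content and the `frames` name are invented for illustration. It says nothing about what the parser does internally, only that for a small, consistently typed input both values give the same frame.

```python
import io

import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical input with consistently typed columns.
csv_text = "id,value\n1,0.5\n2,1.5\n3,2.5\n"

# Read the same text once with each setting of low_memory.
frames = {
    flag: pd.read_csv(io.StringIO(csv_text), low_memory=flag)
    for flag in (True, False)
}

# Raises AssertionError if the two frames differ in values or dtypes.
assert_frame_equal(frames[True], frames[False])
```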

jorisvandenbossche · 2015-02-27T12:42:36Z

@mdmueller @selasley As csv parser experts, would either of you be interested in looking into this? What does low_memory do exactly? Do we still need it (should it be deprecated or not)? Depending on that, we should either document it or really deprecate it (and remove it as a suggestion in one of the warnings).

This came up again at SO: http://stackoverflow.com/questions/28697501/how-to-know-line-and-col-when-the-read-csv-method-of-pandas-thows-exception/28702078#28702078

jondo · 2016-11-09T16:53:35Z

Why is this not visible in the online documentation yet?

Should this documentation also be added for pandas.read_table (which has the same behavior)?

jreback · 2016-11-09T17:12:50Z

It's in the docs for 0.19.0 and later:
https://github.com/pandas-dev/pandas/pull/13293/files

jondo · 2016-11-11T15:38:06Z

Will this change also become visible in the pd.read_table documentation?

jorisvandenbossche · 2016-11-11T17:16:42Z

Ah, apparently there is something wrong with the read_csv page. It is still from 0.18.1, although the main docs under 'stable' are for 0.19.1. So, @jreback, apparently something went wrong when I uploaded the docs, and the generated pages were not updated.

jorisvandenbossche · 2016-11-11T18:53:08Z

@jondo docs should be fixed now (be sure to refresh your browser). Thanks for noticing!

jondo · 2016-11-11T19:28:50Z

It's me who has to thank you!
Yes, now the pages of read_csv and read_table both contain the added text.

jreback added Bug labels Mar 29, 2014

jreback modified the milestones: 0.15.0, 0.14.0 Mar 29, 2014

jorisvandenbossche modified the milestones: 0.15.0, 0.15.1 Jul 7, 2014

jreback modified the milestones: 0.15.0, 0.15.1 Sep 8, 2014

jorisvandenbossche changed the title ~~low_memory on read_table and read_csv is undocumented~~ API/DOC: status of low_memory kwarg of read_csv/table Feb 27, 2015

jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015

Medeah mentioned this issue Oct 31, 2015

Update parser.pyx #11491

Closed

chris-b1 mentioned this issue May 26, 2016

DOC: low_memory in read_csv #13293

Closed

2 tasks

jreback closed this as completed in 4b05055 May 26, 2016

diegoquintanav mentioned this issue Aug 3, 2018

low_memory=True in read_csv leads to non documented, silent errors #22194

Open
