read_pickle error for multi-index: 'FrozenList' does not support mutable operations. #4788

wcbeard · 2013-09-09T20:36:09Z

I'm in a bit of a pickle here. If I try to save and read back a multi-indexed dataframe, I get this error (and in some situations, which I can't reliably reproduce, I get a TypeError: Required argument 'shape' (pos 1) not found error instead).

The gist with the full traceback is here.

In [3]: import numpy as np

In [4]: import pandas as pd

In [5]: np.random.seed(10)

In [6]: a = np.random.randint(0, 20, (6, 5))

In [7]: df = pd.DataFrame(a).set_index([2,3,4])

In [8]: df
Out[8]:
           0   1
2  3  4
15 0  17   9   4
8  9  0   16  17
4  19 16  10   8
11 11 1    4  15
14 17 19   8   4
13 19 13  13   5

In [9]: df.to_pickle('~/Desktop/dummy.df')

In [10]: df2 = pd.read_pickle('~/Desktop/dummy.df')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
...
TypeError: 'FrozenList' does not support mutable operations.

When I reset_index it saves fine. I'm on Mac OSX, numpy 1.7, pandas 0.12.0-361-g53eec08.

In the meantime, if there's no quick fix, anyone know another way to save multi-indexes to file? csv doesn't look like it can preserve them. I could do something like appending _index to the columns and un-rename them after reading but would prefer something less hacky.


jreback · 2013-09-09T20:43:47Z

try with current master; this fixed the pickle issue: 725b195

jreback · 2013-09-09T20:45:11Z

you can save a mi for both index/columns via csv since 0.12, see here: http://pandas.pydata.org/pandas-docs/dev/io.html#reading-columns-with-a-multiindex

here for just a mi on the index (this has been available for a while): http://pandas.pydata.org/pandas-docs/dev/io.html#reading-an-index-with-a-multiindex
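For the index-only case, a minimal sketch of the CSV round trip might look like the following. The frame, its string column labels, and the in-memory buffer are illustrative assumptions, not taken from the report; the point is that the reader has to state which columns form the index via index_col.

```python
import io

import numpy as np
import pandas as pd

# Illustrative data with string labels (assumed, not from the original report).
rng = np.random.RandomState(10)
df = pd.DataFrame(rng.randint(0, 20, (6, 5)), columns=list("abcde"))
df = df.set_index(["c", "d", "e"])

# Write to an in-memory buffer instead of a file on disk.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)

# CSV does not record which columns were index levels, so name them here.
restored = pd.read_csv(buf, index_col=["c", "d", "e"])
```

Without the index_col argument, the three index levels would come back as ordinary data columns.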

wcbeard · 2013-09-09T20:55:07Z

pd.version.version tells me I'm on 0.12.0-361-g53eec08. Is that not the most recent commit? Is it possible it's really a different version installed? (I git pull'd today and did pip uninstall pandas + pip install . in the local repo).

jtratner · 2013-09-09T21:00:19Z

That said, I would not be surprised if the new MultiIndex setup needed to define some of the pickle magic methods, since we changed up its internal representation. Are there tests for roundtripping MI through pickle?

jreback · 2013-09-09T21:10:09Z

@jtratner should be from 0.10 on....I saved a pickled version from each (and 0.13)....

@d10genes do you see that commit I referenced (just git log and search for pickle); I think it was 3-4 days ago

@d10genes that error is a fall thru; I think there is another error

can you debug thru pdb and step thru read_pickle?

basically it tries the original pickle, then the fallback, then a version with an encoding, then fallback with encoding

are you using python 2.7 (or a 3.x)?
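The fallback order described above can be sketched roughly as follows. This is not pandas' actual implementation: load_with_fallbacks and its compat_loads parameter are hypothetical names, and compat_loads simply defaults to pickle.loads as a stand-in for a compatibility unpickler. It is written for Python 3, where pickle.loads accepts an encoding argument.

```python
import pickle


def load_with_fallbacks(data, compat_loads=None):
    """Try several loaders in order and return the first result that works.

    compat_loads is a hypothetical stand-in for a compatibility unpickler.
    """
    if compat_loads is None:
        compat_loads = pickle.loads

    attempts = [
        lambda: pickle.loads(data),                     # original pickle
        lambda: compat_loads(data),                     # fallback
        lambda: pickle.loads(data, encoding="latin1"),  # with an encoding
        lambda: compat_loads(data, encoding="latin1"),  # fallback with encoding
    ]

    last_error = None
    for attempt in attempts:
        try:
            return attempt()
        except Exception as exc:
            # Only the most recent failure is remembered.
            last_error = exc
    raise last_error
```

A chain like this reports only the error from the final attempt, which is one way a "fall thru" error can hide the failure that occurred first.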

jreback · 2013-09-09T21:10:27Z

@d10genes can you post a sample code?

wcbeard · 2013-09-09T21:12:20Z

@jreback This shows up in git log

commit 725b1951249a795fe01896dff4ce46bd9206021f
Merge: 2267fe4 0436809
Author: jreback <[email protected]>
Date:   Fri Sep 6 16:06:27 2013 -0700

    Merge pull request #4755 from jreback/pickle_compat

    BUG: TimeSeries compat from < 0.13

wcbeard · 2013-09-09T21:12:58Z

As far as sample code, how do you want it different from the OP?

jreback · 2013-09-09T21:18:18Z

@d10genes sorry...you put it up already....hold on

jreback · 2013-09-09T21:22:32Z

I have a test for a mi and a frame with several kinds of index...but of course not a mi....let me fix.

thanks for the report! (some of this pickle code was pretty tricky and was trying to cover all the bases!)
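A round-trip check for the missing case could look something like this sketch. The roundtrip helper and the temporary file name are assumptions for illustration and are not the test that was added to pandas; the data mirrors the example from the original report.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def roundtrip(obj):
    """Write obj with to_pickle and read it back with read_pickle."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "obj.pkl")  # hypothetical file name
        obj.to_pickle(path)
        return pd.read_pickle(path)


# Same construction as the report: a frame indexed by three columns.
rng = np.random.RandomState(10)
df = pd.DataFrame(rng.randint(0, 20, (6, 5))).set_index([2, 3, 4])

df2 = roundtrip(df)     # DataFrame with a MultiIndex
s2 = roundtrip(df[0])   # Series with a MultiIndex
```

Covering both the DataFrame and the Series matters here, since the two produced different errors in the report.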

wcbeard · 2013-09-09T21:29:25Z

Ok, thanks. And out of curiosity, would the underlying code for pickling treat Series and DataFrames differently? Going off of my above code, it looks like series MI pickling sets off that other error:

In [13]: s = df[0]

In [14]: s
Out[14]:
2   3   4
15  0   17     9
8   9   0     16
4   19  16    10
11  11  1      4
14  17  19     8
13  19  13    13
Name: 0, dtype: int64
In [16]: s.to_pickle('~/Desktop/s.df')
In [17]: s2 = pd.read_pickle('~/Desktop/s.df')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
...
TypeError: Required argument 'shape' (pos 1) not found

jreback · 2013-09-09T21:34:48Z

no...the problem is assigning the index to the pandas object, same code for all (but the deserialization of the MI is not 'defined' correctly)

jtratner · 2013-09-09T21:46:08Z

@jreback if you want me to look at it, I can... I have a hunch. (but you usually figure these things out quickly anyways :))

jreback · 2013-09-09T22:09:02Z

i have the test case set and will look in a few

jreback · 2013-09-09T22:52:39Z

@jtratner

ok..this branch, last commit shows the test case: https://github.com/jreback/pandas/tree/pickle_fix

a bunch of things are commented out...but this will fail on a disabled method in FrozenList

pickle is trying to extend the list....

a possible solution is to have a context manager for certain classes (e.g. FrozenList), or just a try/except/finally, which re-enables all/certain methods and then re-disables them.....

jtratner · 2013-09-09T23:20:41Z

@jreback yeah, I was noticing that... would it work to just define __reduce__() like this?

def __reduce__(self):
    return self.__class__, (list(self),)
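To see why this helps, here is a self-contained sketch using a hypothetical minimal FrozenList (a stand-in, not pandas' actual class). With protocol 2 and above, the default pickling of a list subclass restores items by calling extend or append on a fresh instance, which the disabled methods reject; __reduce__ rebuilds through the constructor instead.

```python
import pickle


class FrozenList(list):
    """Hypothetical stand-in: a list whose mutating methods are disabled."""

    def _disabled(self, *args, **kwargs):
        raise TypeError(
            "'%s' does not support mutable operations." % type(self).__name__
        )

    append = extend = insert = pop = remove = sort = reverse = _disabled
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _disabled


class PicklableFrozenList(FrozenList):
    def __reduce__(self):
        # Rebuild via the constructor so the unpickler never calls
        # append/extend on a partially built instance.
        return self.__class__, (list(self),)


# Default pickling: loading fails on the disabled mutator.
try:
    pickle.loads(pickle.dumps(FrozenList([1, 2, 3]), protocol=2))
    plain_roundtrip_ok = True
except TypeError:
    plain_roundtrip_ok = False

# With __reduce__, the round trip goes through the constructor.
restored = pickle.loads(pickle.dumps(PicklableFrozenList([1, 2, 3]), protocol=2))
```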

jtratner · 2013-09-09T23:24:16Z

btw - how come pickle.loads(pickle.dumps(df)) works, but not round tripping with read_pickle and to_pickle?

jtratner · 2013-09-09T23:28:15Z

@jreback yep, that resolves it - much simpler.

jtratner · 2013-09-09T23:30:23Z

only issue is whether we have to support legacy pickles from the time between the FrozenList addition and now.

wcbeard · 2013-09-10T01:23:17Z

@jreback I can step through tomorrow morning if it'd still be helpful. And I'm using 2.7

jreback · 2013-09-10T01:52:37Z

@d10genes thanks
I believe the PR #4791 fixes this
certain properties of the multi index were changed for 0.13 to make it immutable (it was supposed to be before, but it could be changed) and the resulting object didn't pickle properly

that said, glad you caught this; I was able to put some additional tests in place to ensure compat (which actually is a big deal as 0.13 changes a lot internally), including some hoops that needed jumping for Series (which is no longer a subclass of ndarray)

in any event will be merging soon (prob tomorrow) so keep a look out, @jtratner is fixing a couple of more things

also pls try out the csv features I mentioned above if u can

jreback · 2013-09-10T13:06:20Z

@d10genes all merged in, master should work for you now...thanks!

wcbeard · 2013-09-10T13:52:51Z

Awesome, that fixed it. Thanks a lot, I really appreciate it.

And regarding the csv features, I did try them out and they seemed to work, but I don't think it's possible with CSV to automatically encode which columns are indices (looks like pandas needs you to use the column names as read_csv parameters).

Pickling seems to take care of it all though. Thanks!

jreback · 2013-09-10T14:04:40Z

@d10genes great....yep...have to specify column names, unfortunately csv is not a roundtripable format (w/o some user parameters).
you might also try HDFStore, supports multi-indexes and is roundtripable as the meta data is stored

jreback mentioned this issue Sep 9, 2013

BUG: pickle failing on FrozenList, when using MultiIndex (GH4788) #4791

Merged

jreback closed this as completed in #4791 Sep 10, 2013
