Skip to content

BUG: pickle failing on FrozenList, when using MultiIndex (GH4788) #4791

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 10, 2013

Conversation

jreback
Copy link
Contributor

@jreback jreback commented Sep 9, 2013

closes #4788

@jtratner
Copy link
Contributor

jtratner commented Sep 9, 2013

@jreback did you see my comment? FrozenList never made it into a release, so we can go way simpler and just define a __reduce__ method...

@jreback
Copy link
Contributor Author

jreback commented Sep 9, 2013

yep...I just fixed it...much simpler that way....already done...now just going to regenerate the test pickles again....

@jtratner
Copy link
Contributor

jtratner commented Sep 9, 2013

great :)

@jtratner
Copy link
Contributor

jtratner commented Sep 9, 2013

and maybe we should leave a note to self to regenerate the legacy pickle right before we actually make the next release?

@jreback
Copy link
Contributor Author

jreback commented Sep 9, 2013

maybe....I only need to do it if I am actually changing the generate_legacy_pickles.py (which I did :)

in theory it should be stable (but in this case the writing of the pickle actually is different)

@@ -59,14 +69,14 @@ def read_pickles(self, version):
vf = os.path.join(pth,f)
self.compare(vf)

def test_read_pickles_0_10_1(self):
self.read_pickles('0.10.1')
#def test_read_pickles_0_10_1(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you comment these out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mistake....will update in a couple

@jreback
Copy link
Contributor Author

jreback commented Sep 9, 2013

hmm....py3 still give me an issue even with reduce.....let me see

@jtratner
Copy link
Contributor

@jreback

Might be easier to just change this:

    def __reduce__(self):
        """Necessary for making this object picklable"""
        object_state = list(np.ndarray.__reduce__(self))
        subclass_state = (self.levels, self.labels, self.sortorder, self.names)
        object_state[2] = (object_state[2], subclass_state)
        return tuple(object_state)

to:

    def __reduce__(self):
        """Necessary for making this object picklable"""
        object_state = list(np.ndarray.__reduce__(self))
        subclass_state = (list(self.levels), list(self.labels), self.sortorder, list(self.names))
        object_state[2] = (object_state[2], subclass_state)
        return tuple(object_state)

and just not deal with having FrozenList in the pickle at all...

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

ok...will try that...set_state looks good then? (in core/index.py)

@jtratner
Copy link
Contributor

let me look. I'm having trouble testing this locally... I get this error: ValueError: unsupported pickle protocol: 3

@jtratner
Copy link
Contributor

actually, it's not quite right...one sec.

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

@jtratner ok...I updated with my hack it works now (I put in the changes to reduce by no dice)...if you can get it to work w/o the hacks great.....lmk.....I will add 32-bit stuff test pickles as welll

@jtratner
Copy link
Contributor

will do -- the pickle error was just a weird thing on my end...

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

@jtratner alright current version passes travis!

I have a good selection of files....0.12: 2.7/3 for 64-bit linux/AMD, 0.12-dev: 2.7/3.2 for 32-bit/64-bit linux, 0.11 only 3.3

if you have any more arch..pls add, just run: pandas/io/tests/generate_legacy.py, but you have to copy it to the location or it will pick up the default system python (e.g. a build location)...then put the file in the appropriate folder and push...

@jtratner
Copy link
Contributor

@jreback when I reverted just the compat/pickle_compat.py part of this commit, it still passed Travis... do we need the hack? (It worked locally for me too...) https://travis-ci.org/jtratner/pandas/builds/11175350

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

hmm
if that works great
maybe I had a test file that had a pickles frozen list but cleaned it up

let me try that

@jtratner
Copy link
Contributor

this still does leave FrozenNDArray in there for the labels. Is that problematic? Otherwise we could have __reduce__() generate list of lists for labels and levels...

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

when I take out the hack, py3 fails on the 0.12-dev files (e.g. the current ones)....

@jtratner
Copy link
Contributor

@jreback you know - I guess I thought that it just ran through pickles for the current version no matter what. Do I need to generate the data first?

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

@jtratner no...all the files are there. it runs thru all files in each of the directories, then compares to the data that is in the generate_legacy_pickles (which in theory should work as far back as one would want to go, e.g. its not dependent on compat type stuff for example)

@jtratner
Copy link
Contributor

@jreback I mean, does it test that round-tripping with a pickle works in the current version? (i.e., it generates pickles for current version into a temporary file, runs them and then deletes them)

@jtratner
Copy link
Contributor

i.e.,checks that no one checks in changes like the FrozenList change that causes pickles to stop working...

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

@jtratner no doesn't round trip test per se (maybe could add that)....maybe that is better for the current version I suppose

It checks all of the static files (which I had generated for the current version)

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

@jtratner I pushed a round-trip test (which passes), but fails if I remove the hack in pickle_compat....so something is still happenign with frozen list.....maybe it needs a setstate ?

@jtratner
Copy link
Contributor

@jreback yeah, maybe. Sorry about the confusion there - i'll go figure it out now.

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

cool...lmk

@jtratner
Copy link
Contributor

@jreback I can't reproduce your issue on 3.2, 3.3 or 2.7. In all of them, I can roundtrip a dataframe with a MultiIndex using to_pickle and read_pickle without any issue. For that __reduce__() to work, it means breaking any pickles from the time 033a932 was committed through to today (about the past month). But I don't think any of that was in an official release so that should be fine. Are you sure that you are removing all of the 0.12-dev pickles?

For example, if you make a pickle off of current master, then tried to use this, it would fail.

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

all is the files in 0.11/0.12/0.13 are brand new I regenerated them as part of this pr

I just call the geneate_legacy_pickles.py script with whatever version

@jtratner
Copy link
Contributor

here's the passing Travis - https://travis-ci.org/jtratner/pandas/builds/11177267

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

hmm maybe I have an old pyc hanging out (that has the frozen list inside)

ok
I'll take it out
or u can

so u need set state in frozen list?

@jtratner
Copy link
Contributor

nope, you don't need set_state. it just calls the first argument with the second argument. (I think).

So I'm pretty sure it basically it does (when reduce returns a 2-tuple):

f = FrozenList([1, 2, 3, 4])
reduced = f.__reduce__() # (FrozenList, ([1, 2, 3, 4],))
reduced[0](*reduced[1])

@jtratner
Copy link
Contributor

Just to be clear why or whatnot:

class Something(object):
    def __init__(self, a, b):
        print("Something init called")
        self.a = a
        self.b = b
    def __repr__(self):
        return repr(self.__dict__)
    def __reduce__(self):
        return (Something, (self.a, self.b))

class Other(object):
    def __init__(self, a):
        print("Other Init called")
        self.a = a
        self.b = 2
    def __reduce__(self):
        return (Other, (self.a, 2))

import pickle
s = Something(1, 2)
r = pickle.dumps(s, protocol=-1)
s2 = pickle.loads(r)
print(s)
print(s2)
o = Other(1)
r = pickle.dumps(o, protocol=-1)
r2 = pickle.loads(r) # causes a TypeError because 'Other' called with 2 args

@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

@jtratner I took the hack out and now works (I also took 0.13 fixed file testing out, now just a round-trip test is done).
No explicty testing on FrozenNDArray nor FrozenList, do we need?

@jtratner
Copy link
Contributor

FrozenList shouldn't matter, because it shouldn't be ending up in pickles
anyways. If you're doing a MultiIndex, than you're already pickling
FrozenNDArray.

INT: add __reduce__ to FrozenList for pickle support

INT: add 32-bit 2.7/3.2 pickles

INT: add AMD64 0.12.0 pickle

TST: add round-trip test for current version

INT: remove 0.13 testing
@jreback
Copy link
Contributor Author

jreback commented Sep 10, 2013

ok..ready to merge then?

jreback added a commit that referenced this pull request Sep 10, 2013
BUG: pickle failing on FrozenList, when using MultiIndex (GH4788)
@jreback jreback merged commit 04314e6 into pandas-dev:master Sep 10, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read_pickle error for multi-index: 'FrozenList' does not support mutable operations.
2 participants