msgpack unpack dataframe error: InvalidIndexError #9618

Closed
jmavila opened this issue Mar 9, 2015 · 4 comments · Fixed by #10527

jmavila commented Mar 9, 2015

This happens with some pandas DataFrames that have an index.

from pandas import read_msgpack
msg = df.to_msgpack()
read_msgpack(msg) #==> exception!

Full Traceback (most recent call last):

  File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 163, in read_msgpack
    return read(fh)
  File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 141, in read
    l = list(unpack(fh))
  File "pandas/msgpack.pyx", line 662, in pandas.msgpack.Unpacker.next (pandas/msgpack.cpp:8303)
  File "pandas/msgpack.pyx", line 591, in pandas.msgpack.Unpacker._unpack (pandas/msgpack.cpp:7328)
  File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 490, in decode
    blocks = [create_block(b) for b in obj['blocks']]
  File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 488, in create_block
    placement=axes[0].get_indexer(b['items']))
  File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/core/index.py", line 1488, in get_indexer
    raise InvalidIndexError('Reindexing only valid with uniquely'
InvalidIndexError: Reindexing only valid with uniquely valued Index objects


jreback commented Mar 9, 2015

Please show the output of pd.show_versions(), a sample of the DataFrame, df.index, and df.index.is_unique.

@jreback jreback added the Msgpack label Mar 9, 2015

jmavila commented Mar 9, 2015

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 3.8.0-29-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.0

df.index

Int64Index([22, 29, 15, 11, 16, 43, 23, 30, 38, 24, 5, 9, 25, 7, 47, 4, 12, 26, 10, 17, 31, 39, 46, 6, 27, 18, 44, 41, 13, 8, 19, 20, 32, 52, 48, 0, 2, 3, 34, 35, 33, 49, 51, 1, 50], dtype='int64')

df.index.is_unique

result_df.index.is_unique == True

Sample dataframe

,d-37_id,d-37,m5010,m5010
22,29,201202,24330.47,24330.47
29,30,201203,56512.66,56512.66
15,31,201204,40812.07,40812.07
11,32,201205,115167.1,115167.1
16,33,201206,65513.84,65513.84
43,34,201207,20229.75,20229.75
23,35,201208,47008.03,47008.03
30,36,201209,69365.29,69365.29
38,37,201210,23771.42,23771.42
24,38,201211,61179.85,61179.85
5,39,201212,35658.55,35658.55
9,16,201301,162725.08,162725.08
25,17,201302,30242.13,30242.13
7,18,201303,100384.58,100384.58
47,19,201304,90338.54,90338.54
4,20,201305,75657.68,75657.68
12,21,201306,187771.41,187771.41
26,22,201307,70994.8,70994.8
10,23,201308,39255.37,39255.37
17,24,201309,232711.12,232711.12
31,25,201310,65818.43,65818.43
39,26,201311,184484.82,184484.82
46,27,201312,113374.1,113374.1
6,15,201401,129522.36,129522.36
27,40,201402,111226.45,111226.45
18,41,201403,64987.46,64987.46
44,42,201404,201649.92,201649.92
41,43,201405,44253.82,44253.82
13,44,201406,177624.59,177624.59
8,45,201407,51742.21,51742.21
19,62,201408,62129.06,62129.06
20,63,201409,196536.66,196536.66
32,64,201410,83913.08,83913.08
52,65,201411,6765.87,6765.87
48,66,201412,4585.0,4585.0
0,53,201501,282906.59,282906.59
2,54,201502,243472.38,243472.38
3,55,201503,150515.79,150515.79
34,56,201504,262266.21,262266.21
35,57,201505,57971.76,57971.76
33,58,201506,26562.02,26562.02
49,59,201507,53539.97,53539.97
51,70,201512,4231.5,4231.5
1,49,201601,120828.2,120828.2
50,71,201701,10448.92,10448.92
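
Note that the header row of the sample above lists the label m5010 twice, so the row index can be unique while the column labels are not. The following is a minimal sketch for spotting repeated labels in a CSV header using only the standard library; the helper name `duplicate_labels` is hypothetical and not part of pandas.

```python
import csv
import io
from collections import Counter


def duplicate_labels(csv_text):
    """Return header labels that occur more than once, in first-seen order.

    Hypothetical helper for illustration only; it is not a pandas function.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    counts = Counter(header)
    seen = []
    for label in header:
        if counts[label] > 1 and label not in seen:
            seen.append(label)
    return seen


# First two lines of the sample posted in this issue.
sample = ",d-37_id,d-37,m5010,m5010\n22,29,201202,24330.47,24330.47\n"
print(duplicate_labels(sample))
```

On a DataFrame already in memory, checking df.columns.is_unique alongside df.index.is_unique gives the same information.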


jreback commented Mar 9, 2015

This works in 0.15.2.

You can try upgrading (I recall this bug being fixed, but can't find the fix at the moment), or show an example that fails.

In [15]: pd.read_msgpack(df.to_msgpack())
Out[15]: 
    d-37_id    d-37      m5010    m5010.1
22       29  201202   24330.47   24330.47
29       30  201203   56512.66   56512.66


filmor commented Mar 13, 2015

This does not seem to be fixed and happens when the column index is not unique:

pd.read_msgpack(pd.DataFrame(columns=["a", "a"]).to_msgpack())

I thought I had reported this error, will try to find it.
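
The call that fails in the traceback is get_indexer on the axis holding the column labels, and the same error can be shown without msgpack at all. This is a sketch only, assuming a recent pandas in which the exception is importable as pandas.errors.InvalidIndexError (to_msgpack and read_msgpack themselves were removed in pandas 1.0, so the one-liner above only runs on older versions). The helper name `placement_or_error` is hypothetical.

```python
import pandas as pd
from pandas.errors import InvalidIndexError  # assumed location; differs in old pandas


def placement_or_error(index, items):
    """Return positions of `items` in `index`, or the error message if the
    index is not unique. Hypothetical helper for illustration only."""
    try:
        return index.get_indexer(items)
    except InvalidIndexError as exc:
        return str(exc)


# Duplicate labels, as in DataFrame(columns=["a", "a"]): get_indexer refuses.
result_dup = placement_or_error(pd.Index(["a", "a"]), ["a"])

# Unique labels: get_indexer returns integer positions.
result_unique = placement_or_error(pd.Index(["a", "b"]), ["b"])

print(result_dup)
print(result_unique)
```

This suggests why a unique row index, as reported earlier in the thread, does not rule out the error: the lookup is done against the column labels.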
