Msgpack - ValueError: buffer source array is read-only #11880


Closed
jeetjitsu opened this issue Dec 21, 2015 · 5 comments · Fixed by #12013
@jeetjitsu
Contributor

I get a ValueError when processing data with pandas. I followed these steps:

  1. convert to msgpack format with compress flag
  2. subsequently read file into a dataframe
  3. push to sql table with to_sql

On the third step I get ValueError: buffer source array is read-only.

This problem does not arise if I wrap the read_msgpack call inside a pandas.concat.

Example

import pandas as pd
import numpy as np

from sqlalchemy import create_engine

eng = create_engine("sqlite:///:memory:")

df1 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20130102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': 'foo'})

df1.to_msgpack('test.msgpack', compress='zlib')
df2 = pd.read_msgpack('test.msgpack')

df2.to_sql('test', eng, if_exists='append', chunksize=1000) # throws value error

df2 = pd.concat([pd.read_msgpack('test.msgpack')])

df2.to_sql('test', eng, if_exists='append', chunksize=1000) # works

This happens with both blosc and zlib compression. While I have found a workaround, this behaviour seems very odd, and for very large files the extra concat adds a small performance hit.
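As a lighter alternative to the concat wrapper, copying the deserialized frame should have the same effect. A minimal sketch of the underlying situation (assumption: the msgpack reader builds its arrays with np.frombuffer over the payload, leaving the blocks read-only; the read_msgpack call is replaced here by a hand-built read-only array):

```python
import numpy as np
import pandas as pd

# Simulate what the msgpack round-trip leaves behind: np.frombuffer over an
# immutable bytes payload yields a read-only array, which then backs the frame.
payload = np.arange(8, dtype='int64').tobytes()
values = np.frombuffer(payload, dtype='int64').reshape(2, 4)
assert not values.flags.writeable

df2 = pd.DataFrame(values)

# Copying materializes fresh, writeable blocks -- the same effect as wrapping
# the read in pd.concat([...]), but more direct:
df2 = df2.copy()
```

With a copy in place, the subsequent df2.to_sql(...) should no longer hit the read-only source buffer.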

edit: @TomAugspurger changed the sql engine to sqlite

@jreback
Contributor

jreback commented Dec 21, 2015

pls pd.show_versions()

@TomAugspurger
Contributor

Replace eng = create_engine("mysql+mysqldb://user:pass@localhost/dbname") with eng = create_engine("sqlite:///:memory:") to make this easier to reproduce (it still raises).

@jeetjitsu
Contributor Author

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.2.0-21-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_IN

pandas: 0.17.1
nose: 1.3.7
pip: 7.1.2
setuptools: 18.2
Cython: 0.23.4
numpy: 1.10.2
scipy: 0.16.1
statsmodels: None
IPython: 4.0.1
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: 1.2.8
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.10
pymysql: None
psycopg2: None
Jinja2: None

@jreback jreback added this to the Next Major Release milestone Dec 26, 2015
@jreback
Contributor

jreback commented Dec 26, 2015

I think we need to tell numpy to take ownership of the data, maybe np.array with copy=False around the np.frombuffer. @shoyer how does one normally do this?

In [9]: df2._data.blocks[0].values.flags
Out[9]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : False
  UPDATEIFCOPY : False

In [10]: df1._data.blocks[0].values.flags
Out[10]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
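For reference, np.frombuffer over an immutable buffer always produces a read-only view, and np.array with copy=False preserves that view (flags included); only an actual copy yields an array that owns writeable memory. A small illustration (plain numpy, not pandas internals):

```python
import numpy as np

buf = np.arange(4, dtype='int64').tobytes()       # bytes are immutable

view = np.frombuffer(buf, dtype='int64')          # zero-copy view
print(view.flags.owndata, view.flags.writeable)   # False False

alias = np.array(view, copy=False)                # still the same buffer
print(alias.flags.writeable)                      # False

owned = np.array(view)                            # default copies the data
print(owned.flags.owndata, owned.flags.writeable) # True True
owned[0] = 99                                     # writes are now allowed
```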

@kawochen
Contributor

df2['E'].a exhibits the bug as well.

@jreback jreback modified the milestones: 0.18.0, Next Major Release Jan 11, 2016