Code Sample, a copy-pastable example if possible
# Assumes numpy and pandas are already imported as np and pd.
In [1]: data1 = pd.DataFrame(np.arange(20).reshape((4, 5)) + 1, columns=['a', 'b', 'c', 'd', 'e'])
In [2]: data2 = pd.DataFrame(np.arange(20).reshape((5, 4)) + 1, columns=['a', 'b', 'x', 'y'])
In [3]: import pyarrow as pa
In [4]: d1 = pa.deserialize(pa.serialize(data1).to_buffer())
In [5]: d2 = pa.deserialize(pa.serialize(data2).to_buffer())
In [6]: d1.merge(d2)
Problem description
The above code raises an exception:
In [7]: d1.merge(d2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-f852b96f603a> in <module>
----> 1 d1.merge(d2)

~/pandas/pandas/core/frame.py in merge(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
   7261             copy=copy,
   7262             indicator=indicator,
-> 7263             validate=validate,
   7264         )
   7265

~/pandas/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     82         validate=validate,
     83     )
---> 84     return op.get_result()
     85
     86

~/pandas/pandas/core/reshape/merge.py in get_result(self)
    625             self.left, self.right = self._indicator_pre_merge(self.left, self.right)
    626
--> 627         join_index, left_indexer, right_indexer = self._get_join_info()
    628
    629         ldata, rdata = self.left._data, self.right._data

~/pandas/pandas/core/reshape/merge.py in _get_join_info(self)
    842             )
    843         else:
--> 844             (left_indexer, right_indexer) = self._get_join_indexers()
    845
    846         if self.right_index:

~/pandas/pandas/core/reshape/merge.py in _get_join_indexers(self)
    821         """ return the join indexers """
    822         return _get_join_indexers(
--> 823             self.left_join_keys, self.right_join_keys, sort=self.sort, how=self.how
    824         )
    825

~/pandas/pandas/core/reshape/merge.py in _get_join_indexers(left_keys, right_keys, sort, how, **kwargs)
   1285
   1286     # get left & right join labels and num. of levels at each location
-> 1287     llab, rlab, shape = map(list, zip(*map(fkeys, left_keys, right_keys)))
   1288
   1289     # get flat i8 keys from label lists

~/pandas/pandas/core/reshape/merge.py in _factorize_keys(lk, rk, sort)
   1882     rizer = klass(max(len(lk), len(rk)))
   1883
-> 1884     llab = rizer.factorize(lk)
   1885     rlab = rizer.factorize(rk)
   1886

~/pandas/pandas/_libs/hashtable.pyx in pandas._libs.hashtable.Int64Factorizer.factorize()
    109         return self.count
    110
--> 111     def factorize(self, int64_t[:] values, sort=False,
    112                   na_sentinel=-1, na_value=None):
    113         """

~/pandas/pandas/_libs/hashtable.cpython-37m-darwin.so in View.MemoryView.memoryview_cwrapper()

~/pandas/pandas/_libs/hashtable.cpython-37m-darwin.so in View.MemoryView.memoryview.__cinit__()

ValueError: buffer source array is read-only
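To illustrate what "read-only" means for the arrays involved, here is a minimal NumPy-only sketch. It does not use pyarrow: an immutable `bytes` object stands in for the deserialized buffer, which is an assumption made for illustration and not necessarily how `pa.deserialize` builds its arrays. The variable names are hypothetical.

```python
import numpy as np

# Assumption: an immutable bytes object stands in for a deserialized buffer.
raw = np.arange(5, dtype=np.int64).tobytes()

# np.frombuffer makes a zero-copy view over the bytes, so the resulting
# array cannot be written to.
ro = np.frombuffer(raw, dtype=np.int64)
print("writeable:", ro.flags.writeable)

# In-place modification of a read-only array raises ValueError.
try:
    ro[0] = 42
    write_failed = False
except ValueError:
    write_failed = True

# Copying allocates fresh memory owned by the new array, which is writeable.
rw = ro.copy()
rw[0] = 42
```

Code that requests a writable view of `ro` is rejected in a similar way, which may be what happens inside `Int64Factorizer.factorize` in the traceback above.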
Expected Output
Deep-copying both frames first, d1.copy(deep=True).merge(d2.copy(deep=True)), gives the correct result:
In [10]: d1.copy(deep=True).merge(d2.copy(deep=True))
Out[10]:
   a  b  c  d  e  x  y
0  1  2  3  4  5  3  4
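The deep-copy workaround can be sketched without pyarrow as follows. The helper `make_read_only_frame` is hypothetical and only approximates the deserialized frames by building them from arrays marked read-only. Whether merging the un-copied frames fails depends on the pandas version, so the sketch only exercises the copied path.

```python
import numpy as np
import pandas as pd


def make_read_only_frame(values, columns):
    """Hypothetical helper: build a DataFrame from a read-only ndarray."""
    arr = np.array(values)        # private copy, so the caller's data is untouched
    arr.setflags(write=False)     # mark the array as read-only
    return pd.DataFrame(arr, columns=columns)


d1 = make_read_only_frame(np.arange(20).reshape((4, 5)) + 1, ["a", "b", "c", "d", "e"])
d2 = make_read_only_frame(np.arange(20).reshape((5, 4)) + 1, ["a", "b", "x", "y"])

# Deep copies own their data, so the merge operates on fresh arrays.
# With no `on` argument, merge joins on the shared columns 'a' and 'b'.
merged = d1.copy(deep=True).merge(d2.copy(deep=True))
print(merged)
```

Only the row with a=1, b=2 appears in both frames, so the inner join yields a single row.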
Output of pd.show_versions()
I'm working with pandas master, so pd.show_versions() doesn't work. The git commit hash is a818281a45f7b5bd24f050e5d6868894c5108db6 (the latest commit on the master branch as of 2019-08-16).