ValueError (Buffer dtype mismatch) on the join with categorical data #18646

Closed
vfilimonov opened this issue Dec 5, 2017 · 4 comments
Labels: Categorical (Categorical Data Type), Duplicate Report (Duplicate issue or pull request), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), Testing (pandas testing functions or related to the test suite)
Comments

@vfilimonov
Contributor

Code Sample, a copy-pastable example if possible

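import numpy as np
import pandas as pd

# Left frame: int64 column PN, index named IND
dfa = pd.Series(np.array([1, 2, 3, 4, 5], dtype=np.int64),
                index=pd.Int64Index([1, 2, 3, 4, 5], name='IND'), name='PN').to_frame()

# Right frame: categorical column CAT, index named PN
dfb = pd.Series([1, 1, 3, 1, 3], dtype='category',
                index=pd.Int64Index([1, 2, 7, 8, 9], name='PN'), name='CAT').to_frame()

# Left join on PN; the keys 3, 4 and 5 have no match in dfb
dfa.join(dfb, on='PN')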

Problem description

The code above (a join in which some keys have no match) raises ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'. The exception is apparently ignored, and the result is nevertheless correct:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'

Exception ValueError: "Buffer dtype mismatch, expected 'Python object' but got 'long long'" in 'pandas._libs.lib.is_bool_array' ignored

     PN  CAT
IND         
1     1  1.0
2     2  1.0
3     3  NaN
4     4  NaN
5     5  NaN

P.S. This may be related to #17187, though I'm not sure.
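
For readers trying this on a recent pandas: `pd.Int64Index` was removed in pandas 2.0, so the sample above will not run there as written. The following is a minimal sketch of the same join using `pd.Index(..., dtype="int64")` instead. The `matched` variable is only an illustrative name, and the sketch checks the shape of the result (which rows found a match), not whether the ignored `ValueError` is printed.

```python
import numpy as np
import pandas as pd

# Left frame: int64 column "PN", index named "IND".
dfa = pd.Series(
    np.array([1, 2, 3, 4, 5], dtype=np.int64),
    index=pd.Index([1, 2, 3, 4, 5], dtype="int64", name="IND"),
    name="PN",
).to_frame()

# Right frame: categorical column "CAT", index named "PN".
dfb = pd.Series(
    [1, 1, 3, 1, 3],
    dtype="category",
    index=pd.Index([1, 2, 7, 8, 9], dtype="int64", name="PN"),
    name="CAT",
).to_frame()

# Left join: values of dfa["PN"] are looked up in the index of dfb.
result = dfa.join(dfb, on="PN")

# Only the keys 1 and 2 exist in dfb's index; 3, 4 and 5 have no match.
matched = result["CAT"].notna()
print(result)
print(matched.tolist())
```

Rows whose `PN` value is absent from the index of `dfb` end up with a missing `CAT`, which matches the output reported above.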

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: 3.0.5
pip: 9.0.1
setuptools: 36.6.0
Cython: 0.25.2
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.6.0
bs4: 4.5.3
html5lib: 1.0b10
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.5.0

@gfyoung added the Categorical and Reshaping labels Dec 5, 2017
@jreback
Contributor

jreback commented Dec 6, 2017

This is a duplicate of #18064, already fixed in master by #18252.

I suppose we could add a test, though since the exception was suppressed in the first place, I'm not sure how we could 'test' this.
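
One possible way to test for a suppressed exception, offered as a sketch and not as what the pandas test suite does: on Python 3.8 and later, exceptions that the interpreter reports as "ignored" are passed to `sys.unraisablehook`, so a test can swap in its own hook and record them. This is not available on the Python 2.7 used in the report. The helper `collect_unraisable` and the `Noisy` class are hypothetical names introduced here, and the sketch assumes that the ignored exception from the compiled extension is reported through the same hook.

```python
import sys
from contextlib import contextmanager


@contextmanager
def collect_unraisable():
    """Record exceptions that Python would otherwise only report as 'ignored'."""
    events = []
    previous_hook = sys.unraisablehook

    def hook(unraisable):
        # Keep only the type and message; holding on to the exception
        # object itself can create reference cycles.
        events.append((unraisable.exc_type, str(unraisable.exc_value)))

    sys.unraisablehook = hook
    try:
        yield events
    finally:
        # Always restore whatever hook was installed before.
        sys.unraisablehook = previous_hook


class Noisy:
    """Stand-in for code whose exception cannot propagate (hypothetical)."""

    def __del__(self):
        raise ValueError("Buffer dtype mismatch (simulated)")


with collect_unraisable() as events:
    obj = Noisy()
    del obj  # CPython runs __del__ here; its exception is reported as ignored

print(events)
```

In a regression test, the body of the `with` block would be the `dfa.join(dfb, on='PN')` call, followed by an assertion that `events` is empty.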

@jreback added the Duplicate Report label Dec 6, 2017
@jreback added this to the 0.22.0 milestone Dec 6, 2017
@jreback added the Testing label Dec 6, 2017
@vfilimonov
Contributor Author

vfilimonov commented Dec 6, 2017 via email

@jreback closed this as completed Dec 6, 2017
@josesho

josesho commented Jan 5, 2018

In pandas 0.22.0, running the code sample still raises the error, now with an added line saying that the exception is ignored. The result of the join can still be saved and used afterwards.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'

Exception ignored in: 'pandas._libs.lib.is_bool_array'
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'

@jreback
Contributor

jreback commented Jan 5, 2018

This is already fixed in master, which will be released as 0.23.0.
