BUG: .unstack() with recarray column raises TypeError since 1.4.0 #49388
Comments
Yes, I can confirm this error. Just printing your original DataFrame also fails:
import numpy as np
import pandas as pd
c = np.array([2] * 20, dtype='f8')
r = np.rec.fromarrays([c], names=['c'])
df = pd.DataFrame({'a': np.arange(20) // 5, 'b': list('ABCDE') * 4, 'c': r})
print(df)
gives an error:
Want to take a stab at this, @stefan-jansen?
I don't think we support record dtypes. These should be cast somewhere along the way.
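As long as record dtypes are not cast automatically, one way to side-step the problem is to hand pandas the recarray's named field instead of the recarray itself. The sketch below reuses the data from the comment above; the `set_index(['a', 'b'])` step is an assumption about the reshaping in the original report, whose reproducible example is not shown here.

```python
import numpy as np
import pandas as pd

c = np.array([2] * 20, dtype="f8")
r = np.rec.fromarrays([c], names=["c"])

# Pull the single named field out of the recarray so that pandas
# receives a plain float64 ndarray rather than a record dtype.
plain = np.asarray(r["c"])

df = pd.DataFrame({"a": np.arange(20) // 5, "b": list("ABCDE") * 4, "c": plain})

# Assumed reshaping step: index by (a, b), then pivot b into the columns.
wide = df.set_index(["a", "b"]).unstack()
print(wide)
```

This only helps when the record has fields that can each become an ordinary column; it does not change how pandas treats `np.record` itself.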
OK, thanks, that makes sense. It explains why several errors pop up when using them :-). I see that instantiating a Series from a recarray also gives an error:
The case with |
Looking further, we also can't use DataFrames (a multidim object) as single columns in a DataFrame:
In [1]: df = pd.DataFrame({"a": "a b a b".split(), "b": range(4)})
In [2]: pd.DataFrame({"a": df}, index=df.index)
ValueError: Data must be 1-dimensional
IMO we should disallow single columns being constructed from multidim objects (like recarrays and DataFrames), in order to keep things consistent. @jbrockmendel, do you agree?
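To make the proposal concrete, here is a rough sketch of the kind of check it implies. The helper `is_one_dimensional_column` is hypothetical and not part of pandas; it only illustrates treating DataFrames and record-dtype arrays as "multidim" inputs that should not become a single column.

```python
import numpy as np
import pandas as pd


def is_one_dimensional_column(values) -> bool:
    """Hypothetical check: True if `values` could act as one plain column."""
    # A DataFrame is two-dimensional by construction.
    if isinstance(values, pd.DataFrame):
        return False
    arr = np.asarray(values)
    # Structured/record dtypes carry named fields per element, so they
    # behave like several columns even when the array itself is 1-D.
    if arr.dtype.names is not None:
        return False
    return arr.ndim == 1


df = pd.DataFrame({"a": "a b a b".split(), "b": range(4)})
r = np.rec.fromarrays([np.array([2.0] * 4)], names=["c"])

print(is_one_dimensional_column(df))            # DataFrame input
print(is_one_dimensional_column(r))             # recarray input
print(is_one_dimensional_column(np.arange(4)))  # plain 1-D array
```

Under this rule, the DataFrame and the recarray would both be rejected up front, while the plain array would be accepted.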
It seems like the I couldn't find which change in 1.4.0 caused this change in behavior where |
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Starting with 1.4.0, including a column of dtype np.record (related to #48526) as follows:
raises a TypeError:
Expected Behavior
Before 1.4.0, the output of the same code shows that the recarray has dtype object and the unstack does not throw an error:
Installed Versions
INSTALLED VERSIONS
commit : 91111fd
python : 3.9.13.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-52-generic
Version : #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.1
numpy : 1.23.4
pytz : 2022.5
dateutil : 2.8.2
setuptools : 65.5.0
pip : 22.3
Cython : 0.29.32
pytest : 6.2.5
hypothesis : None
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.5.0
pandas_datareader: 0.10.0
bs4 : 4.11.1
bottleneck : 1.3.5
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.1
numba : None
numexpr : 2.8.4
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : 1.4.42
tables : 3.7.0
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None