Serialize/deserialize a Categorical whose values are taken from an enum #25448
Comments
Interesting proposal... couldn't you use our Likewise, when serializing, you could do a transform on the column before calling
My serialization/deserialization cycle is not simple: it covers several kinds of dataframes with many fields, so the complexity grows quickly. It would be nice to have this handled automatically. I tried using a converter
but then I lose the dtype for the column, as pandas warns that it will use only the converter.
and this is bad. If you tell me where to look, I can even have a try myself. I had a look, but the logic seems pretty complex between real_values, inferred ones, etc.
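One way to keep both the converter and the categorical dtype is to leave `dtype=` unset for that column and apply the dtype once parsing is done. The following is only a minimal sketch of that idea, not the code from this thread: the `Client` member, the `role` and `port` column names, and the helper `read_with_converter` are hypothetical, and a plain `Enum` is assumed.

```python
import enum
import io

import pandas as pd


class ConnectionRoles(enum.Enum):
    # Hypothetical members: the thread only names ConnectionRoles.Server.
    Client = 1
    Server = 2


def read_with_converter(csv_text):
    """Parse the 'role' column with a converter, then restore the dtype."""
    df = pd.read_csv(
        io.StringIO(csv_text),
        # The converter turns each raw cell (an enum member name) into a member.
        converters={"role": lambda name: ConnectionRoles[name]},
    )
    # Assumption: the dtype is applied after parsing instead of through
    # the `dtype=` argument, so the converter and the dtype do not compete.
    role_dtype = pd.CategoricalDtype(categories=list(ConnectionRoles), ordered=True)
    df["role"] = df["role"].astype(role_dtype)
    return df


frame = read_with_converter("role,port\nClient,1\nServer,2\n")
```

This keeps the per-column logic in one helper, at the cost of a second pass over the column after `read_csv` returns.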
@teto: Sorry for taking so long to respond here. Could you provide a complete code sample for what you're describing? That would be very helpful! Also, if you are interested, you can search for
This also affects
I am not sure I fully understand this issue. Is it the same problem described here? Do you want to specify a column's dtype as an ordered Categorical while doing
Code Sample, a copy-pastable example if possible
should run as standalone
which outputs
The value ConnectionRoles.Server became NaN through the serialization/deserialization process:
Problem description
I want to be able to serialize (to_csv) and then read back (read_csv) a CategoricalDtype that takes its values from a Python Enum (or IntEnum).
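As an illustration of the round trip being asked for, here is a hedged sketch of a manual workaround, not a pandas feature and not the code sample from this issue. It writes the enum member names to CSV, reads them back as a categorical of strings, and then renames the categories to the enum members. The `Client` member, the column names, and the helpers `to_csv_by_name` and `from_csv_by_name` are assumptions made for the example.

```python
import enum
import io

import pandas as pd


class ConnectionRoles(enum.Enum):
    # Hypothetical members: the issue only names ConnectionRoles.Server.
    Client = 1
    Server = 2


def to_csv_by_name(df, column):
    """Return CSV text in which the enum column is stored by member name."""
    out = df.copy()
    out[column] = [role.name for role in out[column]]
    return out.to_csv(index=False)


def from_csv_by_name(csv_text, column):
    """Read the CSV text and rebuild an ordered, enum-valued categorical."""
    names = [role.name for role in ConnectionRoles]
    name_dtype = pd.CategoricalDtype(categories=names, ordered=True)
    # First read the column as a categorical of plain strings.
    df = pd.read_csv(io.StringIO(csv_text), dtype={column: name_dtype})
    # Then swap each string category for the matching enum member.
    by_name = {role.name: role for role in ConnectionRoles}
    df[column] = df[column].cat.rename_categories(by_name)
    return df


original = pd.DataFrame(
    {"role": [ConnectionRoles.Client, ConnectionRoles.Server], "port": [80, 443]}
)
csv_text = to_csv_by_name(original, "role")
restored = from_csv_by_name(csv_text, "role")
```

Storing names rather than the members themselves keeps the CSV readable by other tools; the price is that both helpers must agree on the enum definition.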
Actually the dtype I use in my project (contrary to the toy example) is:
I've searched the tracker, and the most relevant issues (though still different) might be:
Expected Output
Output of pd.show_versions()
I am using v0.23.4 with a patch from master to fix some bug.
INSTALLED VERSIONS
commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.19.0
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
pandas: 0+unknown
pytest: None
pip: 18.1
setuptools: 40.6.3
Cython: None
numpy: 1.16.0
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.12
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml.etree: 4.2.6
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.14
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None