ERR: raise on invalid columns using a fixed HDFStore #13492
Comments
not really sure what you are doing. pls show an exact reproduction.
I am using the HDFStore interface. With your code snippet, please try and reset_index() on the returned frame, when the format="fixed"
I see. Well, that's not really supported; you must have strings for column names. We did a fix for tables IIRC. Want to do a pull request?
The index name is a string in the source data frame. The type changes to numpy.string_ when the frame is stored to HDF5 and read back. I have not done a pull request before. This would be my first. Will give it a shot.
The fixed format is not very respectful of attributes like this.
Hi! I'm at the sprints at PyCon and am looking to pick this up! Managed to reproduce the issue even though for the type I get:
In [27]: type(s1.index.name)
Out[27]: numpy.str_
instead of Same issue arises when reading the table with In terms of expected behavior, I'm not entirely certain what we want here - should we be casting the
Also can confirm that this doesn't happen with
@makmanalp yeah, I think the best thing to do would be to cast
On my python3 installation, I'm finding that
Ugh that's unfortunate. I guess we should know the encoding inside the HDF reader.
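A rough sketch of the casting idea discussed above, assuming the goal is to turn NumPy string scalars into plain Python str, and to decode bytes using a known encoding. The helper name ensure_plain_str and its encoding default are hypothetical and are not taken from pandas itself. It relies only on the fact that numpy.str_ subclasses str and numpy.bytes_ subclasses bytes.

```python
def ensure_plain_str(name, encoding="utf-8"):
    """Return `name` as a built-in str when it is string-like.

    Hypothetical helper for illustration only.
    - bytes (including numpy.bytes_) are decoded with `encoding`
    - str subclasses (including numpy.str_) are copied into a plain str
    - anything else (None, ints, tuples, ...) is returned unchanged
    """
    if isinstance(name, bytes):
        # Assumption: the caller knows which encoding the reader used.
        return name.decode(encoding)
    if isinstance(name, str) and type(name) is not str:
        # str() on a str subclass yields an exact built-in str.
        return str(name)
    return name
```

With NumPy installed, type(ensure_plain_str(numpy.str_("cols"))) should be the built-in str, while a name of None passes through untouched.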
Single-file example for easy reproduction:

import pandas as pd
import numpy as np
import datetime

idx = pd.Index(pd.to_datetime([datetime.date(2000, 1, 1), datetime.date(2000, 1, 2)]), name='cols')
idx1 = pd.Index(pd.to_datetime([datetime.date(2010, 1, 1), datetime.date(2010, 1, 2)]), name='rows')
s = pd.DataFrame(np.arange(4).reshape(2,2), columns=idx, index=idx1)

with pd.HDFStore("test.h5", "w") as store:
    store.put("test", s, "fixed")

with pd.HDFStore("test.h5", "r") as store:
    s1 = store["test"]

# s1.reset_index()
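On versions affected by this issue, a user-side workaround might be to rewrite the axis names as plain str before calling reset_index(). The sketch below is only an illustration: it simulates the round-tripped frame by building one with numpy.str_ names directly (so no HDF5 file is needed), and the function name normalize_axis_names is hypothetical.

```python
import numpy as np
import pandas as pd


def normalize_axis_names(frame):
    """Return a copy of `frame` whose index/column names are built-in str."""
    def to_plain(name):
        # numpy.str_ subclasses str; str() gives back an exact built-in str.
        return str(name) if isinstance(name, str) else name

    out = frame.copy()
    out.index = out.index.set_names([to_plain(n) for n in out.index.names])
    out.columns = out.columns.set_names([to_plain(n) for n in out.columns.names])
    return out


# Simulated stand-in for the frame read back from a fixed-format store.
cols = pd.Index(pd.to_datetime(["2000-01-01", "2000-01-02"]), name=np.str_("cols"))
rows = pd.Index(pd.to_datetime(["2010-01-01", "2010-01-02"]), name=np.str_("rows"))
s1 = pd.DataFrame(np.arange(4).reshape(2, 2), columns=cols, index=rows)

cleaned = normalize_axis_names(s1)
flat = cleaned.reset_index()
```

The original frame is left untouched; only the copy gets new names, so the workaround can be dropped once the installed pandas includes the fix.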
So, I just made a PR, it's just a first stab at the issue but hopefully it's in the right direction! Please let me know how happy you are with this fix and what I can do to get it release-ready!
…ndas-dev#16444) * BUG: Handle numpy strings in index names in HDF5 pandas-dev#13492 * REF: refactor to _ensure_str (cherry picked from commit 18c316b)
Version 0.20.2
* tag 'v0.20.2': (68 commits)
  RLS: v0.20.2
  DOC: Update release.rst
  DOC: Whatsnew fixups (pandas-dev#16596)
  ERRR: Raise error in usecols when column doesn't exist but length matches (pandas-dev#16460)
  BUG: convert numpy strings in index names in HDF pandas-dev#13492 (pandas-dev#16444)
  PERF: vectorize _interp_limit (pandas-dev#16592)
  DOC: whatsnew 0.20.2 edits (pandas-dev#16587)
  API: Make is_strictly_monotonic_* private (pandas-dev#16576)
  BUG: reimplement MultiIndex.remove_unused_levels (pandas-dev#16565)
  Strictly monotonic (pandas-dev#16555)
  ENH: add .ngroup() method to groupby objects (pandas-dev#14026) (pandas-dev#14026)
  fix linting
  BUG: Incorrect handling of rolling.cov with offset window (pandas-dev#16244)
  BUG: select_as_multiple doesn't respect start/stop kwargs GH16209 (pandas-dev#16317)
  return empty MultiIndex for symmetrical difference on equal MultiIndexes (pandas-dev#16486)
  BUG: Bug in .resample() and .groupby() when aggregating on integers (pandas-dev#16549)
  BUG: Fixed tput output on windows (pandas-dev#16496)
  Strictly monotonic (pandas-dev#16555)
  BUG: fixed wrong order of ordered labels in pd.cut()
  BUG: Fixed to_html ignoring index_names parameter
  ...
Code Sample
Expected Output
output of pd.show_versions()
# Problem occurs in 0.16.2 and 0.18.1