Raise exception on non-unique column index in to_hdf for fixed format. #7788

Merged: 1 commit, Jul 21, 2014

filmor
Contributor

@filmor filmor commented Jul 18, 2014

Fixes #7761.

@jreback
Contributor

jreback commented Jul 18, 2014

Hmm, maybe make this a NotImplementedError (as it's not invalid, just not supported). Also maybe add a note in io.rst (in the Fixed section).

@jreback jreback added this to the 0.15.0 milestone Jul 18, 2014
@filmor
Contributor Author

filmor commented Jul 18, 2014

Well, with the current fixed format it is in fact invalid. There is no way to get this right without changing block_items to refer to columns by integer, i.e. by their position on the zeroth axis.
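The ambiguity described here can be illustrated with a minimal, pandas-free sketch. The names `columns`, `values`, `block_items_by_label`, `block_items_by_position` and `positions_for` are hypothetical stand-ins for illustration only, not the actual HDFStore internals.

```python
# Hypothetical stand-ins: `columns` plays the role of the column axis,
# `values` holds one list of data per column, in the same order.
columns = ["a", "b", "a"]          # non-unique column labels
values = [[1, 2], [3, 4], [5, 6]]

# Label-based references: each stored item names its column by string.
block_items_by_label = ["a", "b", "a"]


def positions_for(label, axis):
    """Return every position in `axis` whose label equals `label`."""
    return [i for i, name in enumerate(axis) if name == label]


# Resolving a label back to a column is ambiguous when labels repeat.
ambiguous = {label: positions_for(label, columns) for label in block_items_by_label}

# Position-based references: each stored item names its column by integer.
block_items_by_position = [0, 1, 2]
resolved = [values[i] for i in block_items_by_position]

print(ambiguous)  # the label "a" maps to two positions
print(resolved)   # every position maps to exactly one column
```

With string labels, a reader cannot tell which of the two "a" columns a stored item belongs to; integer positions would avoid that, at the cost of changing the stored layout.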

@jreback
Contributor

jreback commented Jul 18, 2014

It could be implemented (it just isn't). You are preventing the writing (which invalidates the reading) either way. A NotImplementedError will be more informative: ValueError says that something about the input is not valid. It IS valid, just not for writing :)

Actually you might want to have a nice exception message saying that you CAN store this in 'table' format.
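A rough sketch of what such a message could look like, assuming the NotImplementedError suggested in this comment (the diff in this PR raises ValueError). The helper `check_fixed_format_axis` and its `axis_name` parameter are hypothetical; the diff itself relies on the axis's own `is_unique` attribute rather than a set comparison.

```python
def check_fixed_format_axis(labels, axis_name="columns"):
    """Hypothetical helper: reject a non-unique axis before a fixed-format write."""
    # Stand-in for the `ax.is_unique` check used in the diff.
    if len(set(labels)) != len(labels):
        raise NotImplementedError(
            f"A non-unique {axis_name} index is not supported by the fixed format; "
            "store this object with format='table' instead."
        )


check_fixed_format_axis(["a", "b", "c"])  # unique labels pass silently

try:
    check_fixed_format_axis(["a", "b", "a"])
except NotImplementedError as exc:
    message = str(exc)
    print(message)
```

Pointing at the table format in the message tells the user what to do next instead of only what went wrong.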

@@ -2680,6 +2680,9 @@ def write(self, obj, **kwargs):

         self.attrs.ndim = data.ndim
         for i, ax in enumerate(data.axes):
+            if i == 0:
+                if not ax.is_unique:
+                    raise ValueError("Columns index has to be unique for fixed format")

Say that you can store this in table format, however.

@filmor
Contributor Author

filmor commented Jul 21, 2014

It could not be implemented, because block_items contains only strings. You'd have to change the fixed format to implement this. For all other cases (i.e. append and compression), a ValueError is raised.

If this is the only thing keeping you from merging I'll change it to NotImplementedError ;)

jreback added a commit that referenced this pull request Jul 21, 2014
Raise exception on non-unique column index in to_hdf for fixed format.
@jreback jreback merged commit 47ba06e into pandas-dev:master Jul 21, 2014
@jreback
Contributor

jreback commented Jul 21, 2014

@filmor thanks for this!

@jreback
Contributor

jreback commented Jul 21, 2014

This is technically for a non-unique info_axis (e.g. columns for a DataFrame, but axis 0 for Series/Panel). Can you add some tests for this? And update the note in v0.15.0 (I moved it a bit).

Labels: Bug, Error Reporting, IO HDF5
Successfully merging this pull request may close these issues.

Non-unique column names in fixed format HDF5