Commit d1fcb20

Raise exception on non-unique column index in to_hdf for fixed format.
Fixes pandas-dev#7761.
1 parent 0568ed7 commit d1fcb20

4 files changed, +17 -1 lines changed

doc/source/io.rst

+2-1
@@ -2311,7 +2311,8 @@ Fixed Format
 The examples above show storing using ``put``, which write the HDF5 to ``PyTables`` in a fixed array format, called
 the ``fixed`` format. These types of stores are are **not** appendable once written (though you can simply
 remove them and rewrite). Nor are they **queryable**; they must be
-retrieved in their entirety. These offer very fast writing and slightly faster reading than ``table`` stores.
+retrieved in their entirety. They also do not support dataframes with non-unique column names.
+The ``fixed`` format stores offer very fast writing and slightly faster reading than ``table`` stores.
 This format is specified by default when using ``put`` or ``to_hdf`` or by ``format='fixed'`` or ``format='f'``
 
 .. warning::

doc/source/v0.15.0.txt

+1
@@ -187,6 +187,7 @@ Bug Fixes
 - Bug in Series 0-division with a float and integer operand dtypes (:issue:`7785`)
 - Bug in ``Series.astype("unicode")`` not calling ``unicode`` on the values correctly (:issue:`7758`)
 - Bug in ``DataFrame.as_matrix()`` with mixed ``datetime64[ns]`` and ``timedelta64[ns]`` dtypes (:issue:`7778`)
+- Raise a ``ValueError`` in ``df.to_hdf`` if ``df`` has non-unique columns as the resulting file will be broken (:issue:`7761`)

pandas/io/pytables.py

+3
@@ -2680,6 +2680,9 @@ def write(self, obj, **kwargs):
 
         self.attrs.ndim = data.ndim
         for i, ax in enumerate(data.axes):
+            if i == 0:
+                if not ax.is_unique:
+                    raise ValueError("Columns index has to be unique for fixed format")
             self.write_index('axis%d' % i, ax)
 
         # Supporting mixed-type DataFrame objects...nontrivial
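The guard added in this hunk can be sketched outside pandas as follows. This is a minimal illustration, not the pandas implementation: `check_fixed_format_axes` and the plain lists standing in for the axes are hypothetical. In the commit itself the check is `ax.is_unique` on the first entry of `data.axes`.

```python
def check_fixed_format_axes(axes):
    """Raise ValueError if the first axis (the columns) has duplicate labels.

    `axes` is a hypothetical stand-in for `data.axes`: a list of label
    lists with the column labels at position 0. Labels are assumed hashable.
    """
    for i, ax in enumerate(axes):
        # Only axis 0 is checked, mirroring the `if i == 0` guard in the diff.
        if i == 0 and len(set(ax)) != len(ax):
            raise ValueError("Columns index has to be unique for fixed format")


# Unique column labels pass silently, even if the row labels repeat.
check_fixed_format_axes([["a", "b"], [0, 0]])

# Duplicate column labels are rejected before anything would be written.
try:
    check_fixed_format_axes([["a", "a"], [0]])
except ValueError as exc:
    print(exc)
```

Because the check runs inside the loop before `write_index` is called for axis 0, the error is raised before any axis is written.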

pandas/io/tests/test_pytables.py

+11
@@ -4370,6 +4370,17 @@ def test_categorical(self):
             # FIXME: TypeError: cannot pass a where specification when reading from a Fixed format store. this store must be selected in its entirety
             #result = store.select('df', where = ['index>2'])
             #tm.assert_frame_equal(df[df.index>2],result)
+
+    def test_duplicate_column_name(self):
+        df = DataFrame(columns=["a", "a"], data=[[0, 0]])
+
+        with ensure_clean_path(self.path) as path:
+            self.assertRaises(ValueError, df.to_hdf, path, 'df', format='fixed')
+
+            df.to_hdf(path, 'df', format='table')
+            other = read_hdf(path, 'df')
+            tm.assert_frame_equal(df, other)
+
 
 def _test_sort(obj):
     if isinstance(obj, DataFrame):
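The new test shows that `format='fixed'` raises for duplicate column names while `format='table'` round-trips the same frame. A caller who wants to avoid the exception might choose the format up front. The helper below is a hypothetical sketch, not part of this commit or of pandas; `pick_hdf_format` is an invented name and it assumes the column labels are hashable.

```python
def pick_hdf_format(column_labels):
    """Return 'fixed' when all column labels are unique, else 'table'.

    Hypothetical helper: it only mirrors the behaviour exercised by
    test_duplicate_column_name and does not touch any HDF5 file.
    """
    labels = list(column_labels)
    has_duplicates = len(set(labels)) != len(labels)
    return "table" if has_duplicates else "fixed"


print(pick_hdf_format(["a", "b"]))  # unique labels
print(pick_hdf_format(["a", "a"]))  # duplicate labels
```

The returned string could then be passed as the `format=` argument of `to_hdf`, for example with `df.columns` as the input.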
