
Commit 386d192

Merge pull request #4206 from jreback/hdf_doc
DOC: more prominent HDFStore store docs about storer/table formats
2 parents cb84398 + 02f32cc commit 386d192


3 files changed, +34 -13 lines changed


doc/source/io.rst

+29 -10
@@ -1651,11 +1651,6 @@ Closing a Store, Context Manager
    import os
    os.remove('store.h5')
 
-
-These stores are **not** appendable once written (though you can simply
-remove them and rewrite). Nor are they **queryable**; they must be
-retrieved in their entirety.
-
 Read/Write API
 ~~~~~~~~~~~~~~
 
@@ -1674,10 +1669,33 @@ similar to how ``read_csv`` and ``to_csv`` work. (new in 0.11.0)
 
    os.remove('store_tl.h5')
 
+.. _io.hdf5-storer:
+
+Storer Format
+~~~~~~~~~~~~~
+
+The examples above show storing using ``put``, which write the HDF5 to ``PyTables`` in a fixed array format, called
+the ``storer`` format. These types of stores are are **not** appendable once written (though you can simply
+remove them and rewrite). Nor are they **queryable**; they must be
+retrieved in their entirety. These offer very fast writing and slightly faster reading than ``table`` stores.
+
+.. warning::
+
+   A ``storer`` format will raise a ``TypeError`` if you try to retrieve using a ``where`` .
+
+   .. code-block:: python
+
+       DataFrame(randn(10,2)).to_hdf('test_storer.h5','df')
+
+       pd.read_hdf('test_storer.h5','df',where='index>5')
+       TypeError: cannot pass a where specification when reading a non-table
+                  this store must be selected in its entirety
+
+
 .. _io.hdf5-table:
 
-Storing in Table format
-~~~~~~~~~~~~~~~~~~~~~~~
+Table Format
+~~~~~~~~~~~~
 
 ``HDFStore`` supports another ``PyTables`` format on disk, the ``table``
 format. Conceptually a ``table`` is shaped very much like a DataFrame,
@@ -1708,6 +1726,10 @@ supported.
    # the type of stored data
    store.root.df._v_attrs.pandas_type
 
+.. note::
+
+   You can also create a ``table`` by passing ``table=True`` to a ``put`` operation.
+
 .. _io.hdf5-keys:
 
 Hierarchical Keys
@@ -2121,9 +2143,6 @@ Notes & Caveats
      in a string, or a ``NaT`` in a datetime-like column counts as having
      a value), then those rows **WILL BE DROPPED IMPLICITLY**. This limitation
      *may* be addressed in the future.
-   - You can not append/select/delete to a non-table (table creation is
-     determined on the first append, or by passing ``table=True`` in a
-     put operation)
    - ``HDFStore`` is **not-threadsafe for writing**. The underlying
      ``PyTables`` only supports concurrent reads (via threading or
      processes). If you need reading and writing *at the same time*, you

doc/source/release.rst

+1 -1
@@ -113,7 +113,7 @@ pandas 0.12
     - When removing an object, ``remove(key)`` raises
       ``KeyError`` if the key is not a valid store object.
     - raise a ``TypeError`` on passing ``where`` or ``columns``
-      to select with a Storer; these are invalid parameters at this time
+      to select with a Storer; these are invalid parameters at this time (:issue:`4189`)
     - can now specify an ``encoding`` option to ``append/put``
       to enable alternate encodings (:issue:`3750`)
     - enable support for ``iterator/chunksize`` with ``read_hdf``

pandas/io/pytables.py

+4 -2
@@ -1746,9 +1746,11 @@ def f(values, freq=None, tz=None):
 
     def validate_read(self, kwargs):
         if kwargs.get('columns') is not None:
-            raise TypeError("cannot pass a column specification when reading a Storer")
+            raise TypeError("cannot pass a column specification when reading a non-table "
+                            "this store must be selected in its entirety")
         if kwargs.get('where') is not None:
-            raise TypeError("cannot pass a where specification when reading a Storer")
+            raise TypeError("cannot pass a where specification when reading from a non-table "
+                            "this store must be selected in its entirety")
 
     @property
     def is_exists(self):
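
The new error messages in `validate_read` are built from two adjacent string literals, which Python joins at compile time with no separator. The following is a minimal standalone sketch of that check, useful for seeing the exact text a caller would receive. `FixedStorer` and `read_error` are hypothetical names introduced here for illustration; they are not part of pandas, and the real method lives on a storer class inside `pandas/io/pytables.py`.

```python
class FixedStorer:
    """Hypothetical stand-in for a non-table storer; not the pandas class."""

    def validate_read(self, kwargs):
        # Same checks as in the diff: a non-table store accepts neither
        # a column selection nor a where clause.
        if kwargs.get('columns') is not None:
            raise TypeError("cannot pass a column specification when reading a non-table "
                            "this store must be selected in its entirety")
        if kwargs.get('where') is not None:
            raise TypeError("cannot pass a where specification when reading from a non-table "
                            "this store must be selected in its entirety")


def read_error(**kwargs):
    """Return the TypeError message for these read kwargs, or None if accepted."""
    try:
        FixedStorer().validate_read(kwargs)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(read_error())                   # no selection arguments: accepted
    print(read_error(where='index>5'))    # rejected with the where message
    print(read_error(columns=['A']))      # rejected with the column message
```

Because the first literal ends in a space and nothing else separates the two halves, each message arrives as a single line with no punctuation between "non-table" and "this store".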
