
API: change nomenclature in HDFStore to use format=fixed(f) | table(t) #4715

Merged
merged 2 commits into from
Aug 31, 2013
32 changes: 20 additions & 12 deletions doc/source/io.rst
@@ -1794,27 +1794,31 @@ similar to how ``read_csv`` and ``to_csv`` work. (new in 0.11.0)

os.remove('store_tl.h5')

.. _io.hdf5-storer:
.. _io.hdf5-fixed:

Storer Format
~~~~~~~~~~~~~
Fixed Format
~~~~~~~~~~~~

.. note::

Prior to 0.13.0, this was called the ``Storer`` format.

The examples above show storing using ``put``, which writes the HDF5 to ``PyTables`` in a fixed array format, called
the ``storer`` format. These types of stores are are **not** appendable once written (though you can simply
the ``fixed`` format. These types of stores are **not** appendable once written (though you can simply
remove them and rewrite). Nor are they **queryable**; they must be
retrieved in their entirety. These offer very fast writing and slightly faster reading than ``table`` stores.
This format is specified by default when using ``put`` or by ``fmt='s'``
This format is specified by default when using ``put`` or ``to_hdf`` or by ``format='fixed'`` or ``format='f'``.

.. warning::

A ``storer`` format will raise a ``TypeError`` if you try to retrieve using a ``where`` .
A ``fixed`` format will raise a ``TypeError`` if you try to retrieve using a ``where``.

.. code-block:: python

DataFrame(randn(10,2)).to_hdf('test_storer.h5','df')
DataFrame(randn(10,2)).to_hdf('test_fixed.h5','df')

pd.read_hdf('test_storer.h5','df',where='index>5')
TypeError: cannot pass a where specification when reading a non-table
pd.read_hdf('test_fixed.h5','df',where='index>5')
TypeError: cannot pass a where specification when reading a fixed format.
this store must be selected in its entirety
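The difference between the two formats can be sketched with a small helper. This is a minimal illustration, not code from the pull request: the function name ``compare_formats`` and the key names are hypothetical, and it assumes a pandas version that accepts the ``format`` keyword described here.

```python
import numpy as np
import pandas as pd


def compare_formats(path):
    """Write one frame in both formats and try a ``where`` query on each.

    Hypothetical helper for illustration only.
    """
    df = pd.DataFrame(np.random.randn(10, 2), columns=["a", "b"])

    # Fixed format: must be read back in its entirety, so ``where`` fails.
    df.to_hdf(path, key="df_fixed", format="fixed")
    try:
        pd.read_hdf(path, "df_fixed", where="index > 5")
        fixed_error = None
    except TypeError as exc:
        fixed_error = exc

    # Table format: the same query selects a subset of rows.
    df.to_hdf(path, key="df_table", format="table")
    subset = pd.read_hdf(path, "df_table", where="index > 5")
    return fixed_error, subset
```

Calling ``compare_formats('demo.h5')`` requires ``PyTables`` to be installed. The first element of the result holds the ``TypeError`` raised for the fixed store, and the second holds the rows of the table store whose index is greater than 5.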


@@ -1827,7 +1831,11 @@ Table Format
format. Conceptually a ``table`` is shaped very much like a DataFrame,
with rows and columns. A ``table`` may be appended to in the same or
other sessions. In addition, delete & query type operations are
supported. This format is specified by ``fmt='t'`` to ``append`` or ``put``.
supported. This format is specified by ``format='table'`` or ``format='t'``
to ``append`` or ``put`` or ``to_hdf``.

This format can also be set as an option, ``pd.set_option('io.hdf.default_format','table')``, so that
``put/append/to_hdf`` store in the ``table`` format by default.
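As a rough sketch of how the option behaves, the snippet below sets it only inside a ``with`` block. Using ``pd.option_context`` to scope the change is a choice made for this illustration and is not part of the change itself; the variable names are hypothetical.

```python
import pandas as pd

# Scope the option to a block so the global default is left untouched.
with pd.option_context("io.hdf.default_format", "table"):
    # Inside the block, put/append/to_hdf calls without an explicit
    # ``format`` would store in the table format.
    inside = pd.get_option("io.hdf.default_format")

# Outside the block the option is back to its previous value.
outside = pd.get_option("io.hdf.default_format")
```

Reading the option back does not touch any file, so this part runs without ``PyTables``.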

.. ipython:: python
:suppress:
@@ -1854,7 +1862,7 @@ supported. This format is specified by ``fmt='t'`` to ``append`` or ``put``.

.. note::

You can also create a ``table`` by passing ``fmt='t'`` to a ``put`` operation.
You can also create a ``table`` by passing ``format='table'`` or ``format='t'`` to a ``put`` operation.

.. _io.hdf5-keys:

@@ -2363,7 +2371,7 @@ Starting in 0.11, passing a ``min_itemsize`` dict will cause all passed columns
External Compatibility
~~~~~~~~~~~~~~~~~~~~~~

``HDFStore`` write storer objects in specific formats suitable for
``HDFStore`` writes ``table`` format objects in specific formats suitable for
producing loss-less roundtrips to pandas objects. For external
compatibility, ``HDFStore`` can read native ``PyTables`` format
tables. It is possible to write an ``HDFStore`` object that can easily
3 changes: 2 additions & 1 deletion doc/source/release.rst
@@ -108,10 +108,11 @@ pandas 0.13
- removed the ``warn`` argument from ``open``. Instead a ``PossibleDataLossError`` exception will
be raised if you try to use ``mode='w'`` with an OPEN file handle (:issue:`4367`)
- allow a passed locations array or mask as a ``where`` condition (:issue:`4467`)
- the ``fmt`` keyword now replaces the ``table`` keyword; allowed values are ``s|t``
- add the keyword ``dropna=True`` to ``append`` to change whether ALL nan rows are not written
to the store (default is ``True``, ALL nan rows are NOT written), also settable
via the option ``io.hdf.dropna_table`` (:issue:`4625`)
- the ``format`` keyword now replaces the ``table`` keyword; allowed values are ``fixed(f)|table(t)``;
the ``Storer`` format has been renamed to ``Fixed``
- ``JSON``

- added ``date_unit`` parameter to specify resolution of timestamps. Options
10 changes: 5 additions & 5 deletions doc/source/v0.13.0.txt
@@ -79,17 +79,17 @@ API changes
- allow a passed locations array or mask as a ``where`` condition (:issue:`4467`).
See :ref:`here<io.hdf5-where_mask>` for an example.

- the ``fmt`` keyword now replaces the ``table`` keyword; allowed values are ``s|t``
the same defaults as prior < 0.13.0 remain, e.g. ``put`` implies 's' (Storer) format
and ``append`` imples 't' (Table) format
- the ``format`` keyword now replaces the ``table`` keyword; allowed values are ``fixed(f)`` or ``table(t)``.
The same defaults as prior to 0.13.0 remain, e.g. ``put`` implies the ``'fixed'`` or ``'f'`` (Fixed) format
and ``append`` implies the ``'table'`` or ``'t'`` (Table) format

.. ipython:: python

path = 'test.h5'
df = DataFrame(randn(10,2))
df.to_hdf(path,'df_table',fmt='t')
df.to_hdf(path,'df_table',format='table')
df.to_hdf(path,'df_table2',append=True)
df.to_hdf(path,'df_storer')
df.to_hdf(path,'df_fixed')
with get_store(path) as store:
print store
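The aliasing and default rules stated in this entry can be sketched in plain Python. The helper ``resolve_format`` below is hypothetical and is not the pandas implementation; the case-insensitive matching and the error message are assumptions made for the illustration.

```python
# Hypothetical helper, not part of pandas: it only mirrors the rules
# stated in the release note (fixed/f, table/t, unchanged defaults).
_ALIASES = {"f": "fixed", "fixed": "fixed", "t": "table", "table": "table"}


def resolve_format(format=None, append=False):
    """Return the canonical format name, 'fixed' or 'table'."""
    if format is None:
        # Defaults are unchanged: append implies table, put implies fixed.
        return "table" if append else "fixed"
    try:
        # Assumption: aliases are matched case-insensitively.
        return _ALIASES[format.lower()]
    except (KeyError, AttributeError):
        raise ValueError(
            "format must be one of 'fixed', 'f', 'table', 't'"
        ) from None
```

With this sketch, ``resolve_format('t')`` and ``resolve_format(append=True)`` both give ``'table'``, while the old ``'s'`` value is rejected.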

9 changes: 9 additions & 0 deletions pandas/core/generic.py
@@ -678,6 +678,15 @@ def to_hdf(self, path_or_buf, key, **kwargs):
and if the file does not exist it is created.
``'r+'``
It is similar to ``'a'``, but the file must already exist.
format : 'fixed(f)|table(t)', default is 'fixed'
fixed(f) : Fixed format
Fast writing/reading. Not-appendable, nor searchable
table(t) : Table format
Write as a PyTables Table structure which may perform worse but
allow more flexible operations like searching / selecting subsets
of the data
append : boolean, default False
For Table formats, append the input data to the existing
complevel : int, 1-9, default 0
If a complib is specified compression will be applied
where possible