
BUG: HDFStore fixes #2675


Merged: 10 commits, Jan 20, 2013
5 changes: 5 additions & 0 deletions RELEASE.rst
@@ -52,6 +52,7 @@ pandas 0.10.1
- added method ``unique`` to select the unique values in an indexable or data column
- added method ``copy`` to copy an existing store (and possibly upgrade)
- show the shape of the data on disk for non-table stores when printing the store
- added ability to read PyTables flavor tables (allows compatibility with other HDF5 systems)
- Add ``logx`` option to DataFrame/Series.plot (GH2327_, #2565)
- Support reading gzipped data from file-like object
- ``pivot_table`` aggfunc can be anything used in GroupBy.aggregate (GH2643_)
@@ -66,6 +67,8 @@ pandas 0.10.1
- correctly handle types passed to ``Term`` (e.g. ``index<1000``, when the index
is ``Int64``) (closes GH512_)
- handle Timestamp correctly in data_columns (closes GH2637_)
- ``contains`` correctly matches on non-natural names
- correctly store ``float32`` dtypes in tables (if no other float types are in the same table)
- Fix DataFrame.info bug with UTF8-encoded columns. (GH2576_)
- Fix DatetimeIndex handling of FixedOffset tz (GH2604_)
- More robust detection of being in IPython session for wide DataFrame
@@ -86,6 +89,7 @@ pandas 0.10.1
- refactored HDFStore to deal with non-table stores as objects; this will allow future enhancements
- removed keyword ``compression`` from ``put`` (replaced by keyword
``complib`` to be consistent across the library)
- warn with a ``PerformanceWarning`` if you are attempting to store types that will be pickled by PyTables

.. _GH512: https://github.com/pydata/pandas/issues/512
.. _GH1277: https://github.com/pydata/pandas/issues/1277
@@ -98,6 +102,7 @@ pandas 0.10.1
.. _GH2625: https://github.com/pydata/pandas/issues/2625
.. _GH2643: https://github.com/pydata/pandas/issues/2643
.. _GH2637: https://github.com/pydata/pandas/issues/2637
.. _GH2694: https://github.com/pydata/pandas/issues/2694

pandas 0.10.0
=============
36 changes: 30 additions & 6 deletions doc/source/io.rst
@@ -1211,7 +1211,7 @@ You can create/modify an index for a table with ``create_table_index`` after dat

Query via Data Columns
~~~~~~~~~~~~~~~~~~~~~~
You can designate (and index) certain columns that you want to be able to perform queries (other than the `indexable` columns, which you can always query). For instance say you want to perform this common operation, on-disk, and return just the frame that matches this query.
You can designate (and index) certain columns that you want to be able to perform queries on (other than the `indexable` columns, which you can always query). For instance, say you want to perform this common operation on-disk and return just the frame that matches this query. You can specify ``data_columns=True`` to force all columns to be ``data_columns``.

.. ipython:: python

@@ -1260,7 +1260,7 @@ To retrieve the *unique* values of an indexable or data column, use the method `

concat([ store.select('df_dc',c) for c in [ crit1, crit2 ] ])

**Table Object**
**Storer Object**

If you want to inspect the stored object, retrieve it via ``get_storer``. You could use this programmatically to, say, get the number of rows in an object, as in the sketch below.
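
A minimal sketch (assuming the ``df_dc`` table appended earlier in this section, and that the returned storer exposes an ``nrows`` attribute):

.. ipython:: python

# assumes 'df_dc' was appended to ``store`` above; nrows is the stored row count
store.get_storer('df_dc').nrows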

@@ -1363,17 +1363,40 @@ Notes & Caveats
# we have provided a minimum minor_axis indexable size
store.root.wp_big_strings.table

Compatibility
~~~~~~~~~~~~~
External Compatibility
~~~~~~~~~~~~~~~~~~~~~~

``HDFStore`` writes storer objects in specific formats suitable for producing loss-less round trips to pandas objects. For external compatibility, ``HDFStore`` can read native ``PyTables`` format tables. It is possible to write an ``HDFStore`` object that can easily be imported into ``R`` using the ``rhdf5`` library. Create a table format store like this:

.. ipython:: python

store_export = HDFStore('export.h5')
store_export.append('df_dc',df_dc,data_columns=df_dc.columns)
store_export

.. ipython:: python
:suppress:

store_export.close()
import os
os.remove('export.h5')

Backwards Compatibility
~~~~~~~~~~~~~~~~~~~~~~~

Version 0.10.1 of ``HDFStore`` is backwards compatible for reading tables created in a prior version of pandas; however, query terms using the prior (undocumented) methodology are unsupported. ``HDFStore`` will issue a warning if you try to use a prior-version format file. To take advantage of the updates, you must read in the entire file and write it out again using the new format; the ``copy`` method does this for you. The group attribute ``pandas_version`` contains the version information. ``copy`` takes a number of options, please see the docstring.


.. ipython:: python
:suppress:

import os
legacy_file_path = os.path.abspath('source/_static/legacy_0.10.h5')

.. ipython:: python

# a legacy store
import os
legacy_store = HDFStore('legacy_0.10.h5', 'r')
legacy_store = HDFStore(legacy_file_path,'r')
legacy_store

# copy (and return the new handle)
@@ -1397,6 +1420,7 @@ Performance
- You can pass ``chunksize=an integer`` to ``append``, to change the writing chunksize (default is 50000). This will significantly lower your memory usage on writing.
- You can pass ``expectedrows=an integer`` to the first ``append``, to set the TOTAL number of rows that ``PyTables`` will expect. This will optimize read/write performance.
- Duplicate rows can be written to tables, but are filtered out in selection (with the last items being selected; thus a table is unique on major, minor pairs)
- A ``PerformanceWarning`` will be raised if you are attempting to store types that will be pickled by PyTables (rather than stored as endemic types). See <http://stackoverflow.com/questions/14355151/how-to-make-pandas-hdfstore-put-operation-faster/14370190#14370190> for more information and some solutions; a short sketch of a frame that triggers the warning follows.
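
The sketch below is illustrative only: the frame, column names, and file name are made up, and whether the warning is actually emitted depends on your PyTables setup. A column holding a mix of Python object types is the typical trigger, since PyTables must pickle it rather than store a native type.

.. ipython:: python

import numpy as np
from pandas import DataFrame, HDFStore

# hypothetical frame: the 'mixed' column holds several Python object types,
# so PyTables pickles it instead of storing a native (endemic) type
df_mixed = DataFrame({'a': np.random.randn(3), 'mixed': [1, 'foo', None]})
store_perf = HDFStore('perf_example.h5')
store_perf.put('df_mixed', df_mixed)   # may emit a PerformanceWarning
store_perf.close()

.. ipython:: python
:suppress:

import os
os.remove('perf_example.h5')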

Experimental
~~~~~~~~~~~~
3 changes: 3 additions & 0 deletions doc/source/v0.10.1.txt
@@ -119,12 +119,15 @@ Multi-table creation via ``append_to_multiple`` and selection via ``select_as_mu

**Enhancements**

- ``HDFStore`` now can read native PyTables table format tables
- You can pass ``nan_rep = 'my_nan_rep'`` to ``append``, to change the default nan representation on disk (which converts to/from `np.nan`); this defaults to `nan`.
- You can pass ``index`` to ``append``. This defaults to ``True``. This will automagically create indices on the *indexables* and *data columns* of the table
- You can pass ``chunksize=an integer`` to ``append``, to change the writing chunksize (default is 50000). This will significantly lower your memory usage on writing.
- You can pass ``expectedrows=an integer`` to the first ``append``, to set the TOTAL number of rows that ``PyTables`` will expect. This will optimize read/write performance.
- ``Select`` now supports passing ``start`` and ``stop`` to limit the selection space; see the sketch below.
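
An illustrative sketch (the ``store`` handle and the ``'df'`` table name are assumed to exist already; ``start`` and ``stop`` are row-number bounds on the on-disk selection):

.. ipython:: python

# hypothetical: read only the first five stored rows of the table 'df'
store.select('df', start=0, stop=5)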

**Bug Fixes**
- ``HDFStore`` tables can now store ``float32`` types correctly (cannot be mixed with ``float64`` however)

See the `full release notes
<https://github.com/pydata/pandas/blob/master/RELEASE.rst>`__ or issue tracker
2 changes: 1 addition & 1 deletion pandas/core/reshape.py
@@ -835,4 +835,4 @@ def block2d_to_blocknd(values, items, shape, labels, ref_items=None):
def factor_indexer(shape, labels):
""" given a tuple of shape and a list of Factor lables, return the expanded label indexer """
mult = np.array(shape)[::-1].cumprod()[::-1]
return np.sum(np.array(labels).T * np.append(mult, [1]), axis=1).T
return com._ensure_platform_int(np.sum(np.array(labels).T * np.append(mult, [1]), axis=1).T)