Commit beacb39

nrebena authored and proost committed
DOC: Reference level name as Term of HDFStore.select query (pandas-dev#28791) (pandas-dev#28793)
1 parent 683afb6 commit beacb39

2 files changed, +34 -1 lines changed

doc/source/user_guide/io.rst

+33 -1
@@ -3811,6 +3811,8 @@ storing/selecting from homogeneous index ``DataFrames``.
    # the levels are automatically included as data columns
    store.select('df_mi', 'foo=bar')
 
+.. note::
+   The ``index`` keyword is reserved and cannot be used as a level name.
 
 .. _io.hdf5-query:

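The note added in this hunk says that ``index`` cannot be used as a level name. One possible workaround, sketched below, is to rename such a level before appending the frame to a store. The helper ``rename_reserved_level`` and the replacement name ``index_level`` are hypothetical and not part of pandas; only ``Index.set_names`` is an existing pandas call.

```python
import pandas as pd

RESERVED = "index"  # reserved keyword, per the note in this diff


def rename_reserved_level(df: pd.DataFrame, replacement: str = "index_level") -> pd.DataFrame:
    """Return a copy of ``df`` with any index level named ``index`` renamed.

    Hypothetical helper for illustration only.
    """
    new_names = [replacement if name == RESERVED else name for name in df.index.names]
    out = df.copy()
    out.index = out.index.set_names(new_names)
    return out


index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["index", "bar"]
)
df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=index)

safe = rename_reserved_level(df)
print(list(safe.index.names))  # ['index_level', 'bar']
```

The original frame is left untouched, so the rename can be applied just before writing to the store.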
@@ -3829,6 +3831,7 @@ A query is specified using the ``Term`` class under the hood, as a boolean expre
 
 * ``index`` and ``columns`` are supported indexers of ``DataFrames``.
 * if ``data_columns`` are specified, these can be used as additional indexers.
+* level names of a ``MultiIndex``, defaulting to ``level_0``, ``level_1``, … if no names are provided.
 
 Valid comparison operators are:

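The naming rule added to the list above can be sketched in plain pandas: a named level is referenced by its name, and an unnamed level by ``level_<position>``. The helper ``queryable_level_names`` is hypothetical, and the handling of a partly named index is an assumption drawn from the wording of the bullet rather than from the diff itself.

```python
import pandas as pd


def queryable_level_names(index: pd.Index) -> list:
    """List the names under which each index level could be queried.

    Hypothetical helper: unnamed levels fall back to ``level_<position>``.
    """
    return [
        name if name is not None else f"level_{i}"
        for i, name in enumerate(index.names)
    ]


values = [["foo", "bar"], ["one", "two"]]
named = pd.MultiIndex.from_product(values, names=["foo", "bar"])
unnamed = pd.MultiIndex.from_product(values)
partly = pd.MultiIndex.from_product(values, names=["foo", None])

print(queryable_level_names(named))    # ['foo', 'bar']
print(queryable_level_names(unnamed))  # ['level_0', 'level_1']
print(queryable_level_names(partly))   # ['foo', 'level_1']
```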
@@ -3947,7 +3950,7 @@ space. These are in terms of the total number of rows in a table.
 
 .. _io.hdf5-timedelta:
 
-Using timedelta64[ns]
+Query timedelta64[ns]
 +++++++++++++++++++++
 
You can store and query using the ``timedelta64[ns]`` type. Terms can be
@@ -3966,6 +3969,35 @@ specified in the format: ``<float>(<unit>)``, where float may be signed (and fra
    store.append('dftd', dftd, data_columns=True)
    store.select('dftd', "C<'-3.5D'")
 
+Query MultiIndex
+++++++++++++++++
+
+Selecting from a ``MultiIndex`` can be achieved by using the name of the level.
+
+.. ipython:: python
+
+   df_mi.index.names
+   store.select('df_mi', "foo=baz and bar=two")
+
+If the ``MultiIndex`` level names are ``None``, the levels are automatically made available via
+the ``level_n`` keyword, where ``n`` is the position of the ``MultiIndex`` level you want to select from.
+
+.. ipython:: python
+
+   index = pd.MultiIndex(
+       levels=[["foo", "bar", "baz", "qux"], ["one", "two", "three"]],
+       codes=[[0, 0, 0, 1, 1, 2, 2, 3, 3, 3], [0, 1, 2, 0, 1, 1, 2, 0, 1, 2]],
+   )
+   df_mi_2 = pd.DataFrame(np.random.randn(10, 3),
+                          index=index, columns=["A", "B", "C"])
+   df_mi_2
+
+   store.append("df_mi_2", df_mi_2)
+
+   # the levels are automatically included as data columns with keyword level_n
+   store.select("df_mi_2", "level_0=foo and level_1=two")
+
+
 Indexing
 ++++++++

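To check a ``level_n`` query like the one in the new section, it can help to compute the same selection in memory and compare. The sketch below does that on a small frame with unnamed levels. It assumes PyTables may not be installed and skips the store round trip in that case. The file name ``store.h5``, the key ``df``, and the variable names are illustrative choices, not taken from the diff.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Small frame whose MultiIndex levels have no names (names are None).
index = pd.MultiIndex.from_product([["foo", "bar"], ["one", "two"]])
df = pd.DataFrame(np.arange(8.0).reshape(4, 2), index=index, columns=["A", "B"])

# In-memory equivalent of "level_0='foo' and level_1='two'".
mask = (df.index.get_level_values(0) == "foo") & (
    df.index.get_level_values(1) == "two"
)
expected = df[mask]

try:
    import tables  # noqa: F401  (PyTables is required by HDFStore)
except ImportError:
    selected = None  # round trip skipped when PyTables is unavailable
else:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.h5")
        with pd.HDFStore(path) as store:
            store.append("df", df)
            # Unnamed levels are addressed as level_0, level_1, ...
            selected = store.select("df", "level_0='foo' and level_1='two'")

print(expected)
print(selected)
```

Quoting the string values in the ``where`` clause is a stylistic choice here; the examples in the diff pass them unquoted.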
doc/source/whatsnew/v1.0.0.rst

+1
@@ -162,6 +162,7 @@ Documentation Improvements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 - Added new section on :ref:`scale` (:issue:`28315`).
+- Added sub-section "Query MultiIndex" to the IO tools user guide (:issue:`28791`).
 
 .. _whatsnew_1000.deprecations:
