
Commit 86684ad

shelvinskyi authored and jorisvandenbossche committed
DOC: update the DataFrame.to_hdf() docstring (#20186)
1 parent 6621cb6 commit 86684ad

File tree

1 file changed: +77 -29 lines changed


pandas/core/generic.py (+77 -29)
@@ -1889,40 +1889,50 @@ def to_json(self, path_or_buf=None, orient=None, date_format=None,
                             index=index)
 
     def to_hdf(self, path_or_buf, key, **kwargs):
-        """Write the contained data to an HDF5 file using HDFStore.
+        """
+        Write the contained data to an HDF5 file using HDFStore.
+
+        Hierarchical Data Format (HDF) is self-describing, allowing an
+        application to interpret the structure and contents of a file with
+        no outside information. One HDF file can hold a mix of related objects
+        which can be accessed as a group or as individual objects.
+
+        In order to add another DataFrame or Series to an existing HDF file
+        please use append mode and a different a key.
+
+        For more information see the :ref:`user guide <io.html#io-hdf5>`.
 
         Parameters
         ----------
-        path_or_buf : the path (string) or HDFStore object
-        key : string
-            identifier for the group in the store
-        mode : optional, {'a', 'w', 'r+'}, default 'a'
-
-          ``'w'``
-              Write; a new file is created (an existing file with the same
-              name would be deleted).
-          ``'a'``
-              Append; an existing file is opened for reading and writing,
-              and if the file does not exist it is created.
-          ``'r+'``
-              It is similar to ``'a'``, but the file must already exist.
-        format : 'fixed(f)|table(t)', default is 'fixed'
-            fixed(f) : Fixed format
-                Fast writing/reading. Not-appendable, nor searchable
-            table(t) : Table format
-                Write as a PyTables Table structure which may perform
-                worse but allow more flexible operations like searching
-                / selecting subsets of the data
-        append : boolean, default False
-            For Table formats, append the input data to the existing
-        data_columns : list of columns, or True, default None
+        path_or_buf : str or pandas.HDFStore
+            File path or HDFStore object.
+        key : str
+            Identifier for the group in the store.
+        mode : {'a', 'w', 'r+'}, default 'a'
+            Mode to open file:
+
+            - 'w': write, a new file is created (an existing file with
+              the same name would be deleted).
+            - 'a': append, an existing file is opened for reading and
+              writing, and if the file does not exist it is created.
+            - 'r+': similar to 'a', but the file must already exist.
+        format : {'fixed', 'table'}, default 'fixed'
+            Possible values:
+
+            - 'fixed': Fixed format. Fast writing/reading. Not-appendable,
+              nor searchable.
+            - 'table': Table format. Write as a PyTables Table structure
+              which may perform worse but allow more flexible operations
+              like searching / selecting subsets of the data.
+        append : bool, default False
+            For Table formats, append the input data to the existing.
+        data_columns : list of columns or True, optional
             List of columns to create as indexed data columns for on-disk
             queries, or True to use all columns. By default only the axes
             of the object are indexed. See `here
             <http://pandas.pydata.org/pandas-docs/stable/io.html#query-via-data-columns>`__.
-
             Applicable only to format='table'.
-        complevel : int, 0-9, default None
+        complevel : {0-9}, optional
             Specifies a compression level for data.
             A value of 0 disables compression.
         complib : {'zlib', 'lzo', 'bzip2', 'blosc'}, default 'zlib'
@@ -1934,11 +1944,49 @@ def to_hdf(self, path_or_buf, key, **kwargs):
             Specifying a compression library which is not available issues
             a ValueError.
         fletcher32 : bool, default False
-            If applying compression use the fletcher32 checksum
-        dropna : boolean, default False.
+            If applying compression use the fletcher32 checksum.
+        dropna : bool, default False
             If true, ALL nan rows will not be written to store.
-        """
 
+        See Also
+        --------
+        DataFrame.read_hdf : Read from HDF file.
+        DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
+        DataFrame.to_sql : Write to a sql table.
+        DataFrame.to_feather : Write out feather-format for DataFrames.
+        DataFrame.to_csv : Write out to a csv file.
+
+        Examples
+        --------
+        >>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]},
+        ...                   index=['a', 'b', 'c'])
+        >>> df.to_hdf('data.h5', key='df', mode='w')
+
+        We can add another object to the same file:
+
+        >>> s = pd.Series([1, 2, 3, 4])
+        >>> s.to_hdf('data.h5', key='s')
+
+        Reading from HDF file:
+
+        >>> pd.read_hdf('data.h5', 'df')
+           A  B
+        a  1  4
+        b  2  5
+        c  3  6
+        >>> pd.read_hdf('data.h5', 's')
+        0    1
+        1    2
+        2    3
+        3    4
+        dtype: int64
+
+        Deleting file with data:
+
+        >>> import os
+        >>> os.remove('data.h5')
+
+        """
         from pandas.io import pytables
         return pytables.to_hdf(path_or_buf, key, self, **kwargs)
 
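The workflow the new docstring describes — create the file with mode='w', append further objects under different keys with the default mode 'a', and use format='table' when you need appending and on-disk selection — can be sketched as below. This is a minimal sketch, not part of the commit: it assumes the optional PyTables dependency (`tables`) is installed, and the temporary file path is purely illustrative.

```python
import os
import tempfile

import pandas as pd

# Illustrative path for the HDF5 store; to_hdf requires PyTables.
path = os.path.join(tempfile.mkdtemp(), "data.h5")

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])
df.to_hdf(path, key="df", mode="w")  # 'w': create (or overwrite) the file

# Default mode 'a' opens the existing file and adds a new group.
s = pd.Series([1, 2, 3, 4])
s.to_hdf(path, key="s")

# Both objects round-trip from the same file under their own keys.
df2 = pd.read_hdf(path, "df")
s2 = pd.read_hdf(path, "s")

# 'table' format allows append=True and, with data_columns, where-queries.
pd.DataFrame({"A": [1, 2]}).to_hdf(path, key="t", format="table",
                                   data_columns=True)
pd.DataFrame({"A": [3, 4]}).to_hdf(path, key="t", format="table",
                                   append=True)
big = pd.read_hdf(path, "t", where="A > 2")  # select a subset on disk

os.remove(path)
```

Note that a fixed-format group (the default) would raise if you tried to append to it; only the 'table' format supports `append=True` and `where` selection.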
