DOC: update the DataFrame.to_hdf() docstring #20186
Changes from 2 commits
19bc38a
6d42dd1
7781bb5
429afbe
eebfc39
@@ -1786,40 +1786,47 @@ def to_json(self, path_or_buf=None, orient=None, date_format=None,
index=index)

def to_hdf(self, path_or_buf, key, **kwargs):
"""Write the contained data to an HDF5 file using HDFStore.
"""
Write the contained data to an HDF5 file using HDFStore.

Hierarchical Data Format (HDF) is self-describing, allowing an
application to interpret the structure and contents of a file with
no outside information. One HDF file can hold a mix of related objects
which can be accessed as a group or as individual objects.
In order to add another :class:`~pandas.DataFrame` or
:class:`~pandas.Series` to an existing HDF file please use append mode
and a different key.

Parameters
----------
path_or_buf : the path (string) or HDFStore object
key : string
identifier for the group in the store
mode : optional, {'a', 'w', 'r+'}, default 'a'

``'w'``
Write; a new file is created (an existing file with the same
name would be deleted).
``'a'``
Append; an existing file is opened for reading and writing,
and if the file does not exist it is created.
``'r+'``
It is similar to ``'a'``, but the file must already exist.
format : 'fixed(f)|table(t)', default is 'fixed'
fixed(f) : Fixed format
Fast writing/reading. Not-appendable, nor searchable
table(t) : Table format
Write as a PyTables Table structure which may perform
worse but allow more flexible operations like searching
/ selecting subsets of the data
path_or_buf : str or pandas.HDFStore
File path or HDFStore object.
key : str
Identifier for the group in the store.
mode : {'a', 'w', 'r+'}, default is 'a'
Mode to open file:
- ``'w'``: write, a new file is created (an existing file with
[Review comment] It's good to make this a list, but for Sphinx no indentation is needed (compared to "Mode ..." on the line above); however, it needs a blank line between both lines (rST syntax details ...).
[Reply] done
the same name would be deleted).
- ``'a'``: append, an existing file is opened for reading and
writing, and if the file does not exist it is created.
- ``'r+'``: similar to ``'a'``, but the file must already exist.
format : {'fixed', 'table'}, default is 'fixed'
[Review comment] "default is 'fixed'" -> "default 'fixed'"
[Reply] done

Possible values:
- fixed: Fixed format. Fast writing/reading. Not-appendable,
[Review comment] Same here about indentation / blank line.
[Review comment] Also, can you add single quotes around fixed (and same for table below)?
[Reply] done

nor searchable.
- table: Table format. Write as a PyTables Table structure
which may perform worse but allow more flexible operations
[Review comment] Another indentation issue: here, the "which ..." needs to align with "table: ..." on the line above.
[Reply] done

like searching / selecting subsets of the data.
append : boolean, default False
For Table formats, append the input data to the existing
data_columns : list of columns, or True, default None
For Table formats, append the input data to the existing.
data_columns : list of columns or True, optional
List of columns to create as indexed data columns for on-disk
queries, or True to use all columns. By default only the axes
of the object are indexed. See `here
<http://pandas.pydata.org/pandas-docs/stable/io.html#query-via-data-columns>`__.

Applicable only to format='table'.
complevel : int, 0-9, default None
complevel : {0-9}, optional
Specifies a compression level for data.
A value of 0 disables compression.
complib : {'zlib', 'lzo', 'bzip2', 'blosc'}, default 'zlib'
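The ``data_columns`` / ``format='table'`` behaviour documented above can be sketched as follows. This is a minimal illustration, not part of the PR; it requires the PyTables package, and the file name ``demo.h5`` is just a placeholder:

```python
import os

import pandas as pd

df = pd.DataFrame({'A': range(5), 'B': list('abcde')})

# 'table' format supports on-disk querying; listing 'A' in data_columns
# makes that column usable in a `where` selection.
df.to_hdf('demo.h5', key='df', mode='w', format='table', data_columns=['A'])

# Read back only the rows where A > 2, without loading the whole frame.
subset = pd.read_hdf('demo.h5', 'df', where='A > 2')

os.remove('demo.h5')  # clean up the illustration file
```

With the 'fixed' format, by contrast, the `where` selection above would raise, since fixed stores are not searchable.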
@@ -1831,11 +1838,43 @@ def to_hdf(self, path_or_buf, key, **kwargs):
Specifying a compression library which is not available issues
a ValueError.
fletcher32 : bool, default False
If applying compression use the fletcher32 checksum
dropna : boolean, default False.
If applying compression use the fletcher32 checksum.
dropna : bool, default False
If true, ALL nan rows will not be written to store.
"""

See Also
--------
DataFrame.read_hdf : read from HDF file.
DataFrame.to_parquet : write a DataFrame to the binary parquet format.
DataFrame.to_sql : write to a sql table.
DataFrame.to_feather : write out feather-format for DataFrames.
[Review comment] Add DataFrame.to_parquet, read_hdf.

DataFrame.to_csv : write out to a csv file.

Examples
--------
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]},
...                   index=['a', 'b', 'c'])
>>> df.to_hdf('data.h5', key='df', mode='w')

We can append another object to the same file:
[Review comment] I would use "add" here instead of "append", because "append" is also a keyword with a different behaviour (appending rows to the same table, not the same file).
[Reply] done
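The distinction the reviewer draws between the ``append`` keyword and adding under a new key can be shown with a short sketch. This is not from the PR; the file name ``append_demo.h5`` is hypothetical and PyTables must be installed:

```python
import os

import pandas as pd

path = 'append_demo.h5'

df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})

# append=True appends *rows* to the same table under the same key
# (this requires format='table').
df1.to_hdf(path, key='df', mode='w', format='table')
df2.to_hdf(path, key='df', append=True, format='table')

# A different key adds a *separate* object to the same file.
s = pd.Series([10, 20])
s.to_hdf(path, key='s')

combined = pd.read_hdf(path, 'df')  # rows of df1 followed by rows of df2
other = pd.read_hdf(path, 's')      # the independent Series

os.remove(path)  # clean up the illustration file
```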
>>> s = pd.Series([1, 2, 3, 4])
>>> s.to_hdf('data.h5', key='s')

Reading from HDF file:

>>> pd.read_hdf('data.h5', 'df')
   A  B
a  1  4
b  2  5
c  3  6
>>> pd.read_hdf('data.h5', 's')
0    1
1    2
2    3
3    4
dtype: int64
"""
[Review comment] Can you add in the end here a code block with

(so running the doctests does not leave behind files)?
[Reply] done

Many thanks for your comments, they are really useful.
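The code block the reviewer refers to is elided in this capture. A minimal sketch of such a cleanup step, assuming the doctest examples wrote ``data.h5`` as shown above, might look like:

```python
import os

# Remove the file created by the doctest examples so repeated runs
# do not leave 'data.h5' behind.
if os.path.exists('data.h5'):
    os.remove('data.h5')
```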

from pandas.io import pytables
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
[Review comment] Can you add here a link to the user guide (because there is a lot more information there)? You can use something like: For more information see the :ref:`user guide <io.hdf5>`
[Reply] Thanks for the hint! This one is hard for me, because I'm not an expert in rST :( I put the line as you suggested, with the right subsection of that manual. When I generate html with 'make.py html' it doesn't have a link. Is it ok?
[Review comment] Yes, that is fine (when only building the docstring, the full user guide is not built, and therefore the link does not seem to work).