DOC: update the to_pickle & read_pickle docstring #20253
Changes from 25 commits
3d6ed4a
62a202f
03fa85b
3169811
d014e97
202f411
05e1bce
f674845
556adf4
42fcc03
e69ea5c
f36c6dd
5f152ed
c6231b0
0c3a442
709ca74
c15d454
b3d9cee
ef19c93
33a9b1f
7be8f3b
d69c73f
e2af5a3
46b7342
7f1d3d4
3e545f3
c1d6f03
39969d5
c1a9d57
26b3e2e
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1906,23 +1906,54 @@ def to_pickle(self, path, compression='infer', | |
Parameters | ||
---------- | ||
path : string | ||
File path | ||
File path where the pickled object will be stored. | ||
compression : {'infer', 'gzip', 'bz2', 'xz', None}, default 'infer' | ||
a string representing the compression to use in the output file | ||
A string representing the compression to use in the output file. By | ||
default, infers from the specified path. | ||
Review comment: "infers from the specified file extension", maybe?
Reply: yes pythonista.
||
|
||
.. versionadded:: 0.20.0 | ||
protocol : int | ||
Int which indicates which protocol should be used by the pickler, | ||
default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible | ||
values for this parameter depend on the version of Python. For | ||
Python 2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a | ||
valid value. For Python >= 3.4, 4 is a valid value.A negative value | ||
for the protocol parameter is equivalent to setting its value to | ||
HIGHEST_PROTOCOL. | ||
valid value. For Python >= 3.4, 4 is a valid value. A negative | ||
value for the protocol parameter is equivalent to setting its value | ||
to HIGHEST_PROTOCOL. | ||
|
||
.. [1] https://docs.python.org/3/library/pickle.html | ||
Review comment: I think is more standard to have this in a
Reply: @jorisvandenbossche your opinion?
Reply: I previously said it was fine, because we don't use
||
.. versionadded:: 0.21.0 | ||
|
||
Examples | ||
-------- | ||
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) | ||
>>> original_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
>>> original_df.to_pickle("./dummy.pkl") | ||
|
||
>>> unpickled_df = pd.read_pickle("./dummy.pkl") | ||
Review comment: @jorisvandenbossche how should file paths be handled in docstrings?
Reply: great, see that now.
Reply: thanks guys for looking into it.
||
>>> unpickled_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
|
||
Review comment: Can you add here a
Reply: added. :)
||
See Also | ||
-------- | ||
read_pickle : Load pickled pandas object (or any other pickled object) | ||
from the specified file path. | ||
Review comment: Can you only indent this line with 4 spaces? Like
Both work for sphinx, but we mainly use this pattern, so it's better to be consistent in this. (same for the other ones below)
Reply: done.
||
DataFrame.to_hdf : Write the contained data to an HDF5 file using | ||
HDFStore. | ||
DataFrame.to_sql : Write records stored in a DataFrame to a SQL | ||
database. | ||
DataFrame.to_parquet : Write a DataFrame to the binary parquet format. | ||
""" | ||
from pandas.io.pickle import to_pickle | ||
return to_pickle(self, path, compression=compression, | ||
|
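The protocol paragraph in the docstring above says that a negative value is equivalent to HIGHEST_PROTOCOL. A minimal sketch of that behaviour, using only the standard-library pickle module that the docstring links to (this is not pandas' own code path, and the `data` dictionary is purely illustrative):

```python
import pickle

# Illustrative data only; any picklable object behaves the same way.
data = {"foo": list(range(5)), "bar": list(range(5, 10))}

# A negative protocol asks pickle for HIGHEST_PROTOCOL.
payload_negative = pickle.dumps(data, protocol=-1)
payload_highest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# For protocol 2 and later, the stream starts with the PROTO opcode (0x80)
# followed by one byte holding the protocol number.
protocol_in_stream = payload_negative[1]

print(protocol_in_stream == pickle.HIGHEST_PROTOCOL)
print(payload_negative == payload_highest)
print(pickle.loads(payload_negative) == data)
```

All three checks should print True on Python 3, which is why passing a negative protocol and passing HIGHEST_PROTOCOL explicitly can be treated as interchangeable.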
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,15 +10,17 @@ | |
|
||
def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL): | ||
""" | ||
Pickle (serialize) object to input file path | ||
Pickle (serialize) object to input file path. | ||
Review comment: I find it a bit strange to say we pickle the object to a "file path"; it is actually pickled to a file (located at the file path), so maybe simplify to "Pickle (serialize) object to file."?
Reply: amended 👍
||
|
||
Parameters | ||
---------- | ||
obj : any object | ||
Any python object. | ||
path : string | ||
File path | ||
File path where the pickled object will be stored. | ||
compression : {'infer', 'gzip', 'bz2', 'xz', None}, default 'infer' | ||
a string representing the compression to use in the output file | ||
A string representing the compression to use in the output file. By | ||
default, infers from the specified path. | ||
|
||
.. versionadded:: 0.20.0 | ||
protocol : int | ||
|
@@ -33,7 +35,34 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL): | |
.. [1] https://docs.python.org/3/library/pickle.html | ||
.. versionadded:: 0.21.0 | ||
|
||
|
||
Examples | ||
-------- | ||
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) | ||
>>> original_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
Review comment: can you add a See Also and ref read_pickle (to_hdf, to_sql, to_parquet) also good
Reply: added 💯
||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
>>> pd.to_pickle(original_df, "./dummy.pkl") | ||
|
||
>>> unpickled_df = pd.read_pickle("./dummy.pkl") | ||
>>> unpickled_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
|
||
See Also | ||
Reply: corrected.
Reply: Did you push your latest changes?
Reply: yes.
Reply: Just because you mentioned you corrected it, but the see also is still after the examples?
Reply: oh sorry, just missed this one and did others.
||
-------- | ||
read_pickle : Load pickled pandas object (or any other pickled object) from | ||
the specified file path. | ||
DataFrame.to_hdf : Write the contained data to an HDF5 file using HDFStore. | ||
Review comment: I would make this a bit simpler (e.g. no need to mention HDFStore): "Write DataFrame to HDF5 file"
Reply: amended 👍
||
DataFrame.to_sql : Write records stored in a DataFrame to a SQL database. | ||
Review comment: To be a bit more consistent in the wording here, maybe just "Write DataFrame to a SQL database"?
Reply: amended 👍
||
DataFrame.to_parquet : Write a DataFrame to the binary parquet format. | ||
""" | ||
path = _stringify_path(path) | ||
inferred_compression = _infer_compression(path, compression) | ||
|
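The `compression='infer'` default discussed in this docstring picks the codec from the path's extension. A hedged sketch of that idea follows; `infer_compression_from_path` and its mapping are hypothetical names written for illustration, not the `_infer_compression` helper called in the diff, and only the three extensions matching the to_pickle options are covered:

```python
import os

# Hypothetical mapping for illustration; not pandas' internal table.
_EXTENSION_TO_COMPRESSION = {".gz": "gzip", ".bz2": "bz2", ".xz": "xz"}


def infer_compression_from_path(path, compression="infer"):
    """Return the compression name implied by ``path`` when asked to infer."""
    if compression != "infer":
        # An explicit choice (including None) always wins over the extension.
        return compression
    _, ext = os.path.splitext(path)
    # Unknown extensions such as ".pkl" map to None, meaning no compression.
    return _EXTENSION_TO_COMPRESSION.get(ext.lower())


print(infer_compression_from_path("./dummy.pkl.gz"))
print(infer_compression_from_path("./dummy.pkl"))
print(infer_compression_from_path("./dummy.pkl", compression="bz2"))
```

Under these assumptions, "./dummy.pkl" from the docstring examples is written uncompressed, while renaming it to "./dummy.pkl.gz" would select gzip without changing any other argument.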
@@ -52,15 +81,17 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL): | |
def read_pickle(path, compression='infer'): | ||
""" | ||
Load pickled pandas object (or any other pickled object) from the specified | ||
file path | ||
file path. | ||
Review comment: Can you try to fit this on one line? Maybe the "or any other pickled object" is not needed in the summary line and can go in an extended summary, as the typical use case should be pandas objects.
Reply: shortened.
||
|
||
.. warning:: | ||
|
||
Warning: Loading pickled data received from untrusted sources can be | ||
unsafe. See: https://docs.python.org/3/library/pickle.html | ||
Loading pickled data received from untrusted sources can be | ||
unsafe. See `here <https://docs.python.org/3/library/pickle.html>`__. | ||
|
||
Parameters | ||
---------- | ||
path : string | ||
File path | ||
File path where the pickled object will be loaded. | ||
compression : {'infer', 'gzip', 'bz2', 'xz', 'zip', None}, default 'infer' | ||
For on-the-fly decompression of on-disk data. If 'infer', then use | ||
gzip, bz2, xz or zip if path ends in '.gz', '.bz2', '.xz', | ||
|
@@ -72,6 +103,37 @@ def read_pickle(path, compression='infer'): | |
Returns | ||
------- | ||
unpickled : type of object stored in file | ||
Review comment: add a See Also and ref DataFrame.to_pickle, pd.read_hdf, pd.read_sql, pd.read_parquet
Reply: added.
||
|
||
Examples | ||
-------- | ||
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) | ||
>>> original_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
>>> pd.to_pickle(original_df, "./dummy.pkl") | ||
|
||
>>> unpickled_df = pd.read_pickle("./dummy.pkl") | ||
>>> unpickled_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
|
||
See Also | ||
-------- | ||
DataFrame.to_pickle : Pickle (serialize) DataFrame object to input file | ||
path. | ||
Series.to_pickle : Pickle (serialize) Series object to input file path. | ||
read_hdf : read from the store, close it if we opened it. | ||
read_sql : Read SQL query or database table into a DataFrame. | ||
read_parquet : Load a parquet object from the file path, returning a | ||
DataFrame. | ||
""" | ||
path = _stringify_path(path) | ||
inferred_compression = _infer_compression(path, compression) | ||
|
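The warning added to the read_pickle docstring says that loading pickled data from untrusted sources can be unsafe. A small standard-library sketch of why: a pickle stream can name a callable to run at load time. The `Payload` class is hypothetical and uses the harmless builtin `sorted` as a stand-in for whatever a hostile file might call.

```python
import pickle


class Payload:
    """Stand-in for an object crafted by whoever produced the pickle."""

    def __reduce__(self):
        # Tell pickle to rebuild this object by calling sorted([3, 1, 2]).
        # A hostile file could name a far more dangerous callable here.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading runs the callable named in the stream; no Payload is created.
result = pickle.loads(blob)

print(result)
print(isinstance(result, Payload))
```

The loaded value is the return value of the call, not a `Payload` instance, which shows that unpickling executes code chosen by the file's author. The same applies to any file handed to read_pickle.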
Review comment: str instead of string
Reply: done 👍