DOC: update the to_pickle & read_pickle docstring #20253
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1901,28 +1901,59 @@ def to_sql(self, name, con, schema=None, if_exists='fail', index=True, | |
def to_pickle(self, path, compression='infer', | ||
protocol=pkl.HIGHEST_PROTOCOL): | ||
""" | ||
Pickle (serialize) object to input file path. | ||
Pickle (serialize) object to file. | ||
|
||
Parameters | ||
---------- | ||
path : string | ||
File path | ||
path : str | ||
File path where the pickled object will be stored. | ||
compression : {'infer', 'gzip', 'bz2', 'xz', None}, default 'infer' | ||
a string representing the compression to use in the output file | ||
A string representing the compression to use in the output file. By | ||
default, infers from the file extension in specified path. | ||
|
||
.. versionadded:: 0.20.0 | ||
protocol : int | ||
Int which indicates which protocol should be used by the pickler, | ||
default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible | ||
values for this parameter depend on the version of Python. For | ||
Python 2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a | ||
valid value. For Python >= 3.4, 4 is a valid value.A negative value | ||
for the protocol parameter is equivalent to setting its value to | ||
HIGHEST_PROTOCOL. | ||
valid value. For Python >= 3.4, 4 is a valid value. A negative | ||
value for the protocol parameter is equivalent to setting its value | ||
to HIGHEST_PROTOCOL. | ||
|
||
.. [1] https://docs.python.org/3/library/pickle.html | ||
.. versionadded:: 0.21.0 | ||
|
||
See Also | ||
-------- | ||
read_pickle : Load pickled pandas object (or any object) from file. | ||
DataFrame.to_hdf : Write DataFrame to an HDF5 file. | ||
DataFrame.to_sql : Write DataFrame to a SQL database. | ||
DataFrame.to_parquet : Write a DataFrame to the binary parquet format. | ||
|
||
Examples | ||
-------- | ||
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) | ||
>>> original_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
>>> original_df.to_pickle("./dummy.pkl") | ||
|
||
>>> unpickled_df = pd.read_pickle("./dummy.pkl") | ||
Review comment: @jorisvandenbossche how are we handling file paths in docstrings?
Review comment: great, see that now.
Review comment: thanks guys for looking into it. |
||
>>> unpickled_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
|
||
Review comment: Can you add here a
Review comment: added. :) |
||
>>> import os | ||
>>> os.remove("./dummy.pkl") | ||
""" | ||
from pandas.io.pickle import to_pickle | ||
return to_pickle(self, path, compression=compression, | ||
|
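For readers who want to try the `compression` and `protocol` parameters described in the docstring above, here is a minimal sketch. It is not part of the PR: the temporary directory, the file name `dummy.pkl.gz`, and the choice of `protocol=2` are assumptions made for illustration. It relies on the default `compression='infer'` picking gzip from the `.gz` extension.

```python
import gzip
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})

# A temporary directory is used here (an illustrative choice) so that no
# manual os.remove() cleanup is needed afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, "dummy.pkl.gz")

    # compression defaults to 'infer', so the '.gz' suffix selects gzip.
    # protocol=2 is passed explicitly instead of HIGHEST_PROTOCOL.
    df.to_pickle(gz_path, protocol=2)

    # A gzip stream starts with the magic bytes 0x1f 0x8b.
    with open(gz_path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"

    # After decompression, a protocol-2 pickle starts with b"\x80\x02".
    with gzip.open(gz_path, "rb") as fh:
        protocol_header = fh.read(2)

    # read_pickle infers gzip from the extension in the same way.
    restored = pd.read_pickle(gz_path)
```

Checking the leading bytes is only a quick way to confirm which compression and protocol were actually written; in normal use `pd.read_pickle` is all that is needed.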
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,15 +10,17 @@ | |
|
||
def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL): | ||
""" | ||
Pickle (serialize) object to input file path | ||
Pickle (serialize) object to file. | ||
|
||
Parameters | ||
---------- | ||
obj : any object | ||
path : string | ||
File path | ||
Any python object. | ||
path : str | ||
File path where the pickled object will be stored. | ||
compression : {'infer', 'gzip', 'bz2', 'xz', None}, default 'infer' | ||
a string representing the compression to use in the output file | ||
A string representing the compression to use in the output file. By | ||
default, infers from the file extension in specified path. | ||
|
||
.. versionadded:: 0.20.0 | ||
protocol : int | ||
|
@@ -33,7 +35,36 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL): | |
.. [1] https://docs.python.org/3/library/pickle.html | ||
.. versionadded:: 0.21.0 | ||
|
||
|
||
See Also | ||
-------- | ||
read_pickle : Load pickled pandas object (or any object) from file. | ||
DataFrame.to_hdf : Write DataFrame to an HDF5 file. | ||
DataFrame.to_sql : Write DataFrame to a SQL database. | ||
DataFrame.to_parquet : Write a DataFrame to the binary parquet format. | ||
|
||
Examples | ||
-------- | ||
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) | ||
>>> original_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
>>> pd.to_pickle(original_df, "./dummy.pkl") | ||
|
||
>>> unpickled_df = pd.read_pickle("./dummy.pkl") | ||
>>> unpickled_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
|
||
>>> import os | ||
>>> os.remove("./dummy.pkl") | ||
""" | ||
path = _stringify_path(path) | ||
inferred_compression = _infer_compression(path, compression) | ||
|
@@ -51,16 +82,17 @@ def to_pickle(obj, path, compression='infer', protocol=pkl.HIGHEST_PROTOCOL): | |
|
||
def read_pickle(path, compression='infer'): | ||
""" | ||
Load pickled pandas object (or any other pickled object) from the specified | ||
file path | ||
Load pickled pandas object (or any object) from file. | ||
|
||
.. warning:: | ||
|
||
Warning: Loading pickled data received from untrusted sources can be | ||
unsafe. See: https://docs.python.org/3/library/pickle.html | ||
Loading pickled data received from untrusted sources can be | ||
unsafe. See `here <https://docs.python.org/3/library/pickle.html>`__. | ||
|
||
Parameters | ||
---------- | ||
path : string | ||
File path | ||
path : str | ||
File path where the pickled object will be loaded. | ||
compression : {'infer', 'gzip', 'bz2', 'xz', 'zip', None}, default 'infer' | ||
For on-the-fly decompression of on-disk data. If 'infer', then use | ||
gzip, bz2, xz or zip if path ends in '.gz', '.bz2', '.xz', | ||
|
@@ -72,6 +104,38 @@ def read_pickle(path, compression='infer'): | |
Returns | ||
------- | ||
unpickled : type of object stored in file | ||
Review comment: add a See Also and ref DataFrame.to_pickle, pd.read_hdf, pd.read_sql, pd.read_parquet
Review comment: added. |
||
|
||
See Also | ||
-------- | ||
DataFrame.to_pickle : Pickle (serialize) DataFrame object to file. | ||
Series.to_pickle : Pickle (serialize) Series object to file. | ||
read_hdf : Read HDF5 file into a DataFrame. | ||
read_sql : Read SQL query or database table into a DataFrame. | ||
read_parquet : Load a parquet object, returning a DataFrame. | ||
|
||
Examples | ||
-------- | ||
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) | ||
>>> original_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
>>> pd.to_pickle(original_df, "./dummy.pkl") | ||
|
||
>>> unpickled_df = pd.read_pickle("./dummy.pkl") | ||
>>> unpickled_df | ||
foo bar | ||
0 0 5 | ||
1 1 6 | ||
2 2 7 | ||
3 3 8 | ||
4 4 9 | ||
|
||
>>> import os | ||
>>> os.remove("./dummy.pkl") | ||
""" | ||
path = _stringify_path(path) | ||
inferred_compression = _infer_compression(path, compression) | ||
|
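The `read_pickle` docstring above says that `compression='infer'` depends on the path's extension. A small sketch of the opposite case, where the extension gives no hint and the compression has to be passed explicitly on both sides, might look like this. The file name `dummy.bin` and the use of a temporary directory are assumptions for illustration, not something taken from the PR.

```python
import os
import tempfile

import pandas as pd

original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})

with tempfile.TemporaryDirectory() as tmp_dir:
    # '.bin' is not one of the extensions that 'infer' recognizes,
    # so the compression is named explicitly when writing and reading.
    path = os.path.join(tmp_dir, "dummy.bin")
    pd.to_pickle(original_df, path, compression="bz2")

    # A bz2 stream starts with the magic bytes b"BZh".
    with open(path, "rb") as fh:
        magic = fh.read(3)

    unpickled_df = pd.read_pickle(path, compression="bz2")
```

As the warning in the docstring notes, this is only appropriate for files you created yourself or otherwise trust.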
Review comment: I think it is more standard to have this in a References section.
Review comment: @jorisvandenbossche your opinion?
Review comment: I previously said it was fine, because we don't use References sections in many cases (most of the time we use inline links). That is another thing we can discuss when further improving the guidelines.