DEPR: msgpack #30112
Changes from 15 commits
9e7b320
ae660e6
facab5a
64d817b
522723e
643cc5f
cc67a16
12b831c
eda359e
b4e1656
5654671
2f661b1
01086d2
6bd5f4c
08a5d7b
460540c
7d32540
This file was deleted.
This file was deleted.
This file was deleted.
@@ -22,7 +22,6 @@ Flat file
    read_table
    read_csv
    read_fwf
-   read_msgpack
 
 Clipboard
 ~~~~~~~~~
@@ -3382,87 +3382,19 @@ The default is to 'infer':
 msgpack
 -------
 
-pandas supports the ``msgpack`` format for
-object serialization. This is a lightweight portable binary format, similar
-to binary JSON, that is highly space efficient, and provides good performance
-both on the writing (serialization), and reading (deserialization).
+pandas support for ``msgpack`` has been removed in version 1.0.0. It is recommended to use pyarrow for on-the-wire transmission of pandas objects.

we currently have a small code snippet in the doc-string of how to use pyarrow, can you copy it here as a code-block

 
-.. warning::
-
-   The msgpack format is deprecated as of 0.25 and will be removed in a future version.
-   It is recommended to use pyarrow for on-the-wire transmission of pandas objects.
-
-.. warning::
-
-   :func:`read_msgpack` is only guaranteed backwards compatible back to pandas version 0.20.3
-
-.. ipython:: python
-   :okwarning:
-
-   df = pd.DataFrame(np.random.rand(5, 2), columns=list('AB'))
-   df.to_msgpack('foo.msg')
-   pd.read_msgpack('foo.msg')
-   s = pd.Series(np.random.rand(5), index=pd.date_range('20130101', periods=5))
-
-You can pass a list of objects and you will receive them back on deserialization.
-
-.. ipython:: python
-   :okwarning:
-
-   pd.to_msgpack('foo.msg', df, 'foo', np.array([1, 2, 3]), s)
-   pd.read_msgpack('foo.msg')
-
-You can pass ``iterator=True`` to iterate over the unpacked results:
-
-.. ipython:: python
-   :okwarning:
-
-   for o in pd.read_msgpack('foo.msg', iterator=True):
-       print(o)
-
-You can pass ``append=True`` to the writer to append to an existing pack:
-
-.. ipython:: python
-   :okwarning:
+Example pyarrow usage:
 
-   df.to_msgpack('foo.msg', append=True)
-   pd.read_msgpack('foo.msg')
-
-Unlike other io methods, ``to_msgpack`` is available on both a per-object basis,
-``df.to_msgpack()`` and using the top-level ``pd.to_msgpack(...)`` where you
-can pack arbitrary collections of Python lists, dicts, scalars, while intermixing
-pandas objects.
-
-.. ipython:: python
-   :okwarning:
-
-   pd.to_msgpack('foo2.msg', {'dict': [{'df': df}, {'string': 'foo'},
-                              {'scalar': 1.}, {'s': s}]})
-   pd.read_msgpack('foo2.msg')
-
-.. ipython:: python
-   :suppress:
-   :okexcept:
-
-   os.remove('foo.msg')
-   os.remove('foo2.msg')
-
-Read/write API
-''''''''''''''
-
-Msgpacks can also be read from and written to strings.
-
-.. ipython:: python
-   :okwarning:
-
-   df.to_msgpack()
-
-Furthermore you can concatenate the strings to produce a list of the original objects.
+.. code-block:: python
 
-.. ipython:: python
-   :okwarning:
+   >>> import pandas as pd
+   >>> import pyarrow as pa
+   >>> df = pd.DataFrame({'A': [1, 2, 3]})
+   >>> context = pa.default_serialization_context()
+   >>> df_bytestring = context.serialize(df).to_buffer().to_pybytes()
 
-   pd.read_msgpack(df.to_msgpack() + s.to_msgpack())
+For documentation on pyarrow, see `here<https://arrow.apache.org/docs/python/index.html>`__.

i think the docbuild is complaining about this. help @jorisvandenbossche ?

Suggested change

I think the double underscore is the issue

The double underscore is correct, but I suspect the space that you corrected is the problem though

looks like that fixed it, thanks

 
 .. _io.hdf5:
 
Given that this is deprecated/removed on a very short time frame, I think it might be good to keep this title a bit longer with a small note that it was deprecated/removed and how to replace it.

why? the whole point is to clean things for 1.0

To provide documentation about this removal, and how to replace it.
For example, there have been issues opened with questions about this which led to added examples in the docstrings, but this has never been published yet in actual documentation.

k that’s fair

restored this section. LMK if you have suggestions for additional notes to put in here.
any thoughts on the LICENSE files mentioned in the OP?

I wouldn't restore the full section, but rather only keep a short explanation that it was deprecated/removed, and how to replace it (see the docstrings for some content for that)