REGR: prefer user-provided mode #39440
@@ -857,12 +857,15 @@ def file_exists(filepath_or_buffer: FilePathOrBuffer) -> bool:

def _is_binary_mode(handle: FilePathOrBuffer, mode: str) -> bool:
    """Whether the handle is opened in binary mode"""
    # specified by user
    if "t" in mode or "b" in mode:
why is t here? is this tested?
"t" is used to explicitly request text mode, but it is usually omitted. One of read_json or to_json is using it. I will add a test with a handle that needs that.
kk
@twoertwein ping when ready here (merge master as well)
I assume that the occasional errors in
yeah that makes sense. separate PR though
thanks @twoertwein
@meeseeksdev backport 1.2.x
Something went wrong ... Please have a look at my logs.
Co-authored-by: Torsten Wörtwein <[email protected]>
Perhaps I'm misusing the API, but this seems to have broken the ability to write gzipped JSON. I haven't fully walked the code path, but here is what I'm seeing:

Reproduction:

    from io import BytesIO
    import pandas as pd
    dataframe = pd.DataFrame([1, 2, 3], columns=['a'])
    object_stream = BytesIO()
    dataframe.to_json(object_stream, compression='gzip', orient='records', lines=True)

Result:

        if path_or_buf is not None:
            # apply compression and byte/text conversion
            with get_handle(
                path_or_buf, "wt", compression=compression, storage_options=storage_options
            ) as handles:
    >           handles.handle.write(s)
    E           TypeError: a bytes-like object is required, not 'str'

    ../../.env/lib/python3.7/site-packages/pandas/io/json/_json.py:105: TypeError

From what I can tell, _json's get_handle uses a static 'wt' mode which now, with the latest changes to common's _is_binary_mode, will always return false rather than check the path_or_buf instance type, thereby omitting the b flag on the mode passed to the rest of get_handle. Is this expected behavior, or am I misusing the Pandas API? Note that the reproduction code works with Pandas 1.2.1.
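For anyone hitting the same TypeError, the failure can be illustrated with the standard library alone. The sketch below is not pandas code: `write_gzipped_text` is a hypothetical helper, and the literal JSON string stands in for the string that `to_json()` returns when it is called without a path. It shows that writing `str` directly to a `BytesIO` fails, while wrapping the buffer in a gzip text layer works.

```python
import gzip
import io


def write_gzipped_text(buffer: io.BytesIO, text: str) -> None:
    """Hypothetical helper: gzip-compress `text` into a binary buffer."""
    # gzip.open accepts an existing file object; "wt" adds a text layer
    # on top of the compressor, so str can be written to a bytes buffer.
    # newline="\n" disables platform-specific newline translation.
    with gzip.open(buffer, "wt", encoding="utf-8", newline="\n") as fh:
        fh.write(text)


# Stand-in for the output of DataFrame.to_json(orient="records", lines=True).
json_lines = '{"a":1}\n{"a":2}\n{"a":3}\n'

# Writing str straight into a binary buffer raises TypeError,
# which is the failure mode shown in the traceback above.
raw = io.BytesIO()
try:
    raw.write(json_lines)
    direct_write_failed = False
except TypeError:
    direct_write_failed = True

# Going through the gzip text layer succeeds and leaves the buffer open.
object_stream = io.BytesIO()
write_gzipped_text(object_stream, json_lines)
round_trip = gzip.decompress(object_stream.getvalue()).decode("utf-8")
```

As a stopgap on an affected version, the same pattern could be applied to the string returned by `to_json()`; whether that is preferable to pinning the pandas version depends on the caller.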
Follow-up to #39253.
Make sure that the user-provided mode is always preferred (if it contains 't' or 'b') and extend the list of text-only classes.
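The rule described here can be sketched as follows. This is a simplified illustration, not the actual pandas implementation: `is_binary_mode_sketch` and `TEXT_ONLY_CLASSES` are hypothetical names, and the real list of text-only classes in pandas may differ. Only the first check (`"t" in mode or "b" in mode`) is taken from the diff above; the fallback logic is an assumption.

```python
import io

# Hypothetical list of classes that only accept str (assumption, not
# the list pandas actually uses).
TEXT_ONLY_CLASSES = (io.TextIOBase,)


def is_binary_mode_sketch(handle, mode: str) -> bool:
    """Guess whether `handle` should be treated as binary."""
    # An explicit "t" or "b" supplied by the caller always wins.
    if "t" in mode or "b" in mode:
        return "b" in mode
    # Otherwise fall back to inspecting the handle itself.
    if isinstance(handle, TEXT_ONLY_CLASSES):
        return False
    if isinstance(handle, (io.RawIOBase, io.BufferedIOBase)):
        return True
    # Paths and unknown objects: use the handle's own mode if it has one.
    return "b" in getattr(handle, "mode", mode)
```

Under this sketch, a `BytesIO` passed with mode `"w"` is detected as binary, but the same buffer passed with a hard-coded `"wt"` is treated as text, which matches the behavior reported in the gzipped JSON comment.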