-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Deprecate literal json string input to read_json #53330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…literal strings to read_json
@mroeschke This looks like its going to be a pretty bulky PR after all of the unit tests are updated. Was hoping that someone could make sure that I'm on the right track before I finish updating the remainder of the unit tests? |
@@ -750,6 +740,8 @@ def read_json( | |||
}}\ | |||
' | |||
""" | |||
if isinstance(path_or_buf, str): | |||
raise TypeError("cannot pass literal json to pandas.read_json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can only add a warning for now and then with 3.0 (or 2.2?) remove support for json literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, that would be drastically different compared to the original ask in the GitHub issue. Though, that would follow along the same lines of what we have seen with DataFrame.concat
and DataFrame.append
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I wrote the bug as ENH
, my plan (although thanks for picking it up since I didn't get enough roundtoits until now), was to introduce a deprecation warning and then remove in an upcoming release. I think it's too user- facing to just remove immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wence- That makes perfect sense.
I may hit pause on this until for a day or two until we hear from mroeschke
or another reviewer for official direction. I don't want to steer too far away from the original acceptance criteria / guidance in the Feature Description, although I think that they'll probably echo what you're suggesting 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah @twoertwein and @wence- are correct here. We will need to add a warning to deprecate the current behavior of parsing string json in this PR before we can outright raise an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mroeschke Awesome! Thanks!
@@ -750,6 +740,8 @@ def read_json( | |||
}}\ | |||
' | |||
""" | |||
if isinstance(path_or_buf, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check is probably too restrictive. I think we want to allow
pd.read_json("path/to/file.json")
Going forward. We just don't want
pd.read_json("{json: object}") # or however it's spelt
So rather than checking here. The deprecation probably wants enforced in JsonReader
, perhaps like this?
diff --git a/pandas/io/json/_json.py b/pandas/io/json/_json.py
index 8775c65f14..78f541d3df 100644
--- a/pandas/io/json/_json.py
+++ b/pandas/io/json/_json.py
@@ -18,6 +18,7 @@ from typing import (
TypeVar,
overload,
)
+import warnings
import numpy as np
@@ -30,6 +31,7 @@ from pandas._libs.tslibs import iNaT
from pandas.compat._optional import import_optional_dependency
from pandas.errors import AbstractMethodError
from pandas.util._decorators import doc
+from pandas.util._exceptions import find_stack_level
from pandas.util._validators import check_dtype_backend
from pandas.core.dtypes.common import ensure_str
@@ -916,7 +918,14 @@ class JsonReader(abc.Iterator, Generic[FrameSeriesStrT]):
and not file_exists(filepath_or_buffer)
):
raise FileNotFoundError(f"File {filepath_or_buffer} does not exist")
-
+ else:
+ warnings.warn(
+ "Passing literal json to 'read_json' is deprecated and "
+ "will be removed in a future version. To read from a "
+ "literal string, wrap it in a 'StringIO' object.",
+ FutureWarning,
+ stacklevel=find_stack_level(),
+ )
return filepath_or_buffer
def _combine_lines(self, lines) -> str:
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the value in doing that. Though, that doesn't seem to be what our repo maintainers are going after
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the value in doing that. Though, that doesn't seem to be what our repo maintainers are going after
I'm quite sure @mroeschke wants to allow os.PathLike
and str
that represent file paths (that would be consistent with read/to_csv). My understanding of the issue is resolving the ambiguity of whether a string is a path or a "JSON-string" (that has caused issues in the past). When making changes that break current behavior (disallowing JSON-strings), we have to have at least one release with a deprecation warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh okay! I was thinking that they would want to turn strings into StringIO objects! And yeah I definitely understand that we would want to have at least one release with a deprecation warning! 💯
Do we want to achieve the consistency as mentioned in #52271 (comment) in this PR? If not, maybe open a follow-up issue with the xref. |
read_json
#52271doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.