-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: Make read_json less stateful #59124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if result: | ||
self.obj = obj | ||
def _try_convert_types(self, obj: Series) -> Series: | ||
obj, _ = self._try_convert_data("data", obj, convert_dates=self.convert_dates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing code is pretty strange, but wouldn't this return something different in the case where the result
value was False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think obj
should be the same if result value was False i.e. the conversion was a no-op. I'm relying on that fact here that a no-op result should just be passed through here with _try_convert_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK - definitely not a fan of relying on the implementation of another function like that...but that can be cleaned up in a separate PR
pandas/io/json/_json.py
Outdated
return False | ||
|
||
self._process_converter(lambda col, c: self._try_convert_to_date(c), filt=is_ok) | ||
obj = self._process_converter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to make less stateless while maintaining these as separate functions? I find the new code harder to read, especially with the subtle difference in arguments to self._process_converter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing. I just pushed a change to remove _process_converter
and make things simpler
@@ -792,7 +792,7 @@ def test_frame_from_json_precise_float(self): | |||
|
|||
def test_typ(self): | |||
s = Series(range(6), index=["a", "b", "c", "d", "e", "f"], dtype="int64") | |||
result = read_json(StringIO(s.to_json()), typ=None) | |||
result = read_json(StringIO(s.to_json()), typ="series") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did the tests need to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test didn't align with the docs/typing of typ
accepting None
. I added validation in this PR to ensure this case fails now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm OK. That's super strange we allowed that to begin with...but hopefully harmless taking it away
return obj | ||
return SeriesParser(json, **kwargs).parse() | ||
else: | ||
raise ValueError(f"{typ=} must be 'frame' or 'series'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should add a bugfix note for this at least to have it visibile. Something to the effect of "read_json now raises if the typ argument is neither frame nor series"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice to have a whatsnew note, but otherwise lgtm
Also adds some validation for the
typ
keyword which aligns with the docs/typing