-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
read_json support for orient="table" #19039
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/io/json/table_schema.py
Outdated
@@ -103,6 +110,28 @@ def make_field(arr, dtype=None): | |||
return field | |||
|
|||
|
|||
def revert_field(field): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a docstring
pandas/io/json/table_schema.py
Outdated
def parse_table_schema(json, precise_float): | ||
""" | ||
Builds a DataFrame from a given schema | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand this docstring to discuss parameters and return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, document the NotImplementedError
cases.
|
||
@pytest.mark.parametrize("index_names", [[None, None], ['foo', 'bar']]) | ||
def test_multiindex(self, index_names): | ||
# GH 18912 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you adding the issue reference here only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only because the issue was where I pulled the code sample from. I was hesitant to include in the first place because the issue ends up touching on something larger than originally intended, namely that there is no read_json
support for orient="table"
. Would you rather remove or do you think it's worth moving that reference to all of the test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you said, given that your PR adds an entire new class of tests, I actually think what you've done is fine (since the issue directly points to the test). I was looking for some clarification.
@@ -839,6 +839,9 @@ def _parse_no_numpy(self): | |||
elif orient == "index": | |||
self.obj = DataFrame( | |||
loads(json, precise_float=self.precise_float), dtype=None).T | |||
elif orient == 'table': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add to docstring that supported was added for this option in 0.23.0
.
4319335
to
00a44b1
Compare
Codecov Report
@@ Coverage Diff @@
## master #19039 +/- ##
==========================================
+ Coverage 91.53% 91.54% +<.01%
==========================================
Files 148 148
Lines 48688 48729 +41
==========================================
+ Hits 44566 44607 +41
Misses 4122 4122
Continue to review full report at Codecov.
|
pandas/io/json/table_schema.py
Outdated
@@ -103,6 +108,40 @@ def make_field(arr, dtype=None): | |||
return field | |||
|
|||
|
|||
def revert_field(field): | |||
''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use triple-double quotes rather than single
pandas/io/json/table_schema.py
Outdated
@@ -103,6 +108,40 @@ def make_field(arr, dtype=None): | |||
return field | |||
|
|||
|
|||
def revert_field(field): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename make_field and this to something more descriptive. make
convert_json_typename_to_pandas (and the reverse)
pandas/io/json/table_schema.py
Outdated
Returns | ||
------- | ||
dtype | ||
''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add an example of tz and categorical here
|
||
class TestTableOrientReader(object): | ||
|
||
def test_integer(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah would be best to paramaterize these tests
00a44b1
to
81f1bd2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small comments, otherwise lgtm.
data = [1., 2., 3.] | ||
kinds = [pd.Series(pd.to_datetime(data), name='values'), | ||
pd.to_datetime(data)] | ||
for kind in kinds: | ||
result = make_field(kind) | ||
result = convert_pandas_type_to_json_field(kind) | ||
expected = {"name": "values", "type": 'datetime'} | ||
assert result == expected | ||
|
||
kinds = [pd.Series(pd.to_datetime(data, utc=True), name='values'), | ||
pd.to_datetime(data, utc=True)] | ||
for kind in kinds: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should come back later and parametrize these tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #19070 to handle this
|
||
class TestTableOrientReader(object): | ||
|
||
@pytest.mark.parametrize("inp,exp", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these first 2 can just be functions, don't need to be in this class (should put next to the tests for the convert_pandas_type_to_json)
81f1bd2
to
09bc53a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good on a quick skim @WillAyd, thanks.
doc/source/whatsnew/v0.23.0.txt
Outdated
@@ -145,6 +145,7 @@ Other Enhancements | |||
- ``Resampler`` objects now have a functioning :attr:`~pandas.core.resample.Resampler.pipe` method. | |||
Previously, calls to ``pipe`` were diverted to the ``mean`` method (:issue:`17905`). | |||
- :func:`~pandas.api.types.is_scalar` now returns ``True`` for ``DateOffset`` objects (:issue:`18943`). | |||
- :func:`read_json` now supports ``table`` as a value to the ``orient`` argument (:issue:`18912`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make this into a full example here if you want. People have wanted a roundtrippable JSON format for a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll add that and re-push
326ba31
to
32ae80c
Compare
doc/source/whatsnew/v0.23.0.txt
Outdated
}, index=pd.Index(range(4), name='idx')) | ||
df | ||
|
||
Previous Behavior: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't use this as a previous example, yes its the same 'idea'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What don't you like about it as an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need the Previous Behavior at all, just show the orient='table' part. you can say that we currently can;t round trip but you don't really need to show this I think.
doc/source/whatsnew/v0.23.0.txt
Outdated
|
||
.. code-block:: ipython | ||
|
||
In [29]: df.to_json("test.json", orient='table') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be an ipython block
doc/source/whatsnew/v0.23.0.txt
Outdated
1 b 2018-01-02 2 b | ||
2 c 2018-01-03 3 c | ||
3 d 2018-01-04 4 c | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show the df dtypes as well
doc/source/whatsnew/v0.23.0.txt
Outdated
2 c 2018-01-03 3 c | ||
3 d 2018-01-04 4 c | ||
|
||
Please note that the string `index` is not supported with the round trip format, as it is used by default in ``write_json`` to indicate a missing index name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be in the doc-string / io.rst doc section as well (and add this example there too)
32ae80c
to
071f26c
Compare
071f26c
to
36cab57
Compare
thanks @WillAyd very nice! keep em coming! |
closes Roundtrip orient for JSON #9146
git diff upstream/master -u -- "*.py" | flake8 --diff
This is a starting point for
read_json
supportingorient='table'
for basic types. I'm explicitly raising in the instances that don't work, namely timezone aware datetimes andTimedelta
. I marked floats with no decimal values as an xfail (these are being converted toint64
by the_try_convert_data
method).Timedelta
would be best fixed in a separate change providing a constructor equivalent to theisoformat
method (see #19040).The new tests could also all be parametrized, but I've kept as is before going too far with this change in case of feedback