Commit 3d98544

Merge pull request #3876 from jreback/json_rt
ENH: JSON
2 parents cf0db6f + 740b10f commit 3d98544

4 files changed, +410 -183 lines changed

doc/source/io.rst

+80 -15
@@ -954,13 +954,21 @@ with optional parameters:

  - path_or_buf : the pathname or buffer to write the output
  This can be ``None`` in which case a JSON string is returned
- - orient : The format of the JSON string, default is ``index`` for ``Series``, ``columns`` for ``DataFrame``
+ - orient :

- * split : dict like {index -> [index], columns -> [columns], data -> [values]}
- * records : list like [{column -> value}, ... , {column -> value}]
- * index : dict like {index -> {column -> value}}
- * columns : dict like {column -> {index -> value}}
- * values : just the values array
+ Series :
+ default is 'index', allowed values are: {'split','records','index'}
+
+ DataFrame :
+ default is 'columns', allowed values are: {'split','records','index','columns','values'}
+
+ The format of the JSON string
+
+ * split : dict like {index -> [index], columns -> [columns], data -> [values]}
+ * records : list like [{column -> value}, ... , {column -> value}]
+ * index : dict like {index -> {column -> value}}
+ * columns : dict like {column -> {index -> value}}
+ * values : just the values array

  - date_format : type of date conversion (epoch = epoch milliseconds, iso = ISO8601), default is epoch
  - double_precision : The number of decimal places to use when encoding floating point values, default 10.
@@ -989,6 +997,8 @@ Writing to a file, with a date index and a date column

  dfj2 = dfj.copy()
  dfj2['date'] = Timestamp('20130101')
+ dfj2['ints'] = range(5)
+ dfj2['bools'] = True
  dfj2.index = date_range('20130101',periods=5)
  dfj2.to_json('test.json')
  open('test.json').read()
@@ -1005,31 +1015,86 @@ is ``None``. To explicity force ``Series`` parsing, pass ``typ=series``
  is expected. For instance, a local file could be
  file ://localhost/path/to/table.json
  - typ : type of object to recover (series or frame), default 'frame'
- - orient : The format of the JSON string, one of the following
+ - orient :
+
+ Series :
+ default is 'index', allowed values are: {'split','records','index'}
+
+ DataFrame :
+ default is 'columns', allowed values are: {'split','records','index','columns','values'}
+
+ The format of the JSON string

- * split : dict like {index -> [index], name -> name, data -> [values]}
- * records : list like [value, ... , value]
- * index : dict like {index -> value}
+ * split : dict like {index -> [index], columns -> [columns], data -> [values]}
+ * records : list like [{column -> value}, ... , {column -> value}]
+ * index : dict like {index -> {column -> value}}
+ * columns : dict like {column -> {index -> value}}
+ * values : just the values array

- - dtype : dtype of the resulting object
- - numpy : direct decoding to numpy arrays. default True but falls back to standard decoding if a problem occurs.
- - parse_dates : a list of columns to parse for dates; If True, then try to parse datelike columns, default is False
+ - dtype : if True, infer dtypes, if a dict of column to dtype, then use those, if False, then don't infer dtypes at all, default is True, apply only to the data
+ - convert_axes : boolean, try to convert the axes to the proper dtypes, default is True
+ - convert_dates : a list of columns to parse for dates; If True, then try to parse datelike columns, default is True
  - keep_default_dates : boolean, default True. If parsing dates, then parse the default datelike columns
+ - numpy: direct decoding to numpy arrays. default is False;
+ Note that the JSON ordering **MUST** be the same for each term if ``numpy=True``

  The parser will raise one of ``ValueError/TypeError/AssertionError`` if the JSON is
  not parsable.

+ The default of ``convert_axes=True``, ``dtype=True``, and ``convert_dates=True`` will try to parse the axes, and all of the data
+ into appropriate types, including dates. If you need to override specific dtypes, pass a dict to ``dtype``. ``convert_axes`` should only
+ be set to ``False`` if you need to preserve string-like numbers (e.g. '1', '2') in an axes.
+
+ .. warning::
+
+ When reading JSON data, automatic coercing into dtypes has some quirks:
+
+ * an index can be in a different order, that is the returned order is not guaranteed to be the same as before serialization
+ * a column that was ``float`` data can safely be converted to ``integer``, e.g. a column of ``1.``
+ * bool columns will be converted to ``integer`` on reconstruction
+
+ Thus there are times where you may want to specify specific dtypes via the ``dtype`` keyword argument.
+
  Reading from a JSON string

  .. ipython:: python

  pd.read_json(json)

- Reading from a file, parsing dates
+ Reading from a file
+
+ .. ipython:: python
+
+ pd.read_json('test.json')
+
+ Don't convert any data (but still convert axes and dates)
+
+ .. ipython:: python
+
+ pd.read_json('test.json',dtype=object).dtypes
+
+ Specify how I want to convert data
+
+ .. ipython:: python
+
+ pd.read_json('test.json',dtype={'A' : 'float32', 'bools' : 'int8'}).dtypes
+
+ I like my string indicies

  .. ipython:: python

- pd.read_json('test.json',parse_dates=True)
+ si = DataFrame(np.zeros((4, 4)),
+ columns=range(4),
+ index=[str(i) for i in range(4)])
+ si
+ si.index
+ si.columns
+ json = si.to_json()
+
+ sij = pd.read_json(json,convert_axes=False)
+ sij
+ sij.index
+ sij.columns

  .. ipython:: python
  :suppress:
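
Note on the ``convert_axes`` text added above: a minimal sketch, using only the standard-library ``json`` module, of why axis labels need a conversion step after a round trip. JSON object keys are always strings, so integer index labels come back as ``'0'``, ``'1'``, and so on. The ``frame`` dict and the ``convert_axis`` helper are hypothetical names for illustration only; this is not the pandas implementation.

```python
import json

# Hypothetical stand-in for a frame in the default 'columns' orient:
# {column -> {index -> value}}, with integer index labels 0..2.
frame = {"A": {0: 1.0, 1: 2.0, 2: 3.0}}

encoded = json.dumps(frame)
decoded = json.loads(encoded)

# JSON object keys are always strings, so the integer labels
# come back as '0', '1', '2' after decoding.
labels = list(decoded["A"])


def convert_axis(labels):
    """Illustrative helper (an assumption, not pandas code).

    Coerce labels to int when every label parses as an integer,
    otherwise return them unchanged.
    """
    try:
        return [int(label) for label in labels]
    except ValueError:
        return list(labels)


converted = convert_axis(labels)
print(labels, converted)
```

Skipping the conversion step keeps the string labels as they were decoded, which is the case the ``convert_axes=False`` example in the diff is aimed at.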

pandas/core/generic.py

+10 -2
@@ -522,8 +522,15 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
  ----------
  path_or_buf : the path or buffer to write the result string
  if this is None, return a StringIO of the converted string
- orient : {'split', 'records', 'index', 'columns', 'values'},
- default is 'index' for Series, 'columns' for DataFrame
+ orient :
+
+ Series :
+ default is 'index'
+ allowed values are: {'split','records','index'}
+
+ DataFrame :
+ default is 'columns'
+ allowed values are: {'split','records','index','columns','values'}

  The format of the JSON string
  split : dict like
@@ -532,6 +539,7 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
  index : dict like {index -> {column -> value}}
  columns : dict like {column -> {index -> value}}
  values : just the values array
+
  date_format : type of date conversion (epoch = epoch milliseconds, iso = ISO8601),
  default is epoch
  double_precision : The number of decimal places to use when encoding
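
Note on the ``orient`` values in the docstring above: a rough sketch, in plain Python, of the five layouts as the docstring describes them. The table contents and the ``layout`` function are hypothetical and exist only to make the shapes concrete; the actual encoder in pandas may differ in detail.

```python
import json

# Hypothetical table: two index labels, two columns.
index = ["r1", "r2"]
columns = ["a", "b"]
values = [[1, 2], [3, 4]]  # one inner list per row


def layout(orient):
    """Build the structure described for each orient (illustration only)."""
    if orient == "split":
        # {index -> [index], columns -> [columns], data -> [values]}
        return {"index": index, "columns": columns, "data": values}
    if orient == "records":
        # [{column -> value}, ... , {column -> value}]
        return [dict(zip(columns, row)) for row in values]
    if orient == "index":
        # {index -> {column -> value}}
        return {i: dict(zip(columns, row)) for i, row in zip(index, values)}
    if orient == "columns":
        # {column -> {index -> value}}
        return {
            c: {i: row[j] for i, row in zip(index, values)}
            for j, c in enumerate(columns)
        }
    if orient == "values":
        # just the values array
        return values
    raise ValueError("unknown orient: %r" % orient)


for orient in ("split", "records", "index", "columns", "values"):
    print(orient, json.dumps(layout(orient)))
```

The ``records`` and ``values`` layouts drop the index labels entirely, so a reader cannot recover them from the JSON alone.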
