ENH: JSON #3876


Merged · 4 commits · Jun 13, 2013
95 changes: 80 additions & 15 deletions doc/source/io.rst
@@ -954,13 +954,21 @@ with optional parameters:

- path_or_buf : the pathname or buffer to write the output
This can be ``None`` in which case a JSON string is returned
- orient : The format of the JSON string, default is ``index`` for ``Series``, ``columns`` for ``DataFrame``
- orient :

* split : dict like {index -> [index], columns -> [columns], data -> [values]}
* records : list like [{column -> value}, ... , {column -> value}]
* index : dict like {index -> {column -> value}}
* columns : dict like {column -> {index -> value}}
* values : just the values array
Series :
default is 'index', allowed values are: {'split','records','index'}

DataFrame :
default is 'columns', allowed values are: {'split','records','index','columns','values'}

The format of the JSON string

* split : dict like {index -> [index], columns -> [columns], data -> [values]}
* records : list like [{column -> value}, ... , {column -> value}]
* index : dict like {index -> {column -> value}}
* columns : dict like {column -> {index -> value}}
* values : just the values array

- date_format : type of date conversion (epoch = epoch milliseconds, iso = ISO8601), default is epoch
- double_precision : The number of decimal places to use when encoding floating point values, default 10.
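As a minimal, illustrative sketch of how the ``orient`` values differ (the frame, labels and values below are made up for demonstration and are not part of this PR; the standard pandas imports are assumed, as elsewhere in this document):

.. ipython:: python

   dfo = DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['x', 'y'])

   # 'columns' (the DataFrame default): {column -> {index -> value}}
   dfo.to_json(orient='columns')

   # 'split': separate entries for index, columns and data
   dfo.to_json(orient='split')

   # 'records': one dict per row; the index labels are dropped
   dfo.to_json(orient='records')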
@@ -989,6 +997,8 @@ Writing to a file, with a date index and a date column

dfj2 = dfj.copy()
dfj2['date'] = Timestamp('20130101')
dfj2['ints'] = range(5)
dfj2['bools'] = True
dfj2.index = date_range('20130101',periods=5)
dfj2.to_json('test.json')
open('test.json').read()
@@ -1005,31 +1015,86 @@ is ``None``. To explicitly force ``Series`` parsing, pass ``typ=series``
is expected. For instance, a local file could be
file://localhost/path/to/table.json
- typ : type of object to recover (series or frame), default 'frame'
- orient : The format of the JSON string, one of the following
- orient :

Series :
default is 'index', allowed values are: {'split','records','index'}

DataFrame :
default is 'columns', allowed values are: {'split','records','index','columns','values'}

The format of the JSON string

* split : dict like {index -> [index], name -> name, data -> [values]}
* records : list like [value, ... , value]
* index : dict like {index -> value}
* split : dict like {index -> [index], columns -> [columns], data -> [values]}
* records : list like [{column -> value}, ... , {column -> value}]
* index : dict like {index -> {column -> value}}
* columns : dict like {column -> {index -> value}}
* values : just the values array

- dtype : dtype of the resulting object
- numpy : direct decoding to numpy arrays. default True but falls back to standard decoding if a problem occurs.
- parse_dates : a list of columns to parse for dates; If True, then try to parse datelike columns, default is False
- dtype : if True, infer dtypes; if a dict of column to dtype, then use those; if False, then don't infer dtypes at all; default is True, applies only to the data
- convert_axes : boolean, try to convert the axes to the proper dtypes, default is True
- convert_dates : a list of columns to parse for dates; If True, then try to parse datelike columns, default is True
- keep_default_dates : boolean, default True. If parsing dates, then parse the default datelike columns
- numpy: direct decoding to numpy arrays. default is False;
Note that the JSON ordering **MUST** be the same for each term if ``numpy=True``
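As a rough sketch of how ``orient`` works on the reading side (the frame and values below are illustrative), the layout passed to ``read_json`` has to match the layout that was written:

.. ipython:: python

   dfr = DataFrame({'A': [1, 2], 'B': [3, 4]})

   # write with a non-default orient, then tell read_json the same orient
   js = dfr.to_json(orient='split')
   pd.read_json(js, orient='split')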

The parser will raise one of ``ValueError/TypeError/AssertionError`` if the JSON is
not parsable.
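A quick, illustrative way to see this (the malformed string below is made up):

.. ipython:: python

   try:
       pd.read_json('{"A": {"0": 1, ')
   except (ValueError, TypeError, AssertionError) as e:
       print('failed to parse: %s' % e)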

The default of ``convert_axes=True``, ``dtype=True``, and ``convert_dates=True`` will try to parse the axes and all of the data
into appropriate types, including dates. If you need to override specific dtypes, pass a dict to ``dtype``. ``convert_axes`` should only
be set to ``False`` if you need to preserve string-like numbers (e.g. '1', '2') in an axis.

.. warning::

When reading JSON data, automatic coercion into dtypes has some quirks:

* an index can come back in a different order, i.e. the returned order is not guaranteed to be the same as before serialization
* a column that was ``float`` data will be converted to ``integer`` if it can be done safely, e.g. a column of ``1.``
* bool columns will be converted to ``integer`` on reconstruction

Thus there are times when you may want to specify specific dtypes via the ``dtype`` keyword argument.
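A small sketch of the last quirk and the ``dtype`` escape hatch (the column name and values are illustrative):

.. ipython:: python

   dfb = DataFrame({'flag': [True, False, True]})
   dfb.dtypes

   # per the warning above, the bool column may come back as integer ...
   pd.read_json(dfb.to_json()).dtypes

   # ... so force the dtype explicitly if you need to keep it boolean
   pd.read_json(dfb.to_json(), dtype={'flag': 'bool'}).dtypes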

Reading from a JSON string

.. ipython:: python

pd.read_json(json)

Reading from a file, parsing dates
Reading from a file

.. ipython:: python

pd.read_json('test.json')

Don't convert any data (but still convert axes and dates)

.. ipython:: python

pd.read_json('test.json',dtype=object).dtypes

Specify how I want to convert data

.. ipython:: python

pd.read_json('test.json',dtype={'A' : 'float32', 'bools' : 'int8'}).dtypes

I like my string indices

.. ipython:: python

pd.read_json('test.json',parse_dates=True)
si = DataFrame(np.zeros((4, 4)),
columns=range(4),
index=[str(i) for i in range(4)])
si
si.index
si.columns
json = si.to_json()

sij = pd.read_json(json,convert_axes=False)
sij
sij.index
sij.columns

.. ipython:: python
:suppress:
12 changes: 10 additions & 2 deletions pandas/core/generic.py
@@ -507,8 +507,15 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
----------
path_or_buf : the path or buffer to write the result string
if this is None, return a StringIO of the converted string
orient : {'split', 'records', 'index', 'columns', 'values'},
default is 'index' for Series, 'columns' for DataFrame
orient :

Series :
default is 'index'
allowed values are: {'split','records','index'}

DataFrame :
default is 'columns'
allowed values are: {'split','records','index','columns','values'}

The format of the JSON string
split : dict like
@@ -517,6 +524,7 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
index : dict like {index -> {column -> value}}
columns : dict like {column -> {index -> value}}
values : just the values array

date_format : type of date conversion (epoch = epoch milliseconds, iso = ISO8601),
default is epoch
double_precision : The number of decimal places to use when encoding
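
A brief illustration of the per-type defaults this docstring now spells out (the objects below are made up for demonstration, not taken from the PR's tests): a ``Series`` serializes as ``{index -> value}`` by default, a ``DataFrame`` as ``{column -> {index -> value}}``.

.. ipython:: python

   s = Series([1, 2, 3], index=['a', 'b', 'c'])
   s.to_json()                  # default orient='index'

   df = DataFrame({'A': [1, 2]}, index=['a', 'b'])
   df.to_json()                 # default orient='columns'
   df.to_json(orient='values')  # 'values' is not among the allowed Series orients above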