I've finally got around to revisiting this. I've added support to my fork of ujson for different output formats when encoding pandas data types:

In [4]: df = DataFrame([[1,2,3], [4,5,6]], index=['a', 'b'], columns=['x', 'y', 'z']) 

In [5]: ujson.encode(df, format="headers")
Out[5]: '{"columns":["x","y","z"],"index":["a","b"],"data":[[1,2,3],[4,5,6]]}'

In [6]: ujson.encode(df, format="records")
Out[6]: '[{"x":1,"y":2,"z":3},{"x":4,"y":5,"z":6}]'

In [7]: ujson.encode(df, format="indexed")
Out[7]: '{"a":{"x":1,"y":2,"z":3},"b":{"x":4,"y":5,"z":6}}'

In [8]: ujson.encode(df, format="column_indexed")
Out[8]: '{"x":{"a":1,"b":4},"y":{"a":2,"b":5},"z":{"a":3,"b":6}}'

If format isn't specified, encoding defaults to the column_indexed format, because it matches the output of to_dict() and can be passed straight to the DataFrame constructor. All of the encoding and iteration is performed in C inside ujson.
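For anyone who wants to compare the four layouts without building the fork, here is a rough pure-Python sketch that reproduces the outputs shown above using only the standard json module. The helper name encode_frame and its arguments are made up for illustration; the fork does this work in C and takes a DataFrame directly.

```python
import json


def encode_frame(columns, index, rows, format="column_indexed"):
    """Hypothetical helper: build each layout from plain lists, then serialise.

    columns -- column labels, index -- row labels, rows -- list of row value lists.
    This only mirrors the JSON shapes; it is not how the C encoder works.
    """
    if format == "headers":
        # Labels and data kept as three separate arrays.
        obj = {"columns": list(columns), "index": list(index),
               "data": [list(r) for r in rows]}
    elif format == "records":
        # One object per row; the row labels are dropped.
        obj = [dict(zip(columns, r)) for r in rows]
    elif format == "indexed":
        # Outer keys are row labels, inner keys are column labels.
        obj = {label: dict(zip(columns, r)) for label, r in zip(index, rows)}
    elif format == "column_indexed":
        # Outer keys are column labels, inner keys are row labels.
        obj = {col: {label: r[j] for label, r in zip(index, rows)}
               for j, col in enumerate(columns)}
    else:
        raise ValueError("unknown format: %r" % (format,))
    # Compact separators match the whitespace-free output shown above.
    return json.dumps(obj, separators=(",", ":"))


if __name__ == "__main__":
    for fmt in ("headers", "records", "indexed", "column_indexed"):
        print(fmt, encode_frame(["x", "y", "z"], ["a", "b"],
                                [[1, 2, 3], [4, 5, 6]], fmt))
```

Decoding the column_indexed string with json.loads gives a dict of dicts keyed by column and then by row label, which is the shape the default layout is meant to share with to_dict().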

I've added similar support for Series and Index. Some of the formats don't suit them, but the encoder tries to handle them sensibly:

In [9]: s = Series([10, 20, 30, 40, 50, 60], name="myseries", index=[6,7,8,9,10,15])

In [10]: ujson.encode(s, format="headers")
Out[10]: '{"name":"myseries","index":[6,7,8,9,10,15],"data":[10,20,30,40,50,60]}'

In [11]: ujson.encode(s, format="records")
Out[11]: '[10,20,30,40,50,60]'

In [12]: ujson.encode(s, format="indexed")
Out[12]: '{"6":10,"7":20,"8":30,"9":40,"10":50,"15":60}'

In [13]: ujson.encode(s, format="column_indexed")
Out[13]: '{"6":10,"7":20,"8":30,"9":40,"10":50,"15":60}'

In [14]: i = Index([23, 45, 18, 98, 43, 11], name="myindex")

In [15]: ujson.encode(i, format="headers")
Out[15]: '{"name":"myindex","data":[23,45,18,98,43,11]}'

In [16]: ujson.encode(i, format="records")
Out[16]: '[23,45,18,98,43,11]'

In [17]: ujson.encode(i, format="indexed")
Out[17]: '[23,45,18,98,43,11]'

In [18]: ujson.encode(i, format="column_indexed")
Out[18]: '[23,45,18,98,43,11]'
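The fallback behaviour for the one-dimensional types can be summarised in a similar pure-Python sketch. As before, encode_series and encode_index are hypothetical names used only to illustrate the JSON shapes in the transcripts above; they are not part of the fork's API.

```python
import json


def encode_series(values, index, name=None, format="column_indexed"):
    """Hypothetical helper mirroring the Series outputs shown above."""
    if format == "headers":
        obj = {"name": name, "index": list(index), "data": list(values)}
    elif format == "records":
        # No columns to key on, so this degrades to a plain array of values.
        obj = list(values)
    elif format in ("indexed", "column_indexed"):
        # Only one axis exists, so both layouts give the same label -> value map.
        # json.dumps turns the integer labels into string keys.
        # Note: duplicate labels would collapse in this dict-based sketch.
        obj = dict(zip(index, values))
    else:
        raise ValueError("unknown format: %r" % (format,))
    return json.dumps(obj, separators=(",", ":"))


def encode_index(values, name=None, format="column_indexed"):
    """Hypothetical helper mirroring the Index outputs shown above."""
    if format == "headers":
        obj = {"name": name, "data": list(values)}
    elif format in ("records", "indexed", "column_indexed"):
        # An Index has no labels of its own, so every other layout is an array.
        obj = list(values)
    else:
        raise ValueError("unknown format: %r" % (format,))
    return json.dumps(obj, separators=(",", ":"))


if __name__ == "__main__":
    print(encode_series([10, 20, 30], [6, 7, 8], name="myseries", format="indexed"))
    print(encode_index([23, 45, 18], name="myindex", format="headers"))
```

One consequence worth noting is that integer labels become JSON strings in the indexed layouts, so a decoder has to convert them back if the original label type matters.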

My next step is to integrate this into pandas, but I'd welcome any comments first. Are there values for the format argument that would fit better with existing pandas code?

Add JSON export option for DataFrame #631
