doc/source/io.rst (+10 lines)
@@ -1845,6 +1845,7 @@ is ``None``. To explicitly force ``Series`` parsing, pass ``typ=series``
   seconds, milliseconds, microseconds or nanoseconds respectively.
 - ``lines`` : reads file as one json object per line.
 - ``encoding`` : The encoding to use to decode py3 bytes.
+- ``chunksize`` : when used in combination with ``lines=True``, return a JsonReader which reads in ``chunksize`` lines per iteration.

 The parser will raise one of ``ValueError/TypeError/AssertionError`` if the JSON is not parseable.
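To make these three keywords concrete, here is a small sketch (not part of the diff) of how they combine in a ``read_json`` call; the ``data.jsonl`` file name and the ``chunksize`` value are made up for illustration::

    import pandas as pd

    # Each line of the (hypothetical) data.jsonl file holds one JSON object.
    # lines=True parses it as line-delimited JSON; encoding decodes the bytes.
    df = pd.read_json('data.jsonl', lines=True, encoding='utf-8')

    # Adding chunksize returns a JsonReader instead of a DataFrame; iterating
    # over it yields a DataFrame of up to `chunksize` lines at a time.
    reader = pd.read_json('data.jsonl', lines=True, encoding='utf-8',
                          chunksize=10000)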
@@ -2049,6 +2050,10 @@ Line delimited json
 pandas is able to read and write line-delimited json files that are common in data processing pipelines
 using Hadoop or Spark.

+.. versionadded:: 0.21.0
+
+For line-delimited json files, pandas can also return an iterator which reads in ``chunksize`` lines at a time. This can be useful for large files or to read from a stream.
+
 .. ipython:: python

   jsonl = '''
@@ -2059,6 +2064,11 @@ using Hadoop or Spark.
   df
   df.to_json(orient='records', lines=True)

+  # reader is an iterator that returns `chunksize` lines each iteration
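The hunk above is cut off right after the comment, so the rest of the PR's example is not visible here. A minimal sketch of how such a ``chunksize`` reader can be consumed, assuming a small inline ``jsonl`` string rather than whatever the PR actually added::

    from io import StringIO

    import pandas as pd

    jsonl = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}\n'

    # reader is an iterator that returns `chunksize` lines each iteration
    reader = pd.read_json(StringIO(jsonl), lines=True, chunksize=1)

    for chunk in reader:
        # each chunk is a DataFrame with up to `chunksize` rows
        print(chunk)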