Commit 69e2f01

Add whatsnew entry
1 parent 39010eb commit 69e2f01

1 file changed

+38
-2
lines changed

doc/source/whatsnew/v1.4.0.rst

@@ -186,9 +186,45 @@ Now the float-dtype is respected. Since the common dtype for these DataFrames is
 
     res
 
-.. _whatsnew_140.notable_bug_fixes.notable_bug_fix3:
+.. _whatsnew_140.notable_bug_fixes.write_compliant_parquet_nested_type:
 
-notable_bug_fix3
+Write compliant Parquet nested types if possible
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When using :meth:`DataFrame.to_parquet` to write a DataFrame to Parquet, if any of the columns contained arrays
+of values, the :mod:`pyarrow` engine would write them as a non-compliant nested type. This is now fixed when the installed
+version of PyArrow is at least ``4.0.0``.
+
+https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#nested-types
+
+.. ipython:: python
+
+    import pandas as pd
+    import pyarrow.parquet as pq
+
+    df = pd.DataFrame({"int_array_col": [[1, 2, 3], [4, 5, 6]]})
+    df.to_parquet("/tmp/sample_df")
+    parquet_table = pq.read_table("/tmp/sample_df")
+
+*Previous behavior*:
+
+.. code-block:: ipython
+
+    In [4]: parquet_table.schema.types
+    Out[4]:
+    [ListType(list<item: int64>)]
+
+*New behavior*:
+
+.. ipython:: python
+
+    In [4]: parquet_table.schema.types
+    Out[4]:
+    [ListType(list<element: int64>)]
+
+.. _whatsnew_140.notable_bug_fixes.notable_bug_fix4:
+
+notable_bug_fix4
 ^^^^^^^^^^^^^^^^
 
 .. ---------------------------------------------------------------------------
