Similar to the :ref:`parquet <io.parquet>` format, the `ORC Format <https://orc.apache.org/>`__ is a binary columnar serialization
-for data frames. It is designed to make reading data frames efficient. pandas provides *only* a reader for the
-ORC format, :func:`~pandas.read_orc`. This requires the `pyarrow <https://arrow.apache.org/docs/python/>`__ library.
+for data frames. It is designed to make reading data frames efficient. pandas provides both the reader and the writer for the
+ORC format, :func:`~pandas.read_orc` and :func:`~pandas.DataFrame.to_orc`. This requires the `pyarrow <https://arrow.apache.org/docs/python/>`__ library.
.. warning::
* It is *highly recommended* to install pyarrow using conda because of some issues caused by pyarrow.
-* :func:`~pandas.read_orc` is not supported on Windows yet, you can find valid environments on :ref:`install optional dependencies <install.warn_orc>`.
+* :func:`~pandas.read_orc` and :func:`~pandas.DataFrame.to_orc` are not supported on Windows yet, you can find valid environments on :ref:`install optional dependencies <install.warn_orc>`.
+* For supported dtypes please refer to `supported ORC features in Arrow <https://arrow.apache.org/docs/cpp/orc.html#data-types>`__.
+* Currently timezones in datetime columns are not preserved when a dataframe is converted into ORC files.
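
The reader and writer described in the added lines can be exercised together in a short round trip. The following is a minimal sketch, not part of the change itself: the helper name `orc_roundtrip` and the sample columns are hypothetical, and it assumes a pandas version that provides `DataFrame.to_orc`, a pyarrow build with ORC support, and a data frame with the default integer index and only dtypes that ORC supports.

```python
import io

import pandas as pd


def orc_roundtrip(df: pd.DataFrame) -> pd.DataFrame:
    """Serialize ``df`` to ORC in memory and read it back.

    Hypothetical helper for illustration only. Assumes pyarrow with ORC
    support is installed and that ``df`` uses the default RangeIndex.
    """
    # With no path argument, to_orc returns the ORC file content as bytes.
    orc_bytes = df.to_orc()
    # read_orc accepts a file-like object, so wrap the bytes in a buffer.
    return pd.read_orc(io.BytesIO(orc_bytes))


# Sample data limited to string, float, and integer columns.
example = pd.DataFrame(
    {
        "name": ["a", "b"],
        "value": [1.5, 2.5],
        "count": [1, 2],
    }
)
```

Calling `orc_roundtrip(example)` should return a data frame with the same columns and values. Per the warning above, a datetime column with a timezone would not be expected to come back with its timezone intact.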