Commit 69097c7

devin-petersohn authored and Kevin D Smith committed
DOC: Replace pandas on Ray in ecosystem.rst with Modin (pandas-dev#37249)
1 parent 881ccdc commit 69097c7

1 file changed: +17 -10 lines changed

doc/source/ecosystem.rst

@@ -376,6 +376,23 @@ Dask-ML enables parallel and distributed machine learning using Dask alongside e
 
 Koalas provides a familiar pandas DataFrame interface on top of Apache Spark. It enables users to leverage multi-cores on one machine or a cluster of machines to speed up or scale their DataFrame code.
 
+`Modin <https://github.com/modin-project/modin>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``modin.pandas`` DataFrame is a parallel and distributed drop-in replacement
+for pandas. This means that you can use Modin with existing pandas code or write
+new code with the existing pandas API. Modin can leverage your entire machine or
+cluster to speed up and scale your pandas workloads, including traditionally
+time-consuming tasks like ingesting data (``read_csv``, ``read_excel``,
+``read_parquet``, etc.).
+
+.. code:: python
+
+    # import pandas as pd
+    import modin.pandas as pd
+
+    df = pd.read_csv("big.csv")  # use all your cores!
+
 `Odo <http://odo.pydata.org>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -400,16 +417,6 @@ If also displays progress bars.
     # df.apply(func)
     df.parallel_apply(func)
 
-`Ray <https://ray.readthedocs.io/en/latest/pandas_on_ray.html>`__
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-pandas on Ray is an early stage DataFrame library that wraps pandas and transparently distributes the data and computation. The user does not need to know how many cores their system has, nor do they need to specify how to distribute the data. In fact, users can continue using their previous pandas notebooks while experiencing a considerable speedup from pandas on Ray, even on a single machine. Only a modification of the import statement is needed, as we demonstrate below. Once you’ve changed your import statement, you’re ready to use pandas on Ray just like you would pandas.
-
-.. code:: python
-
-    # import pandas as pd
-    import ray.dataframe as pd
-
 
 `Vaex <https://docs.vaex.io/>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
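
The Modin entry in this diff describes a drop-in import swap. As a rough sketch of how calling code might make that swap optional, the snippet below picks ``modin.pandas`` when Modin is installed and falls back to plain pandas otherwise. The helper names ``pick_dataframe_backend`` and ``load_dataframe_module`` are hypothetical and are not part of pandas, Modin, or this commit.

```python
import importlib
import importlib.util


def pick_dataframe_backend() -> str:
    """Return "modin.pandas" if the modin package is installed, else "pandas".

    Hypothetical helper: it only checks whether the package can be found,
    not whether a Modin execution engine is configured.
    """
    if importlib.util.find_spec("modin") is not None:
        return "modin.pandas"
    return "pandas"


def load_dataframe_module():
    """Import the selected backend; callers can bind the result to ``pd``."""
    return importlib.import_module(pick_dataframe_backend())
```

A caller would then write ``pd = load_dataframe_module()`` and keep using the pandas API, for example ``pd.read_csv("big.csv")`` with the placeholder file name from the diff.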
