Skip to content

Backport PR #48220: WEB: Use mambaforge for the getting started installation instructions? #49211

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions doc/source/development/contributing_codebase.rst
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,11 @@ If you want to run checks on all recently committed files on upstream/main you c

without needing to have done ``pre-commit install`` beforehand.

.. note::

You may want to periodically run ``pre-commit gc``, to clean up repos
which are no longer used.

.. note::

If you have conflicting installations of ``virtualenv``, then you may get an
Expand Down
165 changes: 78 additions & 87 deletions doc/source/development/contributing_environment.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,63 +16,8 @@ locally before pushing your changes.
:local:


Creating an environment using Docker
--------------------------------------

Instead of manually setting up a development environment, you can use `Docker
<https://docs.docker.com/get-docker/>`_ to automatically create the environment with just several
commands. pandas provides a ``DockerFile`` in the root directory to build a Docker image
with a full pandas development environment.

**Docker Commands**

Build the Docker image::

# Build the image pandas-yourname-env
docker build --tag pandas-yourname-env .
# Or build the image by passing your GitHub username to use your own fork
docker build --build-arg gh_username=yourname --tag pandas-yourname-env .

Run Container::

# Run a container and bind your local repo to the container
docker run -it -w /home/pandas --rm -v path-to-local-pandas-repo:/home/pandas pandas-yourname-env

Then a ``pandas-dev`` virtual environment will be available with all the development dependencies.

.. code-block:: shell

root@... :/home/pandas# conda env list
# conda environments:
#
base * /opt/conda
pandas-dev /opt/conda/envs/pandas-dev

.. note::
If you bind your local repo for the first time, you have to build the C extensions afterwards.
Run the following command inside the container::

python setup.py build_ext -j 4

You need to rebuild the C extensions anytime the Cython code in ``pandas/_libs`` changes.
This most frequently occurs when changing or merging branches.

*Even easier, you can integrate Docker with the following IDEs:*

**Visual Studio Code**

You can use the DockerFile to launch a remote session with Visual Studio Code,
a popular free IDE, using the ``.devcontainer.json`` file.
See https://code.visualstudio.com/docs/remote/containers for details.

**PyCharm (Professional)**

Enable Docker support and use the Services tool window to build and manage images as well as
run and interact with containers.
See https://www.jetbrains.com/help/pycharm/docker.html for details.

Creating an environment without Docker
---------------------------------------
Option 1: creating an environment without Docker
------------------------------------------------

Installing a C compiler
~~~~~~~~~~~~~~~~~~~~~~~
Expand All @@ -82,9 +27,9 @@ operations. To install pandas from source, you need to compile these C
extensions, which means you need a C compiler. This process depends on which
platform you're using.

If you have setup your environment using ``conda``, the packages ``c-compiler``
If you have setup your environment using :ref:`mamba <contributing.mamba>`, the packages ``c-compiler``
and ``cxx-compiler`` will install a fitting compiler for your platform that is
compatible with the remaining conda packages. On Windows and macOS, you will
compatible with the remaining mamba packages. On Windows and macOS, you will
also need to install the SDKs as they have to be distributed separately.
These packages will automatically be installed by using the ``pandas``
``environment.yml`` file.
Expand Down Expand Up @@ -117,16 +62,16 @@ To setup the right paths on the commandline, call

**macOS**

To use the ``conda``-based compilers, you will need to install the
To use the :ref:`mamba <contributing.mamba>`-based compilers, you will need to install the
Developer Tools using ``xcode-select --install``. Otherwise
information about compiler installation can be found here:
https://devguide.python.org/setup/#macos

**Linux**

For Linux-based ``conda`` installations, you won't have to install any
additional components outside of the conda environment. The instructions
below are only needed if your setup isn't based on conda environments.
For Linux-based :ref:`mamba <contributing.mamba>` installations, you won't have to install any
additional components outside of the mamba environment. The instructions
below are only needed if your setup isn't based on mamba environments.

Some Linux distributions will come with a pre-installed C compiler. To find out
which compilers (and versions) are installed on your system::
Expand All @@ -153,14 +98,15 @@ compiler installation instructions.
Let us know if you have any difficulties by opening an issue or reaching out on our contributor
community :ref:`Slack <community.slack>`.

Creating a Python environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. _contributing.mamba:

Option 1a: using mamba (recommended)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now create an isolated pandas development environment:

* Install either `Anaconda <https://www.anaconda.com/products/individual>`_, `miniconda
<https://docs.conda.io/en/latest/miniconda.html>`_, or `miniforge <https://github.com/conda-forge/miniforge>`_
* Make sure your conda is up to date (``conda update conda``)
* Install `mamba <https://mamba.readthedocs.io/en/latest/installation.html>`_
* Make sure your mamba is up to date (``mamba update mamba``)
* Make sure that you have :any:`cloned the repository <contributing.forking>`
* ``cd`` to the pandas source directory

Expand All @@ -173,11 +119,8 @@ We'll now kick off a three-step process:
.. code-block:: none

# Create and activate the build environment
conda env create -f environment.yml
conda activate pandas-dev

# or with older versions of Anaconda:
source activate pandas-dev
mamba env create
mamba activate pandas-dev

# Build and install pandas
python setup.py build_ext -j 4
Expand All @@ -187,27 +130,20 @@ At this point you should be able to import pandas from your locally built versio

$ python
>>> import pandas
>>> print(pandas.__version__)
0.22.0.dev0+29.g4ad6d4d74
>>> print(pandas.__version__) # note: the exact output may differ
1.5.0.dev0+1355.ge65a30e3eb.dirty

This will create the new environment, and not touch any of your existing environments,
nor any existing Python installation.

To view your environments::

conda info -e

To return to your root environment::

conda deactivate

See the full conda docs `here <https://conda.io/projects/conda/en/latest/>`__.
mamba deactivate

Option 1b: using pip
~~~~~~~~~~~~~~~~~~~~

Creating a Python environment (pip)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you aren't using conda for your development environment, follow these instructions.
If you aren't using mamba for your development environment, follow these instructions.
You'll need to have at least the :ref:`minimum Python version <install.version>` that pandas supports.
You also need to have ``setuptools`` 51.0.0 or later to build pandas.

Expand Down Expand Up @@ -258,7 +194,7 @@ Consult the docs for setting up pyenv `here <https://github.com/pyenv/pyenv>`__.

Below is a brief overview on how to set-up a virtual environment with Powershell
under Windows. For details please refer to the
`official virtualenv user guide <https://virtualenv.pypa.io/en/latest/user_guide.html#activators>`__
`official virtualenv user guide <https://virtualenv.pypa.io/en/latest/user_guide.html#activators>`__.

Use an ENV_DIR of your choice. We'll use ~\\virtualenvs\\pandas-dev where
'~' is the folder pointed to by either $env:USERPROFILE (Powershell) or
Expand All @@ -279,3 +215,58 @@ should already exist.
# Build and install pandas
python setup.py build_ext -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517

Option 2: creating an environment using Docker
----------------------------------------------

Instead of manually setting up a development environment, you can use `Docker
<https://docs.docker.com/get-docker/>`_ to automatically create the environment with just several
commands. pandas provides a ``DockerFile`` in the root directory to build a Docker image
with a full pandas development environment.

**Docker Commands**

Build the Docker image::

# Build the image pandas-yourname-env
docker build --tag pandas-yourname-env .
# Or build the image by passing your GitHub username to use your own fork
docker build --build-arg gh_username=yourname --tag pandas-yourname-env .

Run Container::

# Run a container and bind your local repo to the container
docker run -it -w /home/pandas --rm -v path-to-local-pandas-repo:/home/pandas pandas-yourname-env

Then a ``pandas-dev`` virtual environment will be available with all the development dependencies.

.. code-block:: shell

root@... :/home/pandas# conda env list
# conda environments:
#
base * /opt/conda
pandas-dev /opt/conda/envs/pandas-dev

.. note::
If you bind your local repo for the first time, you have to build the C extensions afterwards.
Run the following command inside the container::

python setup.py build_ext -j 4

You need to rebuild the C extensions anytime the Cython code in ``pandas/_libs`` changes.
This most frequently occurs when changing or merging branches.

*Even easier, you can integrate Docker with the following IDEs:*

**Visual Studio Code**

You can use the DockerFile to launch a remote session with Visual Studio Code,
a popular free IDE, using the ``.devcontainer.json`` file.
See https://code.visualstudio.com/docs/remote/containers for details.

**PyCharm (Professional)**

Enable Docker support and use the Services tool window to build and manage images as well as
run and interact with containers.
See https://www.jetbrains.com/help/pycharm/docker.html for details.