WEB: Use mambaforge for the getting started installation instructions? #48220

Merged
merged 5 commits into from
Aug 30, 2022
5 changes: 5 additions & 0 deletions doc/source/development/contributing_codebase.rst
@@ -75,6 +75,11 @@ If you want to run checks on all recently committed files on upstream/main you c

without needing to have done ``pre-commit install`` beforehand.

.. note::

You may want to periodically run ``pre-commit gc``, to clean up repos
which are no longer used.

.. note::

If you have conflicting installations of ``virtualenv``, then you may get an
125 changes: 57 additions & 68 deletions doc/source/development/contributing_environment.rst
@@ -16,53 +16,8 @@ locally before pushing your changes.
:local:


Creating an environment using Docker
--------------------------------------

Instead of manually setting up a development environment, you can use `Docker
<https://docs.docker.com/get-docker/>`_ to automatically create the environment with just several
commands. pandas provides a ``DockerFile`` in the root directory to build a Docker image
with a full pandas development environment.

**Docker Commands**

Build the Docker image::

# Build the image pandas-yourname-env
docker build --tag pandas-yourname-env .
# Or build the image by passing your GitHub username to use your own fork
docker build --build-arg gh_username=yourname --tag pandas-yourname-env .

Run Container::

# Run a container and bind your local repo to the container
docker run -it -w /home/pandas --rm -v path-to-local-pandas-repo:/home/pandas pandas-yourname-env

.. note::
If you bind your local repo for the first time, you have to build the C extensions afterwards.
Run the following command inside the container::

python setup.py build_ext -j 4

You need to rebuild the C extensions anytime the Cython code in ``pandas/_libs`` changes.
This most frequently occurs when changing or merging branches.

*Even easier, you can integrate Docker with the following IDEs:*

**Visual Studio Code**

You can use the DockerFile to launch a remote session with Visual Studio Code,
a popular free IDE, using the ``.devcontainer.json`` file.
See https://code.visualstudio.com/docs/remote/containers for details.

**PyCharm (Professional)**

Enable Docker support and use the Services tool window to build and manage images as well as
run and interact with containers.
See https://www.jetbrains.com/help/pycharm/docker.html for details.

Creating an environment without Docker
---------------------------------------
Option 1: creating an environment without Docker
------------------------------------------------

Installing a C compiler
~~~~~~~~~~~~~~~~~~~~~~~
@@ -142,14 +97,13 @@ compiler installation instructions.

Let us know if you have any difficulties by opening an issue or reaching out on `Gitter <https://gitter.im/pydata/pandas/>`_.

Creating a Python environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Option 1a: using mamba (recommended)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now create an isolated pandas development environment:

* Install either `Anaconda <https://www.anaconda.com/products/individual>`_, `miniconda
<https://docs.conda.io/en/latest/miniconda.html>`_, or `miniforge <https://github.com/conda-forge/miniforge>`_
* Make sure your conda is up to date (``conda update conda``)
* Install `mamba <https://mamba.readthedocs.io/en/latest/index.html>`_
Member

There are some references to conda in the section above this one. Should those also be changed or maybe linked directly to this section for clarity?

Member Author

yes, good point, thanks

The mamba docs use `conda activate`, but `mamba activate` also works fine, so I've replaced conda with mamba throughout to avoid confusion.

Member

I'd personally use this link to install mamba: https://github.com/conda-forge/miniforge#mambaforge

In my opinion this is what users should use, and it's not so obvious to find. Users may end up installing Anaconda, to then install mamba... which doesn't sound ideal.

But if you have a preference for the mamba home docs, also fine with it.

Member Author

Yeah, agreed that that's what people should use. It's the first link to show up on https://mamba.readthedocs.io/en/latest/installation.html, so let's link that (which'll probably be the most up-to-date reference)?

* Make sure your mamba is up to date (``mamba update mamba``)
* Make sure that you have :any:`cloned the repository <contributing.forking>`
* ``cd`` to the pandas source directory

@@ -162,12 +116,9 @@ We'll now kick off a three-step process:
.. code-block:: none

# Create and activate the build environment
conda env create -f environment.yml
mamba env create -f environment.yml
Member

The -f environment.yml is not needed. I remember someone wanted to have it anyway at that time (not sure why), but if that's not the case anymore, I'd remove it and make things simpler.

conda activate pandas-dev

# or with older versions of Anaconda:
source activate pandas-dev

# Build and install pandas
python setup.py build_ext -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517
@@ -176,27 +127,20 @@ At this point you should be able to import pandas from your locally built versio

$ python
>>> import pandas
>>> print(pandas.__version__)
0.22.0.dev0+29.g4ad6d4d74
>>> print(pandas.__version__) # note: the exact output may differ
1.5.0.dev0+1355.ge65a30e3eb.dirty

This will create the new environment, and not touch any of your existing environments,
nor any existing Python installation.
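
A minimal Python sketch, not part of the pandas docs or of this diff, for confirming that ``import pandas`` resolves to the local source checkout rather than to a copy installed elsewhere. The helper name ``imported_from`` is hypothetical, and the check assumes an in-place (editable) build whose files live inside the source directory.

```python
import importlib.util
from pathlib import Path


def imported_from(module_name: str, source_dir: str) -> bool:
    """Return True if `module_name` would be imported from inside `source_dir`.

    `imported_from` is a hypothetical helper written for this illustration.
    """
    spec = importlib.util.find_spec(module_name)
    # No spec: the module is not importable. No location: it is a built-in
    # or frozen module, so it has no file on disk to compare against.
    if spec is None or not spec.has_location or spec.origin is None:
        return False
    origin = Path(spec.origin).resolve()
    return Path(source_dir).resolve() in origin.parents


# Usage, assuming the current directory is the pandas source checkout:
#     imported_from("pandas", ".")
```

If this returns ``False`` after the build step, the interpreter is most likely picking up a different pandas installation, or the wrong environment is active.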

To view your environments::

conda info -e

To return to your root environment::

conda deactivate

See the full conda docs `here <https://conda.io/projects/conda/en/latest/>`__.
Option 1b: using pip
~~~~~~~~~~~~~~~~~~~~


Creating a Python environment (pip)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you aren't using conda for your development environment, follow these instructions.
If you aren't using mamba for your development environment, follow these instructions.
You'll need to have at least the :ref:`minimum Python version <install.version>` that pandas supports.
You also need to have ``setuptools`` 51.0.0 or later to build pandas.

@@ -268,3 +212,48 @@ should already exist.
# Build and install pandas
python setup.py build_ext -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517

Option 2: creating an environment using Docker
----------------------------------------------

Instead of manually setting up a development environment, you can use `Docker
<https://docs.docker.com/get-docker/>`_ to automatically create the environment with just several
commands. pandas provides a ``DockerFile`` in the root directory to build a Docker image
with a full pandas development environment.

**Docker Commands**

Build the Docker image::

# Build the image pandas-yourname-env
docker build --tag pandas-yourname-env .
# Or build the image by passing your GitHub username to use your own fork
docker build --build-arg gh_username=yourname --tag pandas-yourname-env .

Run Container::

# Run a container and bind your local repo to the container
docker run -it -w /home/pandas --rm -v path-to-local-pandas-repo:/home/pandas pandas-yourname-env

.. note::
If you bind your local repo for the first time, you have to build the C extensions afterwards.
Run the following command inside the container::

python setup.py build_ext -j 4

You need to rebuild the C extensions anytime the Cython code in ``pandas/_libs`` changes.
This most frequently occurs when changing or merging branches.

*Even easier, you can integrate Docker with the following IDEs:*

**Visual Studio Code**

You can use the DockerFile to launch a remote session with Visual Studio Code,
a popular free IDE, using the ``.devcontainer.json`` file.
See https://code.visualstudio.com/docs/remote/containers for details.

**PyCharm (Professional)**

Enable Docker support and use the Services tool window to build and manage images as well as
run and interact with containers.
See https://www.jetbrains.com/help/pycharm/docker.html for details.