diff --git a/doc/source/development/contributing_environment.rst b/doc/source/development/contributing_environment.rst
index b79fe58c68e4b..942edd863a19a 100644
--- a/doc/source/development/contributing_environment.rst
+++ b/doc/source/development/contributing_environment.rst
@@ -15,24 +15,11 @@ locally before pushing your changes. It's recommended to also install the :ref:`
 .. contents:: Table of contents:
    :local:
 
+Step 1: install a C compiler
+----------------------------
 
-Option 1: creating an environment without Docker
-------------------------------------------------
-
-Installing a C compiler
-~~~~~~~~~~~~~~~~~~~~~~~
-
-pandas uses C extensions (mostly written using Cython) to speed up certain
-operations. To install pandas from source, you need to compile these C
-extensions, which means you need a C compiler. This process depends on which
-platform you're using.
-
-If you have setup your environment using :ref:`mamba `, the packages ``c-compiler``
-and ``cxx-compiler`` will install a fitting compiler for your platform that is
-compatible with the remaining mamba packages. On Windows and macOS, you will
-also need to install the SDKs as they have to be distributed separately.
-These packages will automatically be installed by using the ``pandas``
-``environment.yml`` file.
+How to do this will depend on your platform. If you choose to use ``Docker``
+in the next step, then you can skip this step.
 
 **Windows**
 
@@ -48,6 +35,9 @@ You will need `Build Tools for Visual Studio 2022
 Alternatively, you can install the necessary components on the commandline using
 `vs_BuildTools.exe `_
 
+Alternatively, you could use the `WSL `_
+and consult the ``Linux`` instructions below.
+
 **macOS**
 
 To use the :ref:`mamba `-based compilers, you will need to install the
@@ -71,38 +61,30 @@ which compilers (and versions) are installed on your system::
 
 `GCC (GNU Compiler Collection) `_, is a widely used
 compiler, which supports C and a number of other languages. If GCC is listed
-as an installed compiler nothing more is required. If no C compiler is
-installed (or you wish to install a newer version) you can install a compiler
-(GCC in the example code below) with::
+as an installed compiler nothing more is required.
 
-    # for recent Debian/Ubuntu:
-    sudo apt install build-essential
-    # for Red Had/RHEL/CentOS/Fedora
-    yum groupinstall "Development Tools"
-
-For other Linux distributions, consult your favorite search engine for
-compiler installation instructions.
+If no C compiler is installed, or you wish to upgrade, or you're using a different
+Linux distribution, consult your favorite search engine for compiler installation/update
+instructions.
 
 Let us know if you have any difficulties by opening an issue or reaching out on our contributor
 community :ref:`Slack `.
 
-.. _contributing.mamba:
-
-Option 1a: using mamba (recommended)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Step 2: create an isolated environment
+----------------------------------------
 
-Now create an isolated pandas development environment:
+Before we begin, please:
 
-* Install `mamba `_
-* Make sure your mamba is up to date (``mamba update mamba``)
 * Make sure that you have :any:`cloned the repository `
 * ``cd`` to the pandas source directory
 
-We'll now kick off a three-step process:
+.. _contributing.mamba:
+
+Option 1: using mamba (recommended)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-1. Install the build dependencies
-2. Build and install pandas
-3. Install the optional dependencies
+* Install `mamba `_
+* Make sure your mamba is up to date (``mamba update mamba``)
 
 .. code-block:: none
 
@@ -110,28 +92,9 @@ We'll now kick off a three-step process:
    mamba env create --file environment.yml
    mamba activate pandas-dev
 
-   # Build and install pandas
-   python setup.py build_ext -j 4
-   python -m pip install -e . --no-build-isolation --no-use-pep517
-
-At this point you should be able to import pandas from your locally built version::
-
-   $ python
-   >>> import pandas
-   >>> print(pandas.__version__)  # note: the exact output may differ
-   1.5.0.dev0+1355.ge65a30e3eb.dirty
-
-This will create the new environment, and not touch any of your existing environments,
-nor any existing Python installation.
-
-To return to your root environment::
-
-      mamba deactivate
-
-Option 1b: using pip
-~~~~~~~~~~~~~~~~~~~~
+Option 2: using pip
+~~~~~~~~~~~~~~~~~~~
 
-If you aren't using mamba for your development environment, follow these instructions.
 You'll need to have at least the :ref:`minimum Python version ` that pandas supports.
 You also need to have ``setuptools`` 51.0.0 or later to build pandas.
 
@@ -150,10 +113,6 @@ You also need to have ``setuptools`` 51.0.0 or later to build pandas.
    # Install the build dependencies
    python -m pip install -r requirements-dev.txt
 
-   # Build and install pandas
-   python setup.py build_ext -j 4
-   python -m pip install -e . --no-build-isolation --no-use-pep517
-
 **Unix**/**macOS with pyenv**
 
 Consult the docs for setting up pyenv `here `__.
@@ -162,7 +121,6 @@ Consult the docs for setting up pyenv `here `__.
 
    # Create a virtual environment
    # Use an ENV_DIR of your choice. We'll use ~/Users//.pyenv/versions/pandas-dev
-
    pyenv virtualenv
 
    # For instance:
@@ -174,19 +132,15 @@ Consult the docs for setting up pyenv `here `__.
    # Now install the build dependencies in the cloned pandas repo
    python -m pip install -r requirements-dev.txt
 
-   # Build and install pandas
-   python setup.py build_ext -j 4
-   python -m pip install -e . --no-build-isolation --no-use-pep517
-
 **Windows**
 
 Below is a brief overview on how to set-up a virtual environment with Powershell
 under Windows. For details please refer to the
 `official virtualenv user guide `__.
 
-Use an ENV_DIR of your choice. We'll use ~\\virtualenvs\\pandas-dev where
-'~' is the folder pointed to by either $env:USERPROFILE (Powershell) or
-%USERPROFILE% (cmd.exe) environment variable. Any parent directories
+Use an ENV_DIR of your choice. We'll use ``~\\virtualenvs\\pandas-dev`` where
+``~`` is the folder pointed to by either ``$env:USERPROFILE`` (Powershell) or
+``%USERPROFILE%`` (cmd.exe) environment variable. Any parent directories
 should already exist.
 
 .. code-block:: powershell
@@ -200,16 +154,10 @@ should already exist.
    # Install the build dependencies
    python -m pip install -r requirements-dev.txt
 
-   # Build and install pandas
-   python setup.py build_ext -j 4
-   python -m pip install -e . --no-build-isolation --no-use-pep517
-
-Option 2: creating an environment using Docker
-----------------------------------------------
+Option 3: using Docker
+~~~~~~~~~~~~~~~~~~~~~~
 
-Instead of manually setting up a development environment, you can use `Docker
-`_ to automatically create the environment with just several
-commands. pandas provides a ``DockerFile`` in the root directory to build a Docker image
+pandas provides a ``DockerFile`` in the root directory to build a Docker image
 with a full pandas development environment.
 
 **Docker Commands**
@@ -226,13 +174,6 @@ Run Container::
     # but if not alter ${PWD} to match your local repo path
     docker run -it --rm -v ${PWD}:/home/pandas pandas-dev
 
-When inside the running container you can build and install pandas the same way as the other methods
-
-.. code-block:: bash
-
-   python setup.py build_ext -j 4
-   python -m pip install -e . --no-build-isolation --no-use-pep517
-
 *Even easier, you can integrate Docker with the following IDEs:*
 
 **Visual Studio Code**
@@ -246,3 +187,26 @@ See https://code.visualstudio.com/docs/remote/containers for details.
 Enable Docker support and use the Services tool window to build and manage images as well as
 run and interact with containers.
 See https://www.jetbrains.com/help/pycharm/docker.html for details.
+
+Step 3: build and install pandas
+--------------------------------
+
+You can now run::
+
+    # Build and install pandas
+    python setup.py build_ext -j 4
+    python -m pip install -e . --no-build-isolation --no-use-pep517
+
+At this point you should be able to import pandas from your locally built version::
+
+   $ python
+   >>> import pandas
+   >>> print(pandas.__version__)  # note: the exact output may differ
+   2.0.0.dev0+880.g2b9e661fbb.dirty
+
+This will create the new environment, and not touch any of your existing environments,
+nor any existing Python installation.
+
+.. note::
+    You will need to repeat this step each time the C extensions change, for example
+    if you modified any file in ``pandas/_libs`` or if you did a fetch and merge from ``upstream/main``.