Skip to content

Latest commit

 

History

History
139 lines (96 loc) · 5.55 KB

pandas_setup.rst

File metadata and controls

139 lines (96 loc) · 5.55 KB

Before the sprint: Set up instructions

Make sure you bring your own laptop to the sprint.

You need the next software installed:

  • Git
  • An editor (vim, emacs, PyCharm,...). Make sure the editor is set up to use 4 spaces for tabs.

The pandas contributing guide contains detailed instructions on how to set up a pandas development environment. This document is a short summary with some additional information specific to the sprint.

Note

Next steps will download from the Internet around 900Mb for the pandas repository, and around 600Mb for Anaconda. If you don't have access to a fast Internet connection, contact your local chapter organizer, who will try to get a copy of both in a usb key.

Instructions

1. Create a GitHub account

If you don't have a GitHub account yet, simply go to https://github.com/join, and provide your personal information (name, email...). Select the free plan.

2. Get the pandas source code

All the changes during the sprints need to be made to the latest development version of pandas. Do not make them to a version downloaded from the Internet via pip, conda or a zip.

To get the latest development version:

Note

Windows users: run the next commands in a Git Bash session in the directory where you want to download pandas source code (download Git for Windows if needed).

  • In the terminal of your computer, in the directory where you want the copy of pandas source code, run:

    git clone https://github.com/<your-github-username>/pandas

This will create a directory named pandas, containing the latest version of the source code. We will name this directory <pandas-dir> in the rest of this document.

Make sure you're in the root of the <pandas-dir> directory.

cd <pandas-dir>

Then, set the upstream remote, so you can fetch the updates from the pandas repository:

git remote add upstream https://github.com/pandas-dev/pandas
On how to fetch the latest updates from the pandas repository, follow the steps in Syncing a Fork:
git fetch upstream
git checkout master
git merge upstream/master

3. Set up a Python environment

Note

Windows users: run the next commands in the Anaconda Prompt (found in the Anaconda folder of the Start menu).

  • Activate conda by one of the next (or equivalent, if you know what you're doing):
    • If you chose to prepend Anaconda to your PATH during install adding it to your ~/.bashrc, just restart your terminal.
    • Otherwise, run export PATH="<path-to-anaconda>/bin:$PATH" in your terminal. Keep in mind that it will be active exclusively in the terminal you run this command.
  • Create a conda environment:
    conda env create -n pandas_dev -f <path-to-pandas-dir>/ci/environment-dev.yaml

Note

Windows users: If you're copy-pasting the path, replace all pasted \ characters with / for the command to work.

  • Activate the new conda environment:
    conda activate pandas_dev
  • Install pandas development dependencies:
    conda install -c defaults -c conda-forge --file=<path-to-pandas-dir>/ci/requirements-optional-conda.txt

4. Compile C code in pandas

Besides the Python .py files, pandas source code includes C/Cython files which need to be compiled in order to run the development version of pandas.

Note

Windows users: to compile pandas, you need to install Visual Studio 2017. You need Visual Studio Community 2017 (2.5GB download during installation) as a minimum. Visual Studio Code does not support the required Build Tools and will not work. Select the workload "Python development" and the option "Python native development tools" on the right side.

(Users of legacy Python 2.7 should install Microsoft Visual C++ Compiler for Python 2.7 instead).

After the installation, run the following commands in Anaconda Prompt.

To compile these files simply run:
cd <pandas-dir>
python setup.py build_ext --inplace

The process will take several minutes.

5. Create a branch and start coding

On the day of the sprint, you will get assigned one pandas function or method to work on. Once you know which, you need to create a git branch for your changes. This will be useful when you have finished your changes, and you want to submit a pull request, so they are included in pandas.

Note

Windows users: run the next commands with Git Bash started at the cloned pandas folder.

You can create a git branch running:
git checkout -b <new_branch_name>

The branch name should be descriptive of the feature you will work on. For example, if you will work on the docstring of the method head, you can name your branch docstring_head.

If during the sprint you work in more than one docstring, you will need a branch for each.

To check in which branch are you:
git branch
To change to another branch:
git checkout <branch_name>