Skip to content

Latest commit

 

History

History
136 lines (95 loc) · 5.02 KB

pandas_setup.rst

File metadata and controls

136 lines (95 loc) · 5.02 KB

Before the sprint: Set up instructions

Make sure you bring your own laptop to the sprint.

You need the next software installed:

  • Git
  • An editor (vim, emacs, PyCharm,...). Make sure the editor is set up to use 4 spaces for tabs.

The pandas contributing guide contains detailed instructions on how to set up a pandas devlopment environemnt. This document is a short summary with some additional information specific to the sprint.

Note

Next steps will download from the Internet around 900Mb for the pandas repository, and around 600Mb for Anaconda. If you don't have access to a fast Internet connection, contact your local chapter organizer, who will try to get a copy of both in a usb key.

Instructions

1. Create a GitHub account

If you don't have a GitHub account yet, simply go to https://github.com/join, and provide your personal information (name, email...). Select the free plan.

2. Get the pandas source code

All the changes during the sprints need to be made to the latest development version of pandas. Do not make them to a version downloaded from the Internet via pip, conda or a zip.

To get the latest development version: * Fork the pandas repository on GitHub by click on the top-right Fork button

Note

For Window Users: download git for Windows <https://gitforwindows.org/> and run Git Bash in the directory where you want the copy of pandas source code with the following commends.

  • In the terminal of your computer, in the directory where you want the copy of pandas source code, run:

    git clone https://github.com/<your-github-username>/pandas

This will create a directory named pandas, containing the latest version of the source code. We will name this directory <pandas-dir> in the rest of this document.

Then, set the upstream remote, so you can fetch the updates from the pandas repository:

git remote add upstream https://github.com/pandas-dev/pandas

3. Set up a Python environment

3.a Python environment with Anaconda

Note

For Window users, go to the start menu, find Anaconda Prompt inside the Anaconda floder and run the above commends in Anaconda Prompt

  • Activate conda by one of the next (or equivalent, if you know what you're doing):
    • If you chose to prepend Anaconda to your PATH during install adding it to your ~/.bashrc, just restart your terminal.
    • Otherwise, run export PATH="<path-to-anaconda>/bin:$PATH" in your terminal. Keep in mind that it will be active exclusively in the terminal you run this command.
  • Create a conda environment:
    conda env create -n pandas_dev -f <path-to-pandas>/ci/environment-dev.yaml
  • Activate the new conda environment:
    source activate pandas_dev
  • Install pandas development dependencies:
    conda install -c defaults -c conda-forge --file=<pandas-dir>/ci/requirements-optional-conda.txt

3.b Python environment with virtualenv and pip

TODO

4. Compile C code in pandas

Besides the Python .py files, pandas source code includes C/Cython files which need to be compiled in order to run the development version of pandas.

Note

For Window Users, you'll need to install the compiler toolset:

for Python 3.6 - Install Visual Studio 2017, select the Python development workload and the Native development tools option <https://www.visualstudio.com/>

for Python 2.7 - Microsoft Visual C++ Compiler for Python 2.7 <https://www.microsoft.com/download/details.aspx?id=44266>

To compile these files simply run:
cd <pandas-dir>
python setup.py build_ext --inplace

The process will take several minutes.

(For Window users: run the above commends in Anaconda Prompt)

5. Create a branch and start coding

On the day of the sprint, you will get assigned one pandas function or method to work on. Once you know which, you need to create a git branch for your changes. This will be useful when you have finished your changes, and you want to submit a pull request, so they are included in pandas.

Note

for Window users run above comments with Git Bash at the colned pandas floder

You can create a git branch running:
git checkout -b <new_branch_name>

The branch name should be descriptive of the feature you will work on. For example, if you will work on the docstring of the method head, you can name your branch docstring_head.

If during the sprint you work in more than one docstring, you will need a branch for each.

To check in which branch are you:
git branch
To change to another branch:
git checkout <branch_name>