.. currentmodule:: pandas

.. ipython:: python
   :suppress:

   import numpy as np
   import random
   import os
   np.random.seed(123456)
   from pandas import *
   import pandas as pd
   randn = np.random.randn
   randint = np.random.randint
   np.set_printoptions(precision=4, suppress=True)

Cookbook

This is a repository of short and sweet examples and links for useful pandas recipes. We encourage users to add to this documentation.

Adding interesting links, or putting short code inline for existing links, makes a great first pull request.

Selection

Boolean Rows Indexing
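
As a minimal sketch of boolean row indexing (the frame and its column names ``A`` and ``B`` are made up for illustration and are not taken from the linked recipe), a boolean Series built from column comparisons can be passed to ``[]`` to keep only the rows where it is True:

```python
import pandas as pd

# Hypothetical example frame; the column names are illustrative only.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})

# Combine element-wise conditions with & (and) / | (or); each condition
# needs its own parentheses because of operator precedence.
mask = (df["A"] > 1) & (df["B"] < 40)

# Rows where the mask is True are kept; the original index labels survive.
selected = df[mask]
print(selected)
```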

Using loc and iloc in selections
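
The difference between the two indexers can be sketched as follows, assuming a small frame with string labels (the data is hypothetical): ``loc`` selects by label and includes both slice endpoints, while ``iloc`` selects by integer position and excludes the end.

```python
import pandas as pd

# Hypothetical frame with string index labels.
df = pd.DataFrame({"A": [1, 2, 3]}, index=["x", "y", "z"])

# Label-based: the slice "x":"y" includes both endpoints.
by_label = df.loc["x":"y", "A"]

# Position-based: the slice 0:2 excludes position 2, like plain Python slicing.
by_position = df.iloc[0:2, 0]

print(by_label.equals(by_position))
```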

Extending a panel along the minor axis

Boolean masking in a panel

Selecting via the complement
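
One way to select the complement of a condition, shown here as a sketch on made-up data, is to invert a boolean mask with ``~``:

```python
import pandas as pd

# Hypothetical example frame.
df = pd.DataFrame({"A": [1, 2, 3, 4]})

# Rows whose value is in the list ...
in_list = df["A"].isin([2, 3])

# ... and the complement: every row that is NOT in the list.
complement = df[~in_list]
print(complement)
```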

MultiIndexing

Creating a multi-index from a labeled frame

Slicing

Slicing a multi-index with xs
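
A minimal sketch of a cross-section with ``xs``, assuming a two-level index whose level names ``outer`` and ``inner`` are invented for this example:

```python
import pandas as pd

# Hypothetical two-level index; the level names are illustrative only.
index = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["outer", "inner"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# Take every row whose "inner" level equals 1.
# By default the selected level is dropped from the result.
inner_one = df.xs(1, level="inner")
print(inner_one)
```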

Slicing a multi-index with xs #2

Sorting

Multi-index sorting

Partial Selection, the need for sortedness

Levels

Prepending a level to a multiindex

Flatten Hierarchical columns
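
One possible way to flatten hierarchical columns is to join each column tuple into a single string. The sketch below uses hypothetical column names and an underscore separator, both chosen only for illustration:

```python
import pandas as pd

# Hypothetical frame with two-level columns.
columns = pd.MultiIndex.from_tuples(
    [("price", "min"), ("price", "max"), ("qty", "sum")]
)
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=columns)

# Iterating over a MultiIndex yields tuples, one per column;
# join the parts of each tuple into a single flat label.
flat = df.copy()
flat.columns = ["_".join(col) for col in df.columns]
print(flat.columns.tolist())
```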

Grouping

Basic grouping with apply

Using get_group
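
As a short sketch on made-up data, ``get_group`` pulls the rows of a single group out of a groupby object without iterating over all groups:

```python
import pandas as pd

# Hypothetical example frame.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})

grouped = df.groupby("key")

# Retrieve only the rows belonging to group "a".
group_a = grouped.get_group("a")
print(group_a)
```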

Apply to different items in a group

Expanding Apply

Replacing values with groupby means
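
The linked recipe is not reproduced here. As one illustration of the idea, and assuming that the values to be replaced are missing ones, ``transform("mean")`` returns a group mean aligned to the original rows, which can then be used as the replacement:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in group "a".
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "value": [1.0, np.nan, 3.0, 5.0]}
)

# transform returns a Series with the same index as df,
# holding each row's group mean (NaN values are skipped).
group_mean = df.groupby("key")["value"].transform("mean")

# Replace the missing values with the mean of their own group.
df["filled"] = df["value"].fillna(group_mean)
print(df)
```

The same aligned ``group_mean`` Series could be combined with ``where`` or ``mask`` to replace values matching some other condition.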

Sort by group with aggregation

Create multiple aggregated columns

Expanding Data

Alignment and to-date

Rolling Computation window based on values instead of counts

Splitting

Splitting a frame

Timeseries

Between times
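
A minimal sketch of selecting by time of day with ``between_time``, assuming an hourly series built only for this example:

```python
import pandas as pd

# Hypothetical hourly series starting at 08:00 (six points: 08:00 .. 13:00).
idx = pd.date_range("2013-01-01 08:00", periods=6, freq=pd.Timedelta(hours=1))
ts = pd.Series(range(6), index=idx)

# Keep only observations whose time of day falls between 09:00 and 11:00;
# both endpoints are included by default.
morning = ts.between_time("09:00", "11:00")
print(morning)
```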

Vectorized Lookup

Resampling

TimeGrouping of values grouped across time

TimeGrouping #2

Resampling with custom periods

Resample intraday frame without adding new days

Resample minute data
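
As a sketch of downsampling minute data (the series and the 5-minute bin width are assumptions made for illustration), ``resample`` groups the observations into fixed-width time bins that can then be aggregated:

```python
import pandas as pd

# Hypothetical series with one observation per minute.
idx = pd.date_range("2013-01-01", periods=10, freq=pd.Timedelta(minutes=1))
ts = pd.Series(range(10), index=idx)

# Group into 5-minute bins and sum the values in each bin.
five_min = ts.resample("5min").sum()
print(five_min)
```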

Merge

Emulate R rbind
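
One common way to stack frames row-wise, roughly what ``rbind`` does in R, is ``pd.concat``. The sketch below uses two made-up frames with identical columns:

```python
import pandas as pd

# Two hypothetical frames with the same columns.
df1 = pd.DataFrame({"A": [1, 2], "B": ["x", "y"]})
df2 = pd.DataFrame({"A": [3, 4], "B": ["z", "w"]})

# Stack the rows; ignore_index=True builds a fresh 0..n-1 index
# instead of repeating the original labels.
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)
```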

Self Join

How to set the index and join

Plotting

Make Matplotlib look like R

Setting x-axis major and minor labels

Data In/Out

CSV

Reading a csv chunk-by-chunk
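
A minimal sketch of chunked reading, using an in-memory string in place of a real file (the CSV content is invented): passing ``chunksize`` to ``read_csv`` yields an iterator of DataFrames, so the whole file never has to be held in memory at once.

```python
import io

import pandas as pd

# Hypothetical CSV content; a file path would work the same way.
csv_text = "a,b\n1,2\n3,4\n5,6\n7,8\n9,10\n"

total = 0
n_chunks = 0

# Each chunk is a DataFrame with at most 2 rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    total += chunk["b"].sum()
    n_chunks += 1

print(n_chunks, total)
```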

Reading the first few lines of a frame

Inferring dtypes from a file

SQL

Reading from databases with SQL

HDFStore

Simple Queries with a Timestamp Index

Managing heterogeneous data using a linked multiple table hierarchy

Merging on-disk tables with millions of rows

Large Data work flows

Troubleshoot HDFStore exceptions

Storing Attributes to a group node

.. ipython:: python

    df = DataFrame(np.random.randn(8, 3))
    store = HDFStore('test.h5')
    store.put('df', df)

    # attributes set on the storer are pickled, so an arbitrary
    # Python object can be stored alongside the data
    store.get_storer('df').attrs.my_attribute = dict(A=10)
    store.get_storer('df').attrs.my_attribute

.. ipython:: python
   :suppress:

    store.close()
    os.remove('test.h5')

Miscellaneous

Operating with timedeltas
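
As a brief sketch on made-up dates, subtracting datetimes produces a ``Timedelta``, and a ``Timedelta`` can be added back to a datetime Series element-wise:

```python
import pandas as pd

# Hypothetical datetime series.
s = pd.Series(pd.to_datetime(["2013-01-01", "2013-01-03"]))

# Subtracting two timestamps gives a Timedelta.
delta = s.iloc[1] - s.iloc[0]

# Adding a Timedelta shifts every element of the series.
shifted = s + pd.Timedelta(days=1, hours=12)

print(delta)
print(shifted)
```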