DOC: consistent imports (GH9886) part II #10136


Merged
24 changes: 11 additions & 13 deletions doc/source/10min.rst
@@ -6,18 +6,16 @@
     :suppress:

     import numpy as np
-    import random
+    import pandas as pd
-    import os
     np.random.seed(123456)
-    from pandas import options
-    import pandas as pd
     np.set_printoptions(precision=4, suppress=True)
     import matplotlib
     try:
         matplotlib.style.use('ggplot')
     except AttributeError:
-        options.display.mpl_style = 'default'
-    options.display.max_rows=15
+        pd.options.display.mpl_style = 'default'
+    pd.options.display.max_rows = 15

#### portions of this were borrowed from the
#### Pandas cheatsheet
@@ -298,7 +296,7 @@ Using the :func:`~Series.isin` method for filtering:
 .. ipython:: python

    df2 = df.copy()
-   df2['E']=['one', 'one','two','three','four','three']
+   df2['E'] = ['one', 'one','two','three','four','three']
    df2
    df2[df2['E'].isin(['two','four'])]
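The ``isin`` filter in the hunk above can be sketched standalone; the small frame below is a hypothetical stand-in for the ``df2`` built in the example, keeping only its ``'E'`` column:

```python
import pandas as pd

# Hypothetical stand-in for the df2 of the example above
df2 = pd.DataFrame({'E': ['one', 'one', 'two', 'three', 'four', 'three']})

# isin returns a boolean mask; indexing with it keeps the matching rows
mask = df2['E'].isin(['two', 'four'])
filtered = df2[mask]
```

The surviving rows keep their original index labels (2 and 4 here), which is often what downstream alignment relies on.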

@@ -310,7 +308,7 @@ by the indexes

 .. ipython:: python

-   s1 = pd.Series([1,2,3,4,5,6],index=pd.date_range('20130102',periods=6))
+   s1 = pd.Series([1,2,3,4,5,6], index=pd.date_range('20130102', periods=6))
    s1
    df['F'] = s1

@@ -359,7 +357,7 @@ returns a copy of the data.

 .. ipython:: python

-   df1 = df.reindex(index=dates[0:4],columns=list(df.columns) + ['E'])
+   df1 = df.reindex(index=dates[0:4], columns=list(df.columns) + ['E'])
    df1.loc[dates[0]:dates[1],'E'] = 1
    df1

@@ -409,9 +407,9 @@ In addition, pandas automatically broadcasts along the specified dimension.

 .. ipython:: python

-   s = pd.Series([1,3,5,np.nan,6,8],index=dates).shift(2)
+   s = pd.Series([1,3,5,np.nan,6,8], index=dates).shift(2)
    s
-   df.sub(s,axis='index')
+   df.sub(s, axis='index')
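The broadcasting performed by ``df.sub(s, axis='index')`` can be checked on a tiny deterministic frame (the values below are illustrative, not from the document):

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [4.0, 5.0, 6.0]})
s = pd.Series([10.0, 20.0, 30.0], index=df.index)

# align s against df's index, then subtract it from every column
out = df.sub(s, axis='index')
```

Each column of ``df`` is reduced by the row's value in ``s``, so ``out.loc[0, 'A']`` is ``1 - 10 = -9``.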


Apply
@@ -431,7 +429,7 @@ See more at :ref:`Histogramming and Discretization <basics.discretization>`

 .. ipython:: python

-   s = pd.Series(np.random.randint(0,7,size=10))
+   s = pd.Series(np.random.randint(0, 7, size=10))
    s
    s.value_counts()
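Since the ``randint`` series above is random, a deterministic sketch shows what ``value_counts`` produces: a tally indexed by value, sorted most-frequent first.

```python
import pandas as pd

s = pd.Series([1, 1, 2, 1, 3])

# counts is indexed by the distinct values of s
counts = s.value_counts()
```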

@@ -516,9 +514,9 @@ See the :ref:`Grouping section <groupby>`
 .. ipython:: python

    df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
-                             'foo', 'bar', 'foo', 'foo'],
+                            'foo', 'bar', 'foo', 'foo'],
                       'B' : ['one', 'one', 'two', 'three',
-                             'two', 'two', 'one', 'three'],
+                            'two', 'two', 'one', 'three'],
                       'C' : np.random.randn(8),
                       'D' : np.random.randn(8)})
    df
110 changes: 52 additions & 58 deletions doc/source/advanced.rst
@@ -6,15 +6,10 @@
     :suppress:

     import numpy as np
-    import random
+    np.random.seed(123456)
-    from pandas import *
-    options.display.max_rows=15
+    import pandas as pd
-    randn = np.random.randn
-    randint = np.random.randint
-    np.random.seed(123456)
     np.set_printoptions(precision=4, suppress=True)
-    from pandas.compat import range, zip
+    pd.options.display.max_rows=15

******************************
MultiIndex / Advanced Indexing
@@ -80,10 +75,10 @@ demo different ways to initialize MultiIndexes.
tuples = list(zip(*arrays))
tuples

-   index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
+   index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
    index

-   s = Series(randn(8), index=index)
+   s = pd.Series(np.random.randn(8), index=index)
s

When you want every pairing of the elements in two iterables, it can be easier
@@ -92,7 +87,7 @@ to use the ``MultiIndex.from_product`` function:
.. ipython:: python

iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
-   MultiIndex.from_product(iterables, names=['first', 'second'])
+   pd.MultiIndex.from_product(iterables, names=['first', 'second'])
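``MultiIndex.from_product`` is the cartesian-product shortcut; a quick sketch (using a smaller pair of iterables than the example above) shows it agrees with building the tuples by hand:

```python
import itertools
import pandas as pd

iterables = [['bar', 'baz'], ['one', 'two']]
mi_prod = pd.MultiIndex.from_product(iterables, names=['first', 'second'])

# the same index via an explicit cartesian product of tuples
mi_tup = pd.MultiIndex.from_tuples(list(itertools.product(*iterables)),
                                   names=['first', 'second'])
```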

As a convenience, you can pass a list of arrays directly into Series or
DataFrame to construct a MultiIndex automatically:
@@ -101,9 +96,9 @@

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
-   s = Series(randn(8), index=arrays)
+   s = pd.Series(np.random.randn(8), index=arrays)
    s
-   df = DataFrame(randn(8, 4), index=arrays)
+   df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df

All of the ``MultiIndex`` constructors accept a ``names`` argument which stores
@@ -119,9 +114,9 @@ of the index is up to you:

.. ipython:: python

-   df = DataFrame(randn(3, 8), index=['A', 'B', 'C'], columns=index)
+   df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
    df
-   DataFrame(randn(6, 6), index=index[:6], columns=index[:6])
+   pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])

We've "sparsified" the higher levels of the indexes to make the console output a
bit easier on the eyes.
@@ -131,7 +126,7 @@ tuples as atomic labels on an axis:

.. ipython:: python

-   Series(randn(8), index=tuples)
+   pd.Series(np.random.randn(8), index=tuples)

The reason that the ``MultiIndex`` matters is that it can allow you to do
grouping, selection, and reshaping operations as we will describe below and in
@@ -282,16 +277,16 @@ As usual, **both sides** of the slicers are included as this is label indexing.
def mklbl(prefix,n):
return ["%s%s" % (prefix,i) for i in range(n)]

-   miindex = MultiIndex.from_product([mklbl('A',4),
-                                      mklbl('B',2),
-                                      mklbl('C',4),
-                                      mklbl('D',2)])
-   micolumns = MultiIndex.from_tuples([('a','foo'),('a','bar'),
-                                       ('b','foo'),('b','bah')],
-                                      names=['lvl0', 'lvl1'])
-   dfmi = DataFrame(np.arange(len(miindex)*len(micolumns)).reshape((len(miindex),len(micolumns))),
-                    index=miindex,
-                    columns=micolumns).sortlevel().sortlevel(axis=1)
+   miindex = pd.MultiIndex.from_product([mklbl('A',4),
+                                         mklbl('B',2),
+                                         mklbl('C',4),
+                                         mklbl('D',2)])
+   micolumns = pd.MultiIndex.from_tuples([('a','foo'),('a','bar'),
+                                          ('b','foo'),('b','bah')],
+                                         names=['lvl0', 'lvl1'])
+   dfmi = pd.DataFrame(np.arange(len(miindex)*len(micolumns)).reshape((len(miindex),len(micolumns))),
+                       index=miindex,
+                       columns=micolumns).sortlevel().sortlevel(axis=1)
dfmi

Basic multi-index slicing using slices, lists, and labels.
@@ -418,9 +413,9 @@ instance:

.. ipython:: python

-   midx = MultiIndex(levels=[['zero', 'one'], ['x','y']],
-                     labels=[[1,1,0,0],[1,0,1,0]])
-   df = DataFrame(randn(4,2), index=midx)
+   midx = pd.MultiIndex(levels=[['zero', 'one'], ['x','y']],
+                        labels=[[1,1,0,0],[1,0,1,0]])
+   df = pd.DataFrame(np.random.randn(4,2), index=midx)
df
df2 = df.mean(level=0)
df2
@@ -471,7 +466,7 @@ labels will be sorted lexicographically!
.. ipython:: python

import random; random.shuffle(tuples)
-   s = Series(randn(8), index=MultiIndex.from_tuples(tuples))
+   s = pd.Series(np.random.randn(8), index=pd.MultiIndex.from_tuples(tuples))
s
s.sortlevel(0)
s.sortlevel(1)
@@ -509,13 +504,13 @@ an exception. Here is a concrete example to illustrate this:
.. ipython:: python

tuples = [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
-   idx = MultiIndex.from_tuples(tuples)
+   idx = pd.MultiIndex.from_tuples(tuples)
idx.lexsort_depth

reordered = idx[[1, 0, 3, 2]]
reordered.lexsort_depth

-   s = Series(randn(4), index=reordered)
+   s = pd.Series(np.random.randn(4), index=reordered)
s.ix['a':'a']

However:
@@ -540,15 +535,15 @@ index positions. ``take`` will also accept negative integers as relative positions.

.. ipython:: python

-   index = Index(randint(0, 1000, 10))
+   index = pd.Index(np.random.randint(0, 1000, 10))
index

positions = [0, 9, 3]

index[positions]
index.take(positions)

-   ser = Series(randn(10))
+   ser = pd.Series(np.random.randn(10))

ser.iloc[positions]
ser.take(positions)
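The hunk context mentions that ``take`` also accepts negative integers as relative positions; a minimal deterministic sketch (values are illustrative):

```python
import pandas as pd

ser = pd.Series([10, 20, 30, 40])

# take is purely positional; negative positions count from the end,
# just as with Python lists
tail = ser.take([-1, -2])
```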
@@ -558,7 +553,7 @@

.. ipython:: python

-   frm = DataFrame(randn(5, 3))
+   frm = pd.DataFrame(np.random.randn(5, 3))

frm.take([1, 4, 3])

@@ -569,11 +564,11 @@ intended to work on boolean indices and may return unexpected results.

.. ipython:: python

-   arr = randn(10)
+   arr = np.random.randn(10)
arr.take([False, False, True, True])
arr[[0, 1]]

-   ser = Series(randn(10))
+   ser = pd.Series(np.random.randn(10))
ser.take([False, False, True, True])
ser.ix[[0, 1]]
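The warning in this hunk is that ``take`` expects integer positions, while boolean selection belongs in regular ``[]`` indexing. A small sketch of the intended usage of each (values are illustrative):

```python
import numpy as np

arr = np.arange(10, 15)  # [10, 11, 12, 13, 14]

# take: explicit integer positions
picked = arr.take([2, 4])

# boolean masks go through regular [] indexing instead,
# selecting the elements where the mask is True
mask = np.array([False, False, True, False, True])
masked = arr[mask]
```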

@@ -583,14 +578,14 @@ faster than fancy indexing.

.. ipython::

-   arr = randn(10000, 5)
+   arr = np.random.randn(10000, 5)
indexer = np.arange(10000)
random.shuffle(indexer)

timeit arr[indexer]
timeit arr.take(indexer, axis=0)

-   ser = Series(arr[:, 0])
+   ser = pd.Series(arr[:, 0])
timeit ser.ix[indexer]
timeit ser.take(indexer)

@@ -608,10 +603,9 @@ setting the index of a ``DataFrame/Series`` with a ``category`` dtype would convert

.. ipython:: python

-   df = DataFrame({'A' : np.arange(6),
-                   'B' : Series(list('aabbca')).astype('category',
-                                                       categories=list('cab'))
-                  })
+   df = pd.DataFrame({'A': np.arange(6),
+                      'B': list('aabbca')})
+   df['B'] = df['B'].astype('category', categories=list('cab'))
df
df.dtypes
df.B.cat.categories
@@ -669,18 +663,18 @@ values NOT in the categories, similarly to how you can reindex ANY pandas index.

.. code-block:: python

-   In [10]: df3 = DataFrame({'A' : np.arange(6),
-                             'B' : Series(list('aabbca')).astype('category',
-                                                                 categories=list('abc'))
-                            }).set_index('B')
+   In [9]: df3 = pd.DataFrame({'A' : np.arange(6),
+                               'B' : pd.Series(list('aabbca')).astype('category')})
+
+   In [11]: df3 = df3.set_index('B')

In [11]: df3.index
Out[11]:
CategoricalIndex([u'a', u'a', u'b', u'b', u'c', u'a'],
categories=[u'a', u'b', u'c'],
ordered=False)

-   In [12]: pd.concat([df2,df3]
+   In [12]: pd.concat([df2, df3]
TypeError: categories must match existing categories when appending
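By contrast with the ``CategoricalIndex`` append above, which raises, concatenating plain categorical ``Series`` does not error: with differing categories the result silently falls back to ``object`` dtype, while identical categories preserve the categorical dtype. A sketch with hypothetical data, not from the document:

```python
import pandas as pd

a = pd.Series(list('aab'), dtype='category')  # categories: ['a', 'b']
b = pd.Series(list('abc'), dtype='category')  # categories: ['a', 'b', 'c']

# differing categories: concat coerces to object dtype
mixed = pd.concat([a, b])

# identical categories: the categorical dtype survives
c = pd.Series(list('baa'), dtype='category')  # categories: ['a', 'b']
same = pd.concat([a, c])
```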

.. _indexing.float64index:
@@ -705,9 +699,9 @@ same.

.. ipython:: python

-   indexf = Index([1.5, 2, 3, 4.5, 5])
+   indexf = pd.Index([1.5, 2, 3, 4.5, 5])
    indexf
-   sf = Series(range(5),index=indexf)
+   sf = pd.Series(range(5), index=indexf)
sf

Scalar selection for ``[],.ix,.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
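That label-based rule can be sketched directly, reusing the same index values as ``indexf`` above:

```python
import pandas as pd

sf = pd.Series(range(5), index=[1.5, 2, 3, 4.5, 5])

# scalar selection is label-based: the integer 3 matches the float label 3.0
by_int = sf[3]
by_float = sf[3.0]
```

Both lookups return the value stored at label ``3.0`` (here ``2``), not the element at position 3.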
@@ -749,17 +743,17 @@ In non-float indexes, slicing using floats will raise a ``TypeError``

.. code-block:: python

-   In [1]: Series(range(5))[3.5]
+   In [1]: pd.Series(range(5))[3.5]
    TypeError: the label [3.5] is not a proper indexer for this index type (Int64Index)

-   In [1]: Series(range(5))[3.5:4.5]
+   In [1]: pd.Series(range(5))[3.5:4.5]
TypeError: the slice start [3.5] is not a proper indexer for this index type (Int64Index)

Using a scalar float indexer will be deprecated in a future version, but is allowed for now.

.. code-block:: python

-   In [3]: Series(range(5))[3.0]
+   In [3]: pd.Series(range(5))[3.0]
Out[3]: 3

Here is a typical use-case for using this type of indexing. Imagine that you have a somewhat
@@ -768,12 +762,12 @@ example be millisecond offsets.

.. ipython:: python

-   dfir = concat([DataFrame(randn(5,2),
-                            index=np.arange(5) * 250.0,
-                            columns=list('AB')),
-                  DataFrame(randn(6,2),
-                            index=np.arange(4,10) * 250.1,
-                            columns=list('AB'))])
+   dfir = pd.concat([pd.DataFrame(np.random.randn(5,2),
+                                  index=np.arange(5) * 250.0,
+                                  columns=list('AB')),
+                     pd.DataFrame(np.random.randn(6,2),
+                                  index=np.arange(4,10) * 250.1,
+                                  columns=list('AB'))])
dfir

Selection operations then will always work on a value basis, for all selection operators.