Commit 32d65dd

DOC: more docs loose ends
1 parent 6e418a9 commit 32d65dd

7 files changed

+114
-30
lines changed

TODO.rst

+15-16
Original file line number | Diff line number | Diff line change
@@ -15,25 +15,25 @@ TODO docs
1515
- auto-sniff delimiter
1616
- MultiIndex
1717
- generally more documentation
18-
19-
- pivot_table
20-
18+
- DONE pivot_table
2119
- DONE Set mixed-type values with .ix
22-
- get_dtype_counts / dtypes
23-
- save / load functions
24-
- combine_first
25-
- describe for Series
26-
- DataFrame.to_string
27-
- Index / MultiIndex names
28-
- Unstack / stack by level name
29-
- ignore_index in DataFrame.append
20+
- DONE get_dtype_counts / dtypes
21+
- DONE save / load functions
22+
- DONE isnull/notnull as instance methods
23+
- DONE DataFrame.to_string
24+
- DONE IPython tab complete hook
25+
- DONE ignore_index in DataFrame.append
26+
- DONE describe for Series with dtype=object
27+
- DONE as_index=False in groupby
28+
- DONOTWANT is_monotonic
29+
- DONE DataFrame.to_csv: different delimiters
3030
- Inner join on key
3131
- Multi-key joining
32-
- as_index=False in groupby
33-
- is_monotonic
34-
- isnull/notnull as instance methods
32+
- Index / MultiIndex names
33+
34+
- combine_first
35+
- Unstack / stack by level name
3536
- name attribute on Series
36-
- DataFrame.to_csv: different delimiters?
3737
- groupby with level name
3838
- MultiIndex
3939
- get_level_values
@@ -43,7 +43,6 @@ TODO docs
4343
- df[col_list]
4444
- Panel.rename_axis
4545
- & and | for intersection / union
46-
- IPython tab complete hook
4746

4847
Performance blog
4948
----------------

doc/source/basics.rst

+30-7
@@ -242,9 +242,9 @@ will exclude NAs on Series input by default:
242242
Summarizing data: describe
243243
~~~~~~~~~~~~~~~~~~~~~~~~~~
244244

245-
For floating point data, there is a convenient ``describe`` function which
246-
computes a variety of summary statistics about a Series or the columns of a
247-
DataFrame (excluding NAs of course):
245+
There is a convenient ``describe`` function which computes a variety of summary
246+
statistics about a Series or the columns of a DataFrame (excluding NAs of
247+
course):
248248

249249
.. ipython:: python
250250
@@ -255,6 +255,16 @@ DataFrame (excluding NAs of course):
255255
frame.ix[::2] = np.nan
256256
frame.describe()
257257
258+
For a non-numerical Series object, ``describe`` will give a simple summary of the
259+
number of unique values and most frequently occurring values:
260+
261+
262+
.. ipython:: python
263+
264+
s = Series(['a', 'a', 'b', 'b', 'a', 'a', np.nan, 'c', 'd', 'a'])
265+
s.describe()
266+
267+
258268
Correlations between objects
259269
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
260270

@@ -657,15 +667,28 @@ alternately passing the ``dtype`` keyword argument to the object constructor.
657667
Pickling and serialization
658668
--------------------------
659669

660-
All pandas objects are equipped with ``save`` and ``load`` methods which use
661-
Python's ``cPickle`` module to save and load data structures to disk using the
662-
pickle format.
670+
All pandas objects are equipped with ``save`` methods which use Python's
671+
``cPickle`` module to save data structures to disk using the pickle format.
663672

664673
.. ipython:: python
665674
666675
df
667676
df.save('foo.pickle')
668-
DataFrame.load('foo.pickle')
677+
678+
The ``load`` function in the ``pandas`` namespace can be used to load any
679+
pickled pandas object (or any other pickled object) from file:
680+
681+
682+
.. ipython:: python
683+
684+
load('foo.pickle')
685+
686+
There is also a ``save`` function which takes any object as its first argument:
687+
688+
.. ipython:: python
689+
690+
save(df, 'foo.pickle')
691+
load('foo.pickle')
669692
670693
.. ipython:: python
671694
:suppress:
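The ``save`` / ``load`` pair documented in the hunk above is described as a thin wrapper around the pickle format. A minimal sketch of that idea, assuming Python 3 (where ``cPickle`` is simply ``pickle``) and using a plain dict instead of a DataFrame; the two functions below are illustrative stand-ins, not the pandas source:

```python
import os
import pickle
import tempfile


def save(obj, path):
    """Pickle any object to the file at ``path``."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load(path):
    """Load whatever pickled object is stored at ``path``."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Round-trip a stand-in object through a temporary file.
record = {"A": [1.0, 2.0], "B": ["x", "y"]}
path = os.path.join(tempfile.mkdtemp(), "foo.pickle")
save(record, path)
restored = load(path)
print(restored == record)
```

Because ``load`` does not need to know the type of the stored object, a single module-level function is enough, which matches the move away from ``DataFrame.load`` in this commit. Current pandas releases expose the same round trip as ``DataFrame.to_pickle`` and ``pandas.read_pickle``.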

doc/source/dsintro.rst

+42-3
@@ -439,12 +439,51 @@ R package):
439439
baseball = read_csv('data/baseball.csv')
440440
baseball
441441
442-
However, using ``to_string`` will display any DataFrame in tabular form, though
443-
it won't always fit the console width:
442+
However, using ``to_string`` will return a string representation of the
443+
DataFrame in tabular form, though it won't always fit the console width:
444444

445445
.. ipython:: python
446446
447-
baseball.ix[-20:, :12].to_string()
447+
print baseball.ix[-20:, :12].to_string()
448+
449+
DataFrame column types
450+
~~~~~~~~~~~~~~~~~~~~~~
451+
452+
The four main types stored in pandas objects are float, int, boolean, and
453+
object. A convenient ``dtypes`` attribute returns a Series with the data type of
454+
each column:
455+
456+
.. ipython:: python
457+
458+
baseball.dtypes
459+
460+
The related method ``get_dtype_counts`` will return the number of columns of
461+
each type:
462+
463+
.. ipython:: python
464+
465+
baseball.get_dtype_counts()
466+
467+
DataFrame column attribute access and IPython completion
468+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
469+
470+
If a DataFrame column label is a valid Python variable name, the column can be
471+
accessed like an attribute:
472+
473+
.. ipython:: python
474+
475+
df = DataFrame({'foo1' : np.random.randn(5),
476+
'foo2' : np.random.randn(5)})
477+
df
478+
df.foo1
479+
480+
The columns are also connected to the `IPython <http://ipython.org>`__
481+
completion mechanism so they can be tab-completed:
482+
483+
.. code-block:: ipython
484+
485+
In [5]: df.fo<TAB>
486+
df.foo1 df.foo2
448487
449488
.. _basics.panel:
450489
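The attribute access and tab completion described in the hunk above can be illustrated with a small stand-alone class. This is a sketch under assumptions, not how pandas wires its IPython hook: ``Table`` is a hypothetical container, and it relies on the fact that completion tools commonly consult ``dir()``.

```python
class Table:
    """Hypothetical column container with attribute-style column access."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(name)

    def __dir__(self):
        # Advertise only labels that are valid Python identifiers,
        # so completers offer df.foo1 but not labels containing spaces.
        names = set(super().__dir__())
        names.update(
            label for label in self._columns
            if isinstance(label, str) and label.isidentifier()
        )
        return sorted(names)


table = Table({"foo1": [1, 2], "foo2": [3, 4], "not valid": [5]})
print(table.foo1)
print([name for name in dir(table) if name.startswith("fo")])
```

Labels that are not valid identifiers stay reachable only through item-style access in a real DataFrame, which is why the documentation restricts attribute access to valid Python variable names.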

doc/source/groupby.rst

+2
@@ -250,6 +250,8 @@ changed by using the ``as_index`` option:
250250
grouped = df.groupby(['A', 'B'], as_index=False)
251251
grouped.aggregate(np.sum)
252252
253+
df.groupby('A', as_index=False).sum()
254+
253255
Note that you could use the ``delevel`` DataFrame function to achieve the same
254256
result as the column names are stored in the resulting ``MultiIndex``:
255257
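As a hedged illustration of the equivalence mentioned above, the following sketch uses a current pandas release, where ``delevel`` is no longer available and ``reset_index`` plays that role. The frame and its column names are made up for the example.

```python
import pandas as pd

# Small illustrative frame; the data is invented for this sketch.
df = pd.DataFrame({"A": ["x", "x", "y"], "B": [1, 2, 3]})

# Group keys stay as ordinary columns instead of becoming the index.
flat = df.groupby("A", as_index=False).sum()

# Same result by grouping normally and then moving the index back to columns.
via_reset = df.groupby("A").sum().reset_index()

print(flat)
print(flat.equals(via_reset))
```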

doc/source/io.rst

+1-1
@@ -175,7 +175,7 @@ rather than reading the entire file into memory, such as the following:
175175
.. ipython:: python
176176
:suppress:
177177
178-
df[:7].to_csv('tmp.sv', delimiter='|')
178+
df[:7].to_csv('tmp.sv', sep='|')
179179
180180
.. ipython:: python
181181

doc/source/merging.rst

+23-2
@@ -14,8 +14,8 @@
1414
Merging / Joining data sets
1515
***************************
1616

17-
Appending disjoint objects
18-
--------------------------
17+
Appending DataFrame objects
18+
---------------------------
1919

2020
Series and DataFrame have an ``append`` method which will glue together objects
2121
each of whose ``index`` (Series labels or DataFrame rows) is mutually
@@ -40,6 +40,27 @@ In the case of DataFrame, the indexes must be disjoint but the columns do not ne
4040
df2
4141
df1.append(df2)
4242
43+
Appending record-array like DataFrames
44+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45+
46+
For DataFrames which don't have a meaningful index, you may wish to append them
47+
and ignore the fact that they may have overlapping indexes:
48+
49+
.. ipython:: python
50+
51+
df1 = DataFrame(randn(6, 4), columns=['A', 'B', 'C', 'D'])
52+
df2 = DataFrame(randn(3, 4), columns=['A', 'B', 'C', 'D'])
53+
54+
df1
55+
df2
56+
57+
To do this, use the ``ignore_index`` argument:
58+
59+
.. ipython:: python
60+
61+
df1.append(df2, ignore_index=True)
62+
63+
4364
Joining / merging DataFrames
4465
----------------------------
4566
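The effect of ``ignore_index`` documented in the hunk above can be checked with a short sketch. It assumes a current pandas release, where ``DataFrame.append`` has been removed and ``pandas.concat`` accepts the same ``ignore_index`` argument; the data is invented for the example.

```python
import pandas as pd

# Two frames whose default integer indexes overlap (0, 1, 2 and 0, 1).
df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df2 = pd.DataFrame({"A": [7, 8], "B": [9, 10]})

# Keeping the original labels produces duplicate index values.
kept = pd.concat([df1, df2])

# ignore_index=True discards the old labels and renumbers the rows.
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(kept.index))
print(list(renumbered.index))
```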

doc/source/reshaping.rst

+1-1
@@ -195,7 +195,7 @@ some very expressive and fast data manipulations.
195195
Pivot tables and cross-tabulations
196196
**********************************
197197

198-
The function `pandas.pivot_table` can be used to create spreadsheet-style pivot
198+
The function ``pandas.pivot_table`` can be used to create spreadsheet-style pivot
199199
tables. It takes a number of arguments
200200

201201
- ``data``: A DataFrame object
