@@ -92,7 +92,7 @@ Accelerated operations
----------------------

Pandas has support for accelerating certain types of binary numerical and boolean operations using
- the ``numexpr`` (starting in 0.11.0) and ``bottleneck`` libraries.
+ the ``numexpr`` (starting in 0.11.0) and ``bottleneck`` libraries.

These libraries are especially useful when dealing with large data sets, and provide large
speedups. ``numexpr`` uses smart chunking, caching, and multiple cores. ``bottleneck`` is
@@ -110,7 +110,7 @@ Here is a sample (using 100 column x 100,000 row ``DataFrames``):
``df1 * df2``; 21.71; 36.63; 0.5928
``df1 + df2``; 22.04; 36.50; 0.6039

- You are highly encouraged to install both libraries. See the section
+ You are highly encouraged to install both libraries. See the section
:ref:`Recommended Dependencies <install.recommended_dependencies>` for more installation info.

.. _basics.binop:
@@ -1011,16 +1011,16 @@ dtypes
------

The main types stored in pandas objects are ``float``, ``int``, ``bool``, ``datetime64[ns]``, ``timedelta64[ns]``,
- and ``object``. In addition these dtypes have item sizes, e.g. ``int64`` and ``int32``. A convenient ``dtypes``
+ and ``object``. In addition these dtypes have item sizes, e.g. ``int64`` and ``int32``. A convenient ``dtypes``
attribute for DataFrames returns a Series with the data type of each column.

.. ipython:: python

-    dft = DataFrame(dict( A = np.random.rand(3),
-                          B = 1,
-                          C = 'foo',
-                          D = Timestamp('20010102'),
-                          E = Series([1.0]*3).astype('float32'),
+    dft = DataFrame(dict( A = np.random.rand(3),
+                          B = 1,
+                          C = 'foo',
+                          D = Timestamp('20010102'),
+                          E = Series([1.0]*3).astype('float32'),
                         F = False,
                         G = Series([1]*3, dtype = 'int8')))
   dft
@@ -1032,7 +1032,7 @@ On a ``Series`` use the ``dtype`` method.

   dft['A'].dtype

- If a pandas object contains data with multiple dtypes *IN A SINGLE COLUMN*, the dtype of the
+ If a pandas object contains data with multiple dtypes *IN A SINGLE COLUMN*, the dtype of the
column will be chosen to accommodate all of the data types (``object`` is the most
general).

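
The single-column rule above can be sketched as follows. This is a minimal illustration, assuming pandas is importable as ``pd``; the variable names are hypothetical and not part of the diff.

```python
import pandas as pd

# Illustrative data only: an int, a str and a float in one column.
mixed = pd.Series([1, "foo", 2.5])

# No numeric dtype can hold a string, so the general ``object`` dtype is chosen.
print(mixed.dtype)

# Ints and floats alone share a common numeric dtype, so there is no
# fallback to ``object``; the integer is promoted to a float instead.
numeric = pd.Series([1, 2.5])
print(numeric.dtype)
```

Only the presence of a non-numeric value forces the column to ``object``; a purely numeric mix is promoted to the wider numeric dtype.
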
@@ -1051,26 +1051,26 @@ each type in a ``DataFrame``:

   dft.get_dtype_counts()

- Numeric dtypes will propagate and can coexist in DataFrames (starting in v0.11.0).
- If a dtype is passed (either directly via the ``dtype`` keyword, a passed ``ndarray``,
- or a passed ``Series``), then it will be preserved in DataFrame operations. Furthermore,
+ Numeric dtypes will propagate and can coexist in DataFrames (starting in v0.11.0).
+ If a dtype is passed (either directly via the ``dtype`` keyword, a passed ``ndarray``,
+ or a passed ``Series``), then it will be preserved in DataFrame operations. Furthermore,
different numeric dtypes will **NOT** be combined. The following example will give you a taste.

.. ipython:: python

   df1 = DataFrame(randn(8, 1), columns = ['A'], dtype = 'float32')
   df1
   df1.dtypes
-    df2 = DataFrame(dict( A = Series(randn(8), dtype = 'float16'),
-                          B = Series(randn(8)),
+    df2 = DataFrame(dict( A = Series(randn(8), dtype = 'float16'),
+                          B = Series(randn(8)),
                         C = Series(np.array(randn(8), dtype = 'uint8')) ))
   df2
   df2.dtypes

defaults
~~~~~~~~

- By default integer types are ``int64`` and float types are ``float64``,
+ By default integer types are ``int64`` and float types are ``float64``,
*REGARDLESS* of platform (32-bit or 64-bit). The following will all result in ``int64`` dtypes.

.. ipython:: python
@@ -1090,7 +1090,7 @@ The following **WILL** result in ``int32`` on 32-bit platform.
upcasting
~~~~~~~~~

- Types can potentially be *upcasted* when combined with other types, meaning they are promoted
+ Types can potentially be *upcasted* when combined with other types, meaning they are promoted
from the current type (say ``int`` to ``float``)

.. ipython:: python
@@ -1099,8 +1099,8 @@ from the current type (say ``int`` to ``float``)
   df3
   df3.dtypes

- The ``values`` attribute on a DataFrame returns the *lower-common-denominator* of the dtypes, meaning
- the dtype that can accommodate **ALL** of the types in the resulting homogeneous dtyped numpy array. This can
+ The ``values`` attribute on a DataFrame returns the *lower-common-denominator* of the dtypes, meaning
+ the dtype that can accommodate **ALL** of the types in the resulting homogeneous dtyped numpy array. This can
force some *upcasting*.

.. ipython:: python
@@ -1116,7 +1116,7 @@ You can use the ``astype`` method to explicity convert dtypes from one to anothe
even if the dtype was unchanged (pass ``copy=False`` to change this behavior). In addition, they will raise an
exception if the astype operation is invalid.

- Upcasting is always according to the **numpy** rules. If two different dtypes are involved in an operation,
+ Upcasting is always according to the **numpy** rules. If two different dtypes are involved in an operation,
then the more *general* one will be used as the result of the operation.

.. ipython:: python
@@ -1132,7 +1132,7 @@ object conversion

``convert_objects`` is a method to try to force conversion of types from the ``object`` dtype to other types.
To force conversion of specific types that are *number like*, e.g. could be a string that represents a number,
- pass ``convert_numeric=True``. This will force strings and numbers alike to be numbers if possible, otherwise
+ pass ``convert_numeric=True``. This will force strings and numbers alike to be numbers if possible, otherwise
they will be set to ``np.nan``.

.. ipython:: python
@@ -1146,20 +1146,20 @@ they will be set to ``np.nan``.
   df3['E'] = df3['E'].astype('int32')
   df3.dtypes

- To force conversion to ``datetime64[ns]``, pass ``convert_dates='coerce'``.
+ To force conversion to ``datetime64[ns]``, pass ``convert_dates='coerce'``.
This will convert any datetimelike object to dates, forcing other values to ``NaT``.
This might be useful if you are reading in data which is mostly dates,
but occasionally has non-dates intermixed and you want to represent them as missing.

.. ipython:: python

-    s = Series([datetime(2001,1,1,0,0),
-                'foo', 1.0, 1, Timestamp('20010104'),
+    s = Series([datetime(2001,1,1,0,0),
+                'foo', 1.0, 1, Timestamp('20010104'),
               '20010105'], dtype = 'O')
   s
   s.convert_objects(convert_dates = 'coerce')

- In addition, ``convert_objects`` will attempt the *soft* conversion of any *object* dtypes, meaning that if all
+ In addition, ``convert_objects`` will attempt the *soft* conversion of any *object* dtypes, meaning that if all
the objects in a Series are of the same type, the Series will have that dtype.

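
The soft conversion described above can be sketched with ``infer_objects``. Note that this is a different method from ``convert_objects``, introduced in later pandas releases, and it is assumed here to be available in the installed version. The variable names are illustrative only.

```python
import pandas as pd

# An object-dtype Series whose values are all Python ints.
s_obj = pd.Series([1, 2, 3], dtype="object")

# Soft conversion: every value is an integer, so an integer dtype is inferred.
s_soft = s_obj.infer_objects()

# A genuinely mixed Series cannot be narrowed and stays ``object``.
s_mixed = pd.Series([1, "foo", 3.0], dtype="object").infer_objects()

print(s_obj.dtype, s_soft.dtype, s_mixed.dtype)
```

Unlike the forced conversions with ``convert_numeric=True`` or ``convert_dates='coerce'``, a soft conversion never replaces values with ``np.nan`` or ``NaT``; it only narrows the dtype when the existing values already allow it.
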
gotchas
@@ -1230,10 +1230,12 @@ Working with package options
----------------------------

.. _basics.working_with_options:
+ .. versionadded:: 0.10.1

- Introduced in 0.10.0, pandas supports a new system for working with options.
- Options have a full "dotted-style", case-insensitive name (e.g. ``display.max_rows``),
+ Pandas has an options system that lets you customize some aspects of its behaviour,
+ display-related options being those the user is most likely to adjust.

+ Options have a full "dotted-style", case-insensitive name (e.g. ``display.max_rows``).
You can get/set options directly as attributes of the top-level ``options`` attribute:

.. ipython:: python