.. _whatsnew_0110:
v0.11.0 (March ??, 2013)
------------------------
This is a major release from 0.10.1 and includes many new features and
enhancements along with a large number of bug fixes. There are also a number of
important API changes that long-time pandas users should pay close attention
to.
API changes
~~~~~~~~~~~
Numeric dtypes will propagate and can coexist in DataFrames. If a dtype is passed (either directly via the ``dtype`` keyword, a passed ``ndarray``, or a passed ``Series``), then it will be preserved in DataFrame operations. Furthermore, different numeric dtypes will **NOT** be combined. The following example will give you a taste.
Dtype Specification
~~~~~~~~~~~~~~~~~~~
.. ipython:: python

   df1 = DataFrame(randn(8, 1), columns = ['A'], dtype = 'float32')
   df1
   df1.dtypes
   df2 = DataFrame(dict( A = Series(randn(8),dtype='float16'), B = Series(randn(8)), C = Series(randn(8),dtype='uint8') ))
   df2
   df2.dtypes
   # here you get some upcasting
   df3 = df1.reindex_like(df2).fillna(value=0.0) + df2
   df3
   df3.dtypes
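For readers trying this outside the documentation build, here is a minimal self-contained sketch of the same idea. It assumes explicit ``numpy`` and ``pandas`` imports (the examples above rely on names already being in scope), and the column values are purely illustrative.

```python
import numpy as np
import pandas as pd

# Build a frame from Series that each carry their own dtype.
frame = pd.DataFrame({
    "A": pd.Series([0.5, 1.5, 2.5], dtype="float32"),
    "B": pd.Series([1.0, 2.0, 3.0]),            # no dtype given: float64
    "C": pd.Series([1, 2, 3], dtype="uint8"),
})

# Each column keeps the dtype it was given; they are not merged into one.
dtype_names = {col: str(dt) for col, dt in frame.dtypes.items()}
print(dtype_names)

# Arithmetic between like-typed columns keeps the per-column dtypes too.
doubled = frame + frame
print(doubled.dtypes)
```

The point of the sketch is only that dtypes are tracked per column; it does not try to reproduce the upcasting case shown with ``df3``.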
Dtype Conversion
~~~~~~~~~~~~~~~~
.. ipython:: python

   # this is lowest-common-denominator upcasting (meaning you get the dtype which can accommodate all of the types)
   df3.values.dtype
   # conversion of dtypes
   df3.astype('float32').dtypes
   # mixed type conversions
   df3['D'] = '1.'
   df3['E'] = '1'
   df3.convert_objects(convert_numeric=True).dtypes
   # same, but specific dtype conversion
   df3['D'] = df3['D'].astype('float16')
   df3['E'] = df3['E'].astype('int32')
   df3.dtypes
   # forcing date coercion
   s = Series([datetime(2001,1,1,0,0), 'foo', 1.0, 1,
               Timestamp('20010104'), '20010105'],dtype='O')
   s.convert_objects(convert_dates='coerce')
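The lowest-common-denominator idea mentioned in the first comment above can be illustrated with NumPy's own promotion rules. This is only a sketch of the concept using ``np.result_type``; whether pandas calls this particular function internally is not stated in these notes and is not assumed here.

```python
import numpy as np

# np.result_type returns the smallest dtype that can hold all inputs.
mixed_floats = np.result_type(np.float32, np.float64)
float_and_small_int = np.result_type(np.float32, np.uint8)
float_and_wide_int = np.result_type(np.float32, np.int64)

print(mixed_floats, float_and_small_int, float_and_wide_int)
```

Note that combining ``float32`` with a wide integer type promotes past ``float32``, because ``float32`` cannot represent every ``int64`` value exactly.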
Dtype Gotchas
~~~~~~~~~~~~~
**Platform Gotchas**
Starting in 0.11.0, construction of DataFrame/Series will use default dtypes of ``int64`` and ``float64``,
*regardless of platform*. This is not an apparent change from earlier versions of pandas. If you specify
dtypes, they *WILL* be respected, however (GH2837_).
The following will all result in ``int64`` dtypes:
.. ipython:: python

   DataFrame([1,2],columns=['a']).dtypes
   DataFrame({'a' : [1,2] }).dtypes
   DataFrame({'a' : 1 }, index=range(2)).dtypes
Keep in mind that ``DataFrame(np.array([1,2]))`` **WILL** result in ``int32`` on 32-bit platforms!
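A short sketch of the distinction drawn above, assuming explicit ``numpy`` and ``pandas`` imports: plain Python integers get the default integer dtype, while an array or keyword that already names a dtype is respected. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Python ints with no dtype given: the default integer dtype is used.
from_list = pd.DataFrame([1, 2], columns=["a"])

# An ndarray that already carries a dtype: that dtype is kept.
from_int32_array = pd.DataFrame(np.array([1, 2], dtype="int32"), columns=["a"])

# A dtype passed via the keyword is respected as well.
from_keyword = pd.DataFrame([1, 2], columns=["a"], dtype="int16")

print(from_list.dtypes, from_int32_array.dtypes, from_keyword.dtypes, sep="\n")
```

If platform-independent results matter, stating the dtype on the array (as in the second case) avoids relying on NumPy's platform default.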
**Upcasting Gotchas**
Performing indexing operations on integer type data can easily upcast the data.
The dtype of the input data will be preserved in cases where ``nans`` are not introduced (coming soon).
.. ipython:: python

   dfi = df3.astype('int32')
   dfi['D'] = dfi['D'].astype('int64')
   dfi
   dfi.dtypes
   casted = dfi[dfi>0]
   casted
   casted.dtypes
Float dtypes, by contrast, are unchanged.
.. ipython:: python

   df4 = df3.copy()
   df4['A'] = df4['A'].astype('float32')
   df4.dtypes
   casted = df4[df4>0]
   casted
   casted.dtypes
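The two cases above can be compared side by side in a small standalone sketch. It assumes explicit ``numpy`` and ``pandas`` imports and uses made-up values; the comments state what is expected to happen, not captured output.

```python
import numpy as np
import pandas as pd

ints = pd.DataFrame({"a": np.array([-1, 2, 3], dtype="int32")})
floats = pd.DataFrame({"a": np.array([-1.0, 2.0, 3.0], dtype="float32")})

# Boolean-frame indexing replaces the non-matching entries with NaN.
ints_masked = ints[ints > 0]
floats_masked = floats[floats > 0]

# int32 cannot hold NaN, so the integer column is expected to become a float.
print(ints_masked.dtypes)
# float32 can hold NaN, so this column is expected to stay float32.
print(floats_masked.dtypes)
```

The upcast happens because an integer dtype has no representation for a missing value, so the result has to move to a float dtype once ``NaN`` appears.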
Datetimes Conversion
~~~~~~~~~~~~~~~~~~~~
``datetime64[ns]`` columns in a DataFrame (or a Series) allow the use of ``np.nan`` to indicate a nan value,
in addition to the traditional ``NaT``, or not-a-time. This allows convenient nan setting in a generic way.
Furthermore, ``datetime64[ns]`` columns are created by default when datetime-like objects are passed (*this change was introduced in 0.10.1*)
(GH2809_, GH2810_).
.. ipython:: python

   df = DataFrame(randn(6,2),date_range('20010102',periods=6),columns=['A','B'])
   df['timestamp'] = Timestamp('20010103')
   df
   # datetime64[ns] out of the box
   df.get_dtype_counts()
   # use the traditional nan, which is mapped to NaT internally
   df.ix[2:4,['A','timestamp']] = np.nan
   df
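The mapping of ``np.nan`` to ``NaT`` can also be seen on a bare Series. The following is a minimal sketch with explicit imports and positional assignment via ``iloc`` (both are choices made for this illustration, not taken from the example above).

```python
import numpy as np
import pandas as pd

stamps = pd.Series(pd.date_range("20010102", periods=3))
original_dtype = stamps.dtype

# Assign the generic float NaN into a datetime column.
stamps.iloc[1] = np.nan

print(stamps)
# The column is expected to remain a datetime column, with the
# assigned slot reported as missing.
print(stamps.dtype == original_dtype, pd.isna(stamps.iloc[1]))
```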
Astype conversion on ``datetime64[ns]`` to ``object`` implicitly converts ``NaT`` to ``np.nan``.
.. ipython:: python

   import datetime
   s = Series([datetime.datetime(2001, 1, 2, 0, 0) for i in range(3)])
   s.dtype
   s[1] = np.nan
   s
   s.dtype
   s = s.astype('O')
   s
   s.dtype
New features
~~~~~~~~~~~~
**Enhancements**
- In ``HDFStore``, provide dotted attribute access to ``get`` from stores
  (e.g. ``store.df == store['df']``)
- ``squeeze`` to remove length-1 dimensions from an object, where possible.
  .. ipython:: python

     p = Panel(randn(3,4,4),items=['ItemA','ItemB','ItemC'],
               major_axis=date_range('20010102',periods=4),
               minor_axis=['A','B','C','D'])
     p
     p.reindex(items=['ItemA']).squeeze()
     p.reindex(items=['ItemA'],minor=['B']).squeeze()
- In ``pd.io.data.Options``,

  + Fix bug when trying to fetch data for the current month when already
    past expiry.
  + Now using lxml to scrape html instead of BeautifulSoup (lxml was faster).
  + New instance variables for calls and puts are automatically created
    when a method that creates them is called. This works for the current
    month, where the instance variables are simply ``calls`` and ``puts``.
    It also works for future expiry months, saving the instance variable as
    ``callsMMYY`` or ``putsMMYY``, where ``MMYY`` are, respectively, the
    month and year of the option's expiry.
  + ``Options.get_near_stock_price`` now allows the user to specify the
    month for which to get relevant options data.
  + ``Options.get_forward_data`` now has optional kwargs ``near`` and
    ``above_below``. This allows the user to specify if they would like to
    only return forward looking data for options near the current stock
    price. This just obtains the data from ``Options.get_near_stock_price``
    instead of ``Options.get_xxx_data()``.
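As an aside on the ``HDFStore`` enhancement listed above, dotted attribute access of this kind is commonly built on ``__getattr__``. The sketch below uses a hypothetical ``TinyStore`` class backed by a plain dict; it is not the pandas implementation and only illustrates how ``store.df`` can be made equivalent to ``store['df']``.

```python
class TinyStore:
    """Hypothetical stand-in for a keyed store; not the pandas HDFStore."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items[key]

    def __getitem__(self, key):
        return self.get(key)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so real
        # attributes and methods are unaffected.
        if name.startswith("_"):
            # Avoid recursing on private names such as ``_items``.
            raise AttributeError(name)
        try:
            return self.get(name)
        except KeyError:
            raise AttributeError(name) from None


store = TinyStore()
store["df"] = [1, 2, 3]
print(store.df is store["df"])
```

Raising ``AttributeError`` for unknown keys keeps helpers such as ``hasattr`` behaving as usual.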
**Bug Fixes**
See the `full release notes
<https://github.com/pydata/pandas/blob/master/RELEASE.rst>`__ or issue tracker
on GitHub for a complete list.
.. _GH2809: https://github.com/pydata/pandas/issues/2809
.. _GH2810: https://github.com/pydata/pandas/issues/2810
.. _GH2837: https://github.com/pydata/pandas/issues/2837