.. _whatsnew_0131:

v0.13.1 (February ???)
----------------------

This is a minor release from 0.13.0 and includes a number of API changes, several new features and
enhancements along with a large number of bug fixes.

Highlights include:

Several experimental features are added, including:

There are several new or updated docs sections including:

- :ref:`Tutorials<tutorials>`, a guide to community developed pandas tutorials.
- :ref:`Pandas Ecosystem<ecosystem>`, a guide to complementary projects built on top of pandas.

.. warning::

   0.13.1 fixes a bug that was caused by a combination of having numpy < 1.8 and doing
   chained assignment on a string-like array. Please review :ref:`the docs<indexing.view_versus_copy>`;
   chained indexing can have unexpected results and should generally be avoided.

   This would previously segfault:

   .. ipython:: python

      df = DataFrame(dict(A = np.array(['foo','bar','bah','foo','bar'])))
      df['A'].iloc[0] = np.nan
      df

   The recommended way to do this type of assignment is:

   .. ipython:: python

      df = DataFrame(dict(A = np.array(['foo','bar','bah','foo','bar'])))
      df.ix[0,'A'] = np.nan
      df

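On pandas versions where ``.ix`` is not available (an assumption about the reader's environment; the warning above targets 0.13.1), the same single-step assignment can be sketched with ``.loc``. This is only an illustration of the recommended pattern: it sets the value in one indexing operation instead of a chained one.

```python
import numpy as np
import pandas as pd

# Same data as in the warning above.
df = pd.DataFrame({"A": np.array(["foo", "bar", "bah", "foo", "bar"])})

# A single indexing call selects row label 0 and column "A" and assigns to it,
# so the assignment is not applied to an intermediate result of df["A"].
df.loc[0, "A"] = np.nan

print(df["A"].isna().tolist())
```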

API changes
~~~~~~~~~~~

- Add ``-NaN`` and ``-nan`` to the default set of NA values (:issue:`5952`).
  See :ref:`NA Values <io.na_values>`.

- Added the ``NDFrame.equals()`` method to test whether two NDFrames have
  equal axes, dtypes, and values. Added the
  ``array_equivalent`` function to test whether two ndarrays are
  equal. NaNs in identical locations are treated as
  equal. (:issue:`5283`) See also :ref:`the docs<basics.equals>` for a motivating example.

  .. ipython:: python

     df = DataFrame({'col':['foo', 0, np.nan]}).sort()
     df2 = DataFrame({'col':[np.nan, 0, 'foo']}, index=[2,1,0])
     # compare against df2 as constructed, then against df2 sorted by index
     df.equals(df2)
     df.equals(df2.sort())

     import pandas.core.common as com
     com.array_equivalent(np.array([0, np.nan]), np.array([0, np.nan]))
     np.array_equal(np.array([0, np.nan]), np.array([0, np.nan]))

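Returning to the first item in the list above, the new default NA values can be checked with a small in-memory CSV. This is a minimal sketch; the column names and data are made up for illustration. With no ``na_values`` argument, ``-NaN`` and ``-nan`` are read as missing values.

```python
import io

import pandas as pd

# Hypothetical CSV text in which "-NaN" and "-nan" appear as data values.
csv_text = "key,value\na,1.5\nb,-NaN\nc,-nan\n"

# No na_values argument is passed, so only the default NA set is applied.
df = pd.read_csv(io.StringIO(csv_text))

print(df["value"].isna().tolist())
```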

Prior Version Deprecations/Changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These were announced changes in 0.13 or prior that are taking effect as of 0.13.1.

Deprecations
~~~~~~~~~~~~

Enhancements
~~~~~~~~~~~~

- ``pd.read_csv`` and ``pd.to_datetime`` learned a new ``infer_datetime_format`` keyword which greatly
  improves parsing performance in many cases. Thanks to @lexual for suggesting and @danbirken
  for rapidly implementing it. (:issue:`5490`, :issue:`6021`)

- The ``ArrayFormatter`` for ``datetime`` and ``timedelta64`` now intelligently
  limits precision based on the values in the array (:issue:`3401`)

  Previously output might look like:

  .. code-block:: python

                       age               today                diff
     0 2001-01-01 00:00:00 2013-04-19 00:00:00 4491 days, 00:00:00
     1 2004-06-01 00:00:00 2013-04-19 00:00:00 3244 days, 00:00:00

  Now the output looks like:

  .. ipython:: python

     df = DataFrame([ Timestamp('20010101'),
                      Timestamp('20040601') ], columns=['age'])
     df['today'] = Timestamp('20130419')
     df['diff'] = df['today']-df['age']
     df

- ``Panel.apply`` will work on non-ufuncs. See :ref:`the docs<basics.apply_panel>`.

  .. ipython:: python

     import pandas.util.testing as tm
     panel = tm.makePanel(5)
     panel
     panel['ItemA']

  Specifying an ``apply`` that operates on a Series (to return a single element):

  .. ipython:: python

     panel.apply(lambda x: x.dtype, axis='items')

  A similar reduction type operation:

  .. ipython:: python

     panel.apply(lambda x: x.sum(), axis='major_axis')

  This is equivalent to:

  .. ipython:: python

     panel.sum('major_axis')

  A transformation operation that returns a Panel while computing
  the z-score across the ``major_axis``:

  .. ipython:: python

     result = panel.apply(lambda x: (x-x.mean())/x.std(), axis='major_axis')
     result
     result['ItemA']

- ``Panel.apply`` operating on cross-sectional slabs. (:issue:`1148`)

  .. ipython:: python

     f = lambda x: ((x.T-x.mean(1))/x.std(1)).T

     result = panel.apply(f, axis = ['items','major_axis'])
     result
     result.loc[:,:,'ItemA']

  This is equivalent to the following:

  .. ipython:: python

     result = Panel(dict([ (ax,f(panel.loc[:,:,ax])) for ax in panel.minor_axis ]))

     result
     result.loc[:,:,'ItemA']

- Added optional ``infer_datetime_format`` to ``read_csv``, ``Series.from_csv``
  and ``DataFrame.from_csv`` (:issue:`5490`)

  If ``parse_dates`` is enabled and this flag is set, pandas will attempt to
  infer the format of the datetime strings in the columns, and if it can
  be inferred, switch to a faster method of parsing them. In some cases
  this can increase the parsing speed by ~5-10x.

  .. code-block:: python

      # Try to infer the format for the index column
      df = pd.read_csv('foo.csv', index_col=0, parse_dates=True,
                       infer_datetime_format=True)

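The idea behind ``infer_datetime_format`` can be sketched in plain Python: guess a format from the first value, then parse every value with that single format instead of re-detecting it each time. This is an illustration only, not pandas' implementation; ``CANDIDATE_FORMATS``, ``guess_format`` and ``parse_all`` are hypothetical names, and the sketch raises where pandas would instead fall back to its regular parser.

```python
from datetime import datetime

# Hypothetical shortlist of formats to try; real inference is more general.
CANDIDATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def guess_format(sample):
    """Return the first candidate format that parses ``sample``, else None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(sample, fmt)
        except ValueError:
            continue
        return fmt
    return None


def parse_all(strings):
    """Parse all strings with one format inferred from the first string."""
    if not strings:
        return []
    fmt = guess_format(strings[0])
    if fmt is None:
        # This sketch simply gives up when no candidate format matches.
        raise ValueError("could not infer a datetime format")
    return [datetime.strptime(s, fmt) for s in strings]


dates = parse_all(["2013-04-19", "2014-01-02"])
print(dates)
```

The speed-up described above comes from the second step: once a single format is known, each string is parsed directly against it.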

Experimental
~~~~~~~~~~~~

Bug Fixes
~~~~~~~~~

See :ref:`V0.13.1 Bug Fixes<release.bug_fixes-0.13.1>` for an extensive list of bugs that have been fixed in 0.13.1.

See the :ref:`full release notes
<release>` or issue tracker
on GitHub for a complete list of all API changes, Enhancements and Bug Fixes.