forked from pandas-dev/pandas
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathv0.18.1.txt
237 lines (125 loc) · 7.13 KB
/
v0.18.1.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
.. _whatsnew_0181:
v0.18.1 (April ??, 2016)
------------------------
This is a minor bug-fix release from 0.18.0 and includes a large number of
bug fixes along several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.
Highlights include:
.. contents:: What's new in v0.18.1
:local:
:backlinks: none
.. _whatsnew_0181.new_features:
New features
~~~~~~~~~~~~
.. _whatsnew_0181.enhancements:
Enhancements
~~~~~~~~~~~~
.. _whatsnew_0181.partial_string_indexing:
Partial string indexing on ``DateTimeIndex`` when part of a ``MultiIndex``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Partial string indexing now matches on ``DateTimeIndex`` when part of a ``MultiIndex`` (:issue:`10331`)
.. ipython:: python
dft2 = pd.DataFrame(np.random.randn(20, 1),
columns=['A'],
index=pd.MultiIndex.from_product([pd.date_range('20130101',
periods=10,
freq='12H'),
['a', 'b']]))
dft2
dft2.loc['2013-01-05']
idx = pd.IndexSlice
dft2 = dft2.swaplevel(0, 1).sort_index()
dft2.loc[idx[:, '2013-01-05'], :]
.. _whatsnew_0181.other:
Other Enhancements
^^^^^^^^^^^^^^^^^^
- ``pd.read_csv()`` now supports opening ZIP files that contains a single CSV, via extension inference or explict ``compression='zip'`` (:issue:`12175`)
- ``pd.read_csv()`` now supports opening files using xz compression, via extension inference or explicit ``compression='xz'`` is specified; ``xz`` compressions is also supported by ``DataFrame.to_csv`` in the same way (:issue:`11852`)
- ``pd.read_msgpack()`` now always gives writeable ndarrays even when compression is used (:issue:`12359`).
.. _whatsnew_0181.api:
API changes
~~~~~~~~~~~
- ``.searchsorted()`` for ``Index`` and ``TimedeltaIndex`` now accept a ``sorter`` argument to maintain compatibility with numpy's ``searchsorted`` function (:issue:`12238`)
- ``Period`` and ``PeriodIndex`` now raises ``IncompatibleFrequency`` error which inherits ``ValueError`` rather than raw ``ValueError`` (:issue:`12615`)
- ``CParserError`` is now a ``ValueError`` instead of just an ``Exception`` (:issue:`12551`)
- ``pd.show_versions()`` now includes ``pandas_datareader`` version (:issue:`12740`)
- Using ``apply`` on resampling groupby operations (e.g. ``df.groupby(pd.TimeGrouper(freq='M', key='date')).apply(...)``) now has the same output types as similar ``apply``s on other groupby operations (e.g. ``df.groupby(pd.Grouper(key='color')).apply(...)``). (:issue:`11742`).
Previous behavior:
.. code-block:: python
In [1]: df = pd.DataFrame({'date': pd.to_datetime(['10/10/2000', '11/10/2000']), 'value': [10, 13]})
In [2]: df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x.value.sum())
Out[2]:
...
TypeError: cannot concatenate a non-NDFrame object
In [3]: df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x[['value']].sum())
Out[3]:
date
2000-10-31 value 10
2000-11-30 value 13
dtype: int64
In [3]: type(df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x[['value']].sum()))
Out[3]: pandas.core.series.Series
In [4]: df.groupby(pd.Grouper(key='date')).apply(lambda x: x.value.sum())
Out[4]:
date
2000-10-10 10
2000-11-10 13
dtype: int64
In [5]: type(df.groupby(pd.Grouper(key='date')).apply(lambda x: x.value.sum()))
Out[5]: pandas.core.series.Series
In [6]: df.groupby(pd.Grouper(key='date')).apply(lambda x: x[['value']].sum())
Out[6]:
value
date
2000-10-10 10
2000-11-10 13
In [7]: type(df.groupby(pd.Grouper(key='date')).apply(lambda x: x[['value']].sum()))
Out[7]: pandas.core.frame.DataFrame
New Behavior:
.. code-block:: python
In [1]: df = pd.DataFrame({'date': pd.to_datetime(['10/10/2000', '11/10/2000']), 'value': [10, 13]})
In [2]: df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x.value.sum())
Out[2]:
date
2000-10-31 10
2000-11-30 13
Freq: M, dtype: int64
In [3]: type(df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x.value.sum()))
Out[3]: pandas.core.series.Series
In [4]: df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x[['value']].sum())
Out[4]:
value
date
2000-10-31 10
2000-11-30 13
In [5]: type(df.groupby(pd.TimeGrouper(key='date', freq='M')).apply(lambda x: x[['value']].sum()))
Out[5]: pandas.core.frame.DataFrame
.. _whatsnew_0181.deprecations:
Deprecations
^^^^^^^^^^^^
- The method name ``Index.sym_diff()`` is deprecated and can be replaced by ``Index.symmetric_difference()`` (:issue:`12591`)
.. _whatsnew_0181.performance:
Performance Improvements
~~~~~~~~~~~~~~~~~~~~~~~~
.. _whatsnew_0181.bug_fixes:
Bug Fixes
~~~~~~~~~
- ``usecols`` parameter in ``pd.read_csv`` is now respected even when the lines of a CSV file are not even (:issue:`12203`)
- Bug in ``groupby.transform(..)`` when ``axis=1`` is specified with a non-monotonic ordered index (:issue:`12713`)
- Bug in ``Period`` and ``PeriodIndex`` creation raises ``KeyError`` if ``freq="Minute"`` is specified. Note that "Minute" freq is deprecated in v0.17.0, and recommended to use ``freq="T"`` instead (:issue:`11854`)
- Bug in printing data which contains ``Period`` with different ``freq`` raises ``ValueError`` (:issue:`12615`)
- Bug in numpy compatibility of ``np.round()`` on a ``Series`` (:issue:`12600`)
- Bug in ``Series`` construction with ``Categorical`` and ``dtype='category'`` is specified (:issue:`12574`)
- Bugs in concatenation with a coercable dtype was too aggressive. (:issue:`12411`, :issue:`12045`, :issue:`11594`, :issue:`10571`)
- Bug in ``float_format`` option with option not being validated as a callable. (:issue:`12706`)
- Bug in ``Timestamp.__repr__`` that caused ``pprint`` to fail in nested structures (:issue:`12622`)
- Bug in ``value_counts`` when ``normalize=True`` and ``dropna=True`` where nulls still contributed to the normalized count (:issue:`12558`)
- Bug in ``Panel.fillna()`` ignoring ``inplace=True`` (:issue:`12633`)
- Bug in ``Series.rename``, ``DataFrame.rename`` and ``DataFrame.rename_axis`` not treating ``Series`` as mappings to relabel (:issue:`12623`).
- Clean in ``.rolling.min`` and ``.rolling.max`` to enhance dtype handling (:issue:`12373`)
- Bug in ``CategoricalIndex.get_loc`` returns different result from regular ``Index`` (:issue:`12531`)
- Bug in ``SparseSeries.shape`` ignores ``fill_value`` (:issue:`10452`)
- Bug in ``concat`` raises ``AttributeError`` when input data contains tz-aware datetime and timedelta (:issue:`12620`)
- Bug in ``pivot_table`` when ``margins=True`` and ``dropna=True`` where nulls still contributed to margin count (:issue:`12577`)
- Bug in ``Series.name`` when ``name`` attribute can be a hashable type (:issue:`12610`)
- Bug in ``.describe()`` resets categorical columns information (:issue:`11558`)