forked from pandas-dev/pandas
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathv0.15.2.txt
159 lines (103 loc) · 6.27 KB
/
v0.15.2.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
.. _whatsnew_0152:
v0.15.2 (December ??, 2014)
---------------------------
This is a minor release from 0.15.1 and includes a small number of API changes, several new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.
- Highlights include:
- :ref:`Enhancements <whatsnew_0152.enhancements>`
- :ref:`API Changes <whatsnew_0152.api>`
- :ref:`Performance Improvements <whatsnew_0152.performance>`
- :ref:`Experimental Changes <whatsnew_0152.experimental>`
- :ref:`Bug Fixes <whatsnew_0152.bug_fixes>`
.. _whatsnew_0152.api:
API changes
~~~~~~~~~~~
- Indexing in ``MultiIndex`` beyond lex-sort depth is now supported, though
a lexically sorted index will have a better performance. (:issue:`2646`)
.. ipython:: python
df = pd.DataFrame({'jim':[0, 0, 1, 1],
'joe':['x', 'x', 'z', 'y'],
'jolie':np.random.rand(4)}).set_index(['jim', 'joe'])
df
df.index.lexsort_depth
# in prior versions this would raise a KeyError
# will now show a PerformanceWarning
df.loc[(1, 'z')]
# lexically sorting
df2 = df.sortlevel()
df2
df2.index.lexsort_depth
df2.loc[(1,'z')]
- Bug in concat of Series with ``category`` dtype which were coercing to ``object``. (:issue:`8641`)
- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`):
.. ipython:: python
s = pd.Series([False, True, False], index=[0, 0, 1])
s.any(level=0)
- ``Panel`` now supports the ``all`` and ``any`` aggregation functions. (:issue:`8302`):
.. ipython:: python
p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
p.all()
.. _whatsnew_0152.enhancements:
Enhancements
~~~~~~~~~~~~
- Added ability to export Categorical data to Stata (:issue:`8633`). See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
- Added ability to export Categorical data to to/from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
- Added support for ``utcfromtimestamp()``, ``fromtimestamp()``, and ``combine()`` on `Timestamp` class (:issue:`5351`).
- Added Google Analytics (`pandas.io.ga`) basic documentation (:issue:`8835`). See :ref:`here<remote_data.ga>`.
- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`). See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
- ``Timedelta`` arithmetic returns ``NotImplemented`` in unknown cases, allowing extensions by custom classes (:issue:`8813`).
- ``Timedelta`` now supports arithemtic with ``numpy.ndarray`` objects of the appropriate dtype (numpy 1.8 or newer only) (:issue:`8884`).
- Added ``Timedelta.to_timedelta64`` method to the public API (:issue:`8884`).
.. _whatsnew_0152.performance:
Performance
~~~~~~~~~~~
- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)
.. _whatsnew_0152.experimental:
Experimental
~~~~~~~~~~~~
.. _whatsnew_0152.bug_fixes:
Bug Fixes
~~~~~~~~~
- Bug in packaging pandas with ``py2app/cx_Freeze`` (:issue:`8602`, :issue:`8831`)
- Bug in ``groupby`` signatures that didn't include \*args or \*\*kwargs (:issue:`8733`).
- ``io.data.Options`` now raises ``RemoteDataError`` when no expiry dates are available from Yahoo and when it receives no data from Yahoo (:issue:`8761`), (:issue:`8783`).
- Unclear error message in csv parsing when passing dtype and names and the parsed data is a different data type (:issue:`8833`)
- Bug in slicing a multi-index with an empty list and at least one boolean indexer (:issue:`8781`)
- ``io.data.Options`` now raises ``RemoteDataError`` when no expiry dates are available from Yahoo (:issue:`8761`).
- ``Timedelta`` kwargs may now be numpy ints and floats (:issue:`8757`).
- Fixed several outstanding bugs for ``Timedelta`` arithmetic and comparisons (:issue:`8813`, :issue:`5963`, :issue:`5436`).
- ``sql_schema`` now generates dialect appropriate ``CREATE TABLE`` statements (:issue:`8697`)
- ``slice`` string method now takes step into account (:issue:`8754`)
- Bug in ``BlockManager`` where setting values with different type would break block integrity (:issue:`8850`)
- Fix negative step support for label-based slices (:issue:`8753`)
Old behavior:
.. code-block:: python
In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
Out[1]:
a 0
b 1
c 2
dtype: int64
In [2]: s.loc['c':'a':-1]
Out[2]:
c 2
dtype: int64
New behavior:
.. ipython:: python
s = pd.Series(np.arange(3), ['a', 'b', 'c'])
s.loc['c':'a':-1]
- Imported categorical variables from Stata files retain the ordinal information in the underlying data (:issue:`8836`).
- Defined ``.size`` attribute across ``NDFrame`` objects to provide compat with numpy >= 1.9.1; buggy with ``np.array_split`` (:issue:`8846`)
- Skip testing of histogram plots for matplotlib <= 1.2 (:issue:`8648`).
- Bug where ``get_data_google``returned object dtypes (:issue:`3995`)
- BUG: Option context applies on __enter__ (:issue:`8514`)
- Bug in `pd.infer_freq`/`DataFrame.inferred_freq` that prevented proper sub-daily frequency inference
when the index contained DST days (:issue:`8772`).
- Bug where index name was still used when plotting a series with ``use_index=False`` (:issue:`8558`).
- Bugs when trying to stack multiple columns, when some (or all)
of the level names are numbers (:issue:`8584`).
- Bug in ``MultiIndex`` where ``__contains__`` returns wrong result if index is
not lexically sorted or unique (:issue:`7724`)
- BUG CSV: fix problem with trailing whitespace in skipped rows, (:issue:`8679`), (:issue:`8661`)
- Regression in ``Timestamp`` does not parse 'Z' zone designator for UTC (:issue:`8771`)