forked from pandas-dev/pandas
-
Notifications
You must be signed in to change notification settings - Fork 7
/
Copy pathv0.17.1.txt
executable file
·162 lines (119 loc) · 8.84 KB
/
v0.17.1.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
.. _whatsnew_0171:
v0.17.1 (November 21, 2015)
---------------------------
This is a minor bug-fix release from 0.17.0 and includes a large number of
bug fixes along several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.
Highlights include:
- Support for Conditional HTML Formatting, see :ref:`here <whatsnew_0171.style>`
- Releasing the GIL on the csv reader & other ops, see :ref:`here <whatsnew_0171.performance>`
- Regression in ``DataFrame.drop_duplicates`` from 0.16.2, causing incorrect results on integer values (:issue:`11376`)
.. contents:: What's new in v0.17.1
:local:
:backlinks: none
New features
~~~~~~~~~~~~
.. _whatsnew_0171.style:
Conditional HTML Formatting
^^^^^^^^^^^^^^^^^^^^^^^^^^^
We've added *experimental* support for conditional HTML formatting,
the visual styling of a DataFrame based on the data.
The styling is accomplished with HTML and CSS.
Acesses the styler class with :attr:`pandas.DataFrame.style`, attribute,
an instance of :class:`~pandas.core.style.Styler` with your data attached.
See the :ref:`example notebook <style>` for more.
.. _whatsnew_0171.enhancements:
Enhancements
~~~~~~~~~~~~
- ``DatetimeIndex`` now supports conversion to strings with ``astype(str)`` (:issue:`10442`)
- Support for ``compression`` (gzip/bz2) in :meth:`pandas.DataFrame.to_csv` (:issue:`7615`)
- Improve the error message in :func:`pandas.io.gbq.to_gbq` when a streaming insert fails (:issue:`11285`)
- ``pd.read_*`` functions can now also accept :class:`python:pathlib.Path`, or :class:`py:py._path.local.LocalPath`
objects for the ``filepath_or_buffer`` argument. (:issue:`11033`)
- ``DataFrame`` now uses the fields of a ``namedtuple`` as columns, if columns are not supplied (:issue:`11181`)
- Improve the error message displayed in :func:`pandas.io.gbq.to_gbq` when the DataFrame does not match the schema of the destination table (:issue:`11359`)
- Added ``axvlines_kwds`` to parallel coordinates plot (:issue:`10709`)
- Option to ``.info()`` and ``.memory_usage()`` to provide for deep introspection of memory consumption. Note that this can be expensive to compute and therefore is an optional parameter. (:issue:`11595`)
.. ipython:: python
df = DataFrame({'A' : ['foo']*1000})
df['B'] = df['A'].astype('category')
# shows the '+' as we have object dtypes
df.info()
# we have an accurate memory assessment (but can be expensive to compute this)
df.info(memory_usage='deep')
- ``Index`` now has a ``fillna`` method (:issue:`10089`)
.. ipython:: python
pd.Index([1, np.nan, 3]).fillna(2)
- ``pivot_table`` now has a ``margins_name`` argument so you can use something other than the default of 'All' (:issue:`3335`)
- Implement export of ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`11411`)
.. _whatsnew_0171.api:
API changes
~~~~~~~~~~~
- min and max reductions on ``datetime64`` and ``timedelta64`` dtyped series now
result in ``NaT`` and not ``nan`` (:issue:`11245`).
- Regression from 0.16.2 for output formatting of long floats/nan, restored in (:issue:`11302`)
- Prettyprinting sets (e.g. in DataFrame cells) now uses set literal syntax (``{x, y}``) instead of
Legacy Python syntax (``set([x, y])``) (:issue:`11215`)
- Indexing with a null key will raise a ``TypeError``, instead of a ``ValueError`` (:issue:`11356`)
- ``Series.sort_index()`` now correctly handles the ``inplace`` option (:issue:`11402`)
- ``DataFrame.itertuples()`` now returns ``namedtuple`` objects, when possible. (:issue:`11269`)
- ``Series.ptp`` will now ignore missing values by default (:issue:`11163`)
.. _whatsnew_0171.deprecations:
Deprecations
^^^^^^^^^^^^
- The ``pandas.io.ga`` module which implements ``google-analytics`` support is deprecated and will be removed in a future version (:issue:`11308`)
- Deprecate the ``engine`` keyword in ``.to_csv()``, which will be removed in a future version (:issue:`11274`)
.. _whatsnew_0171.performance:
Performance Improvements
~~~~~~~~~~~~~~~~~~~~~~~~
- Checking monotonic-ness before sorting on an index (:issue:`11080`)
- ``Series.dropna`` performance improvement when its dtype can't contain ``NaN`` (:issue:`11159`)
- Release the GIL on most datetime field operations (e.g. ``DatetimeIndex.year``, ``Series.dt.year``), normalization, and conversion to and from ``Period``, ``DatetimeIndex.to_period`` and ``PeriodIndex.to_timestamp`` (:issue:`11263`)
- Release the GIL on some rolling algos (``rolling_median``, ``rolling_mean``, ``rolling_max``, ``rolling_min``, ``rolling_var``, ``rolling_kurt``, ``rolling_skew`` (:issue:`11450`)
- Release the GIL when reading and parsing text files in ``read_csv``, ``read_table`` (:issue:`11272`)
- Improved performance of ``rolling_median`` (:issue:`11450`)
- Improved performance to ``to_excel`` (:issue:`11352`)
- Performance bug in repr of ``Categorical`` categories, which was rendering the strings before chopping them for display (:issue:`11305`)
- Improved performance of ``Series`` constructor with no data and ``DatetimeIndex`` (:issue:`11433`)
- Improved performance of ``shift``, ``cumprod``, and ``cumsum`` with groupby (:issue:`4095`)
.. _whatsnew_0171.bug_fixes:
Bug Fixes
~~~~~~~~~
- Incorrectly distributed .c file in the build on ``PyPi`` when reading a csv of floats and passing ``na_values=<a scalar>`` would show an exception (:issue:`11374`)
- Bug in ``.to_latex()`` output broken when the index has a name (:issue:`10660`)
- Bug in ``HDFStore.append`` with strings whose encoded length exceded the max unencoded length (:issue:`11234`)
- Bug in merging ``datetime64[ns, tz]`` dtypes (:issue:`11405`)
- Bug in ``HDFStore.select`` when comparing with a numpy scalar in a where clause (:issue:`11283`)
- Bug in using ``DataFrame.ix`` with a multi-index indexer (:issue:`11372`)
- Bug in ``date_range`` with ambigous endpoints (:issue:`11626`)
- Prevent adding new attributes to the accessors ``.str``, ``.dt`` and ``.cat``. Retrieving such
a value was not possible, so error out on setting it. (:issue:`10673`)
- Bug in tz-conversions with an ambiguous time and ``.dt`` accessors (:issue:`11295`)
- Bug in comparisons of Series vs list-likes (:issue:`11339`)
- Bug in ``DataFrame.replace`` with a ``datetime64[ns, tz]`` and a non-compat to_replace (:issue:`11326`, :issue:`11153`)
- Bug in list-like indexing with a mixed-integer Index (:issue:`11320`)
- Bug in ``pivot_table`` with ``margins=True`` when indexes are of ``Categorical`` dtype (:issue:`10993`)
- Bug in ``DataFrame.plot`` cannot use hex strings colors (:issue:`10299`)
- Regression in ``DataFrame.drop_duplicates`` from 0.16.2, causing incorrect results on integer values (:issue:`11376`)
- Bug in ``pd.eval`` where unary ops in a list error (:issue:`11235`)
- Bug in ``squeeze()`` with zero length arrays (:issue:`11230`, :issue:`8999`)
- Bug in ``describe()`` dropping column names for hierarchical indexes (:issue:`11517`)
- Bug in ``DataFrame.pct_change()`` not propagating ``axis`` keyword on ``.fillna`` method (:issue:`11150`)
- Bug in ``to_sql`` using unicode column names giving UnicodeEncodeError with (:issue:`11431`).
- Fix regression in setting of ``xticks`` in ``plot`` (:issue:`11529`).
- Bug in ``holiday.dates`` where observance rules could not be applied to holiday and doc enhancement (:issue:`11477`, :issue:`11533`)
- Fix plotting issues when having plain ``Axes`` instances instead of ``SubplotAxes`` (:issue:`11520`, :issue:`11556`).
- Bug in ``DataFrame.to_latex()`` produces an extra rule when ``header=False`` (:issue:`7124`)
- Bug in ``df.groupby(...).apply(func)`` when a func returns a ``Series`` containing a new datetimelike column (:issue:`11324`)
- Bug in ``pandas.json`` when file to load is big (:issue:`11344`)
- Bugs in ``to_excel`` with duplicate columns (:issue:`11007`, :issue:`10982`, :issue:`10970`)
- Fixed a bug that prevented the construction of an empty series of dtype ``datetime64[ns, tz]`` (:issue:`11245`).
- Bug in ``read_excel`` with multi-index containing integers (:issue:`11317`)
- Bug in ``to_excel`` with openpyxl 2.2+ and merging (:issue:`11408`)
- Bug in ``DataFrame.to_dict()`` produces a ``np.datetime64`` object instead of ``Timestamp`` when only datetime is present in data (:issue:`11327`)
- Bug in ``DataFrame.corr()`` raises exception when computes Kendall correlation for DataFrames with boolean and not boolean columns (:issue:`11560`)
- Bug in the link-time error caused by C ``inline`` functions on FreeBSD 10+ (with ``clang``) (:issue:`10510`)
- Bug in ``DataFrame.to_csv`` in passing through arguments for formatting ``MultiIndexes``, including ``date_format`` (:issue:`7791`)
- Bug in ``DataFrame.join()`` with ``how='right'`` producing a ``TypeError`` (:issue:`11519`)
- Bug in ``Series.quantile`` with empty list results has ``Index`` with ``object`` dtype (:issue:`11588`)
- Bug in ``pd.merge`` results in empty ``Int64Index`` rather than ``Index(dtype=object)`` when the merge result is empty (:issue:`11588`)