Skip to content

ENH: tilde expansion for write functions, #11438 #11458

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
147 changes: 111 additions & 36 deletions doc/source/whatsnew/v0.17.1.txt
Original file line number Diff line number Diff line change
@@ -1,52 +1,124 @@
.. _whatsnew_0171:

v0.17.1 (November ??, 2015)
v0.17.1 (November 21, 2015)
---------------------------

This is a minor bug-fix release from 0.17.0 and includes a a large number of
This is a minor bug-fix release from 0.17.0 and includes a large number of
bug fixes along several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.

Highlights include:

- Support for Conditional HTML Formatting, see :ref:`here <whatsnew_0171.style>`
- Releasing the GIL on the csv reader & other ops, see :ref:`here <whatsnew_0171.performance>`
- Regression in ``DataFrame.drop_duplicates`` from 0.16.2, causing incorrect results on integer values (:issue:`11376`)

.. contents:: What's new in v0.17.1
:local:
:backlinks: none

New features
~~~~~~~~~~~~

.. _whatsnew_0171.style:

Conditional HTML Formatting
^^^^^^^^^^^^^^^^^^^^^^^^^^^

We've added *experimental* support for conditional HTML formatting:
the visual styling of a DataFrame based on the data.
The styling is accomplished with HTML and CSS.
Acesses the styler class with :attr:`pandas.DataFrame.style`, attribute,
an instance of :class:`~pandas.core.style.Styler` with your data attached.

Here's a quick example:

.. ipython:: python

np.random.seed(123)
df = DataFrame(np.random.randn(10, 5), columns=list('abcde'))
html = df.style.background_gradient(cmap='viridis', low=.5)

We can render the HTML to get the following table.

.. raw:: html
:file: whatsnew_0171_html_table.html

See the :ref:`example notebook <style>` for more.

.. _whatsnew_0171.enhancements:

Enhancements
~~~~~~~~~~~~
- ``DatetimeIndex`` now supports conversion to strings with astype(str) (:issue:`10442`)
- ``DatetimeIndex`` now supports conversion to strings with ``astype(str)`` (:issue:`10442`)

- Support for ``compression`` (gzip/bz2) in :meth:`pandas.DataFrame.to_csv` (:issue:`7615`)
- Improve the error message in :func:`pandas.io.gbq.to_gbq` when a streaming insert fails (:issue:`11285`)
- ``pd.read_*`` functions can now also accept :class:`python:pathlib.Path`, or :class:`py:py._path.local.LocalPath`
objects for the ``filepath_or_buffer`` argument. (:issue:`11033`)
- ``DataFrame`` now uses the fields of a ``namedtuple`` as columns, if columns are not supplied (:issue:`11181`)
- Improve the error message displayed in :func:`pandas.io.gbq.to_gbq` when the DataFrame does not match the schema of the destination table (:issue:`11359`)
- Added ``axvlines_kwds`` to parallel coordinates plot (:issue:`10709`)

- Option to ``.info()`` and ``.memory_usage()`` to provide for deep introspection of memory consumption. Note that this can be expensive to compute and therefore is an optional parameter. (:issue:`11595`)

.. ipython:: python

df = DataFrame({'A' : ['foo']*1000})
df['B'] = df['A'].astype('category')

# shows the '+' as we have object dtypes
df.info()

# we have an accurate memory assessment (but can be expensive to compute this)
df.info(memory_usage='deep')

- ``Index`` now has a ``fillna`` method (:issue:`10089`)

.. ipython:: python

pd.Index([1, np.nan, 3]).fillna(2)

- Series of type ``"category"`` now make ``.str.<...>`` and ``.dt.<...>`` accessor methods / properties available, if the categories are of that type. (:issue:`10661`)

.. ipython:: python

s = pd.Series(list('aabb')).astype('category')
s
s.str.contains("a")

date = pd.Series(pd.date_range('1/1/2015', periods=5)).astype('category')
date
date.dt.day

- ``pivot_table`` now has a ``margins_name`` argument so you can use something other than the default of 'All' (:issue:`3335`)
- Implement export of ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`11411`)
- The ``DataFrame`` and ``Series`` functions ``.to_csv()``, ``.to_html()`` and ``.to_latex()`` can now handle paths beginning with tildes (e.g. ``~/Documents/``)

.. _whatsnew_0171.api:

API changes
~~~~~~~~~~~

- raise ``NotImplementedError`` in ``Index.shift`` for non-supported index types (:issue:`8083`)
- min and max reductions on ``datetime64`` and ``timedelta64`` dtyped series now
result in ``NaT`` and not ``nan`` (:issue:`11245`).
- Regression from 0.16.2 for output formatting of long floats/nan, restored in (:issue:`11302`)
- Prettyprinting sets (e.g. in DataFrame cells) now uses set literal syntax (``{x, y}``) instead of
Legacy Python syntax (``set([x, y])``) (:issue:`11215`)
- Indexing with a null key will raise a ``TypeError``, instead of a ``ValueError`` (:issue:`11356`)
- ``Series.sort_index()`` now correctly handles the ``inplace`` option (:issue:`11402`)
- ``DataFrame.itertuples()`` now returns ``namedtuple`` objects, when possible. (:issue:`11269`)
- ``SparseArray.__iter__()`` now does not cause ``PendingDeprecationWarning`` in Python 3.5 (:issue:`11622`)

- ``DataFrame.itertuples()`` now returns ``namedtuple`` objects, when possible. (:issue:`11269`, :issue:`11625`)
- ``Series.ptp`` will now ignore missing values by default (:issue:`11163`)

.. _whatsnew_0171.deprecations:

Deprecations
^^^^^^^^^^^^

- The ``pandas.io.ga`` module which implements ``google-analytics`` support is deprecated and will be removed in a future version (:issue:`11308`)
- Deprecate the ``engine`` keyword from ``.to_csv()``, which will be removed in a future version (:issue:`11274`)
- Deprecate the ``engine`` keyword in ``.to_csv()``, which will be removed in a future version (:issue:`11274`)


.. _whatsnew_0171.performance:
Expand All @@ -56,64 +128,67 @@ Performance Improvements

- Checking monotonic-ness before sorting on an index (:issue:`11080`)
- ``Series.dropna`` performance improvement when its dtype can't contain ``NaN`` (:issue:`11159`)


- Release the GIL on most datetime field operations (e.g. ``DatetimeIndex.year``, ``Series.dt.year``), normalization, and conversion to and from ``Period``, ``DatetimeIndex.to_period`` and ``PeriodIndex.to_timestamp`` (:issue:`11263`)


- Release the GIL on some rolling algos (``rolling_median``, ``rolling_mean``, ``rolling_max``, ``rolling_min``, ``rolling_var``, ``rolling_kurt``, ``rolling_skew`` (:issue:`11450`)
- Release the GIL when reading and parsing text files in ``read_csv``, ``read_table`` (:issue:`11272`)
- Improved performance of ``rolling_median`` (:issue:`11450`)
- Improved performance to ``to_excel`` (:issue:`11352`)
- Performance bug in repr of ``Categorical`` categories, which was rendering the strings before chopping them for display (:issue:`11305`)
- Performance improvement in ``Categorical.remove_unused_categories``, (:issue:`11643`).
- Improved performance of ``Series`` constructor with no data and ``DatetimeIndex`` (:issue:`11433`)

- Improved performance of ``shift``, ``cumprod``, and ``cumsum`` with groupby (:issue:`4095`)

.. _whatsnew_0171.bug_fixes:


Bug Fixes
~~~~~~~~~

- Bug in ``.to_latex()`` output broken when the index has a name (:issue: `10660`)
- Incorrectly distributed .c file in the build on ``PyPi`` when reading a csv of floats and passing ``na_values=<a scalar>`` would show an exception (:issue:`11374`)
- Bug in ``.to_latex()`` output broken when the index has a name (:issue:`10660`)
- Bug in ``HDFStore.append`` with strings whose encoded length exceded the max unencoded length (:issue:`11234`)
- Bug in merging ``datetime64[ns, tz]`` dtypes (:issue:`11405`)
- Bug in ``HDFStore.select`` when comparing with a numpy scalar in a where clause (:issue:`11283`)
- Bug in using ``DataFrame.ix`` with a multi-index indexer(:issue:`11372`)


- Bug in using ``DataFrame.ix`` with a multi-index indexer (:issue:`11372`)
- Bug in ``date_range`` with ambigous endpoints (:issue:`11626`)
- Prevent adding new attributes to the accessors ``.str``, ``.dt`` and ``.cat``. Retrieving such
a value was not possible, so error out on setting it. (:issue:`10673`)
- Bug in tz-conversions with an ambiguous time and ``.dt`` accessors (:issue:`11295`)
- Bug in output formatting when using an index of ambiguous times (:issue:`11619`)
- Bug in comparisons of Series vs list-likes (:issue:`11339`)


- Bug in ``DataFrame.replace`` with a ``datetime64[ns, tz]`` and a non-compat to_replace (:issue:`11326`, :issue:`11153`)

- Bug in list-like indexing with a mixed-integer Index (:issue:`11320`)

- Bug in ``pivot_table`` with ``margins=True`` when indexes are of ``Categorical`` dtype (:issue:`10993`)
- Bug in ``DataFrame.plot`` cannot use hex strings colors (:issue:`10299`)

- Bug in ``DataFrame.drop_duplicates`` (regression from 0.16.2) causing some non-duplicate rows containing integer values to be dropped (:issue:`11376`)


- Regression in ``DataFrame.drop_duplicates`` from 0.16.2, causing incorrect results on integer values (:issue:`11376`)
- Bug in ``pd.eval`` where unary ops in a list error (:issue:`11235`)
- Bug in ``squeeze()`` with zero length arrays (:issue:`11230`, :issue:`8999`)
- Bug in ``describe()`` dropping column names for hierarchical indexes (:issue:`11517`)
- Bug in ``DataFrame.pct_change()`` not propagating ``axis`` keyword on ``.fillna`` method (:issue:`11150`)












- Bug in ``to_sql`` using unicode column names giving UnicodeEncodeError with (:issue:`11431`).
- Fix regression in setting of ``xticks`` in ``plot`` (:issue:`11529`).
- Bug in ``holiday.dates`` where observance rules could not be applied to holiday and doc enhancement (:issue:`11477`, :issue:`11533`)
- Fix plotting issues when having plain ``Axes`` instances instead of ``SubplotAxes`` (:issue:`11520`, :issue:`11556`).
- Bug in ``DataFrame.to_latex()`` produces an extra rule when ``header=False`` (:issue:`7124`)


- Bug in ``df.groupby(...).apply(func)`` when a func returns a ``Series`` containing a new datetimelike column (:issue:`11324`)
- Bug in ``pandas.json`` when file to load is big (:issue:`11344`)
- Bugs in ``to_excel`` with duplicate columns (:issue:`11007`, :issue:`10982`, :issue:`10970`)

- Fixed a bug that prevented the construction of an empty series of dtype
``datetime64[ns, tz]`` (:issue:`11245`).

- Fixed a bug that prevented the construction of an empty series of dtype ``datetime64[ns, tz]`` (:issue:`11245`).
- Bug in ``read_excel`` with multi-index containing integers (:issue:`11317`)

- Bug in ``to_excel`` with openpyxl 2.2+ and merging (:issue:`11408`)

- Bug in ``DataFrame.to_dict()`` produces a ``np.datetime64`` object instead of ``Timestamp`` when only datetime is present in data (:issue:`11327`)
- Bug in ``DataFrame.corr()`` raises exception when computes Kendall correlation for DataFrames with boolean and not boolean columns (:issue:`11560`)
- Bug in the link-time error caused by C ``inline`` functions on FreeBSD 10+ (with ``clang``) (:issue:`10510`)
- Bug in ``DataFrame.to_csv`` in passing through arguments for formatting ``MultiIndexes``, including ``date_format`` (:issue:`7791`)
- Bug in ``DataFrame.join()`` with ``how='right'`` producing a ``TypeError`` (:issue:`11519`)
- Bug in ``Series.quantile`` with empty list results has ``Index`` with ``object`` dtype (:issue:`11588`)
- Bug in ``pd.merge`` results in empty ``Int64Index`` rather than ``Index(dtype=object)`` when the merge result is empty (:issue:`11588`)
- Bug in ``Categorical.remove_unused_categories`` when having ``NaN`` values (:issue:`11599`)
- Bug in ``DataFrame.to_sparse()`` loses column names for MultiIndexes (:issue:`11600`)
6 changes: 4 additions & 2 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@
from pandas.tseries.period import PeriodIndex
from pandas.tseries.index import DatetimeIndex
from pandas.tseries.tdi import TimedeltaIndex

from pandas.io.common import _expand_user

import pandas.core.algorithms as algos
import pandas.core.base as base
Expand Down Expand Up @@ -1302,7 +1302,7 @@ def to_csv(self, path_or_buf=None, sep=",", na_rep='', float_format=None,
.. versionadded:: 0.16.0

"""

path_or_buf = _expand_user(path_or_buf)
formatter = fmt.CSVFormatter(self, path_or_buf,
line_terminator=line_terminator,
sep=sep, encoding=encoding,
Expand Down Expand Up @@ -1505,6 +1505,7 @@ def to_html(self, buf=None, columns=None, col_space=None, colSpace=None,
FutureWarning, stacklevel=2)
col_space = colSpace

buf = _expand_user(buf)
formatter = fmt.DataFrameFormatter(self, buf=buf, columns=columns,
col_space=col_space, na_rep=na_rep,
formatters=formatters,
Expand Down Expand Up @@ -1554,6 +1555,7 @@ def to_latex(self, buf=None, columns=None, col_space=None, colSpace=None,
FutureWarning, stacklevel=2)
col_space = colSpace

buf = _expand_user(buf)
formatter = fmt.DataFrameFormatter(self, buf=buf, columns=columns,
col_space=col_space, na_rep=na_rep,
header=header, index=index,
Expand Down