DOC: some reviewing of the 0.20 whatsnew file #16254
Merged: jorisvandenbossche merged 2 commits into pandas-dev:master from jorisvandenbossche:whatsnew-review on May 5, 2017
@@ -14,14 +14,13 @@ Highlights include:

 - The ``.ix`` indexer has been deprecated, see :ref:`here <whatsnew_0200.api_breaking.deprecate_ix>`
 - ``Panel`` has been deprecated, see :ref:`here <whatsnew_0200.api_breaking.deprecate_panel>`
 - Addition of an ``IntervalIndex`` and ``Interval`` scalar type, see :ref:`here <whatsnew_0200.enhancements.intervalindex>`
-- Improved user API when accessing levels in ``.groupby()``, see :ref:`here <whatsnew_0200.enhancements.groupby_access>`
+- Improved user API when grouping by index levels in ``.groupby()``, see :ref:`here <whatsnew_0200.enhancements.groupby_access>`
 - Improved support for ``UInt64`` dtypes, see :ref:`here <whatsnew_0200.enhancements.uint64_support>`
-- A new orient for JSON serialization, ``orient='table'``, that uses the :ref:`Table Schema spec <whatsnew_0200.enhancements.table_schema>`
-- Experimental support for exporting ``DataFrame.style`` formats to Excel, see :ref:`here <whatsnew_0200.enhancements.style_excel>`
+- A new orient for JSON serialization, ``orient='table'``, that uses the Table Schema spec and that gives the possibility for a more interactive repr in the Jupyter Notebook, see :ref:`here <whatsnew_0200.enhancements.table_schema>`
+- Experimental support for exporting styled DataFrames (``DataFrame.style``) to Excel, see :ref:`here <whatsnew_0200.enhancements.style_excel>`
 - Window binary corr/cov operations now return a MultiIndexed ``DataFrame`` rather than a ``Panel``, as ``Panel`` is now deprecated, see :ref:`here <whatsnew_0200.api_breaking.rolling_pairwise>`
 - Support for S3 handling now uses ``s3fs``, see :ref:`here <whatsnew_0200.api_breaking.s3>`
 - Google BigQuery support now uses the ``pandas-gbq`` library, see :ref:`here <whatsnew_0200.api_breaking.gbq>`
 - Switched the test framework to use `pytest <http://doc.pytest.org/en/latest>`__ (:issue:`13097`)

 .. warning::
@@ -41,12 +40,12 @@ New features

 .. _whatsnew_0200.enhancements.agg:

-``agg`` API
-^^^^^^^^^^^
+``agg`` API for DataFrame/Series
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Series & DataFrame have been enhanced to support the aggregation API. This is a familiar API
-from groupby, window operations, and resampling. This allows aggregation operations in a concise
-by using :meth:`~DataFrame.agg`, and :meth:`~DataFrame.transform`. The full documentation
+from groupby, window operations, and resampling. This allows aggregation operations in a concise way
+by using :meth:`~DataFrame.agg` and :meth:`~DataFrame.transform`. The full documentation
 is :ref:`here <basics.aggregate>` (:issue:`1623`).

 Here is a sample
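The sample in the hunk above is cut off by the diff boundary; a minimal sketch of the aggregation API it describes could look like this (the frame and values below are illustrative, not the ones from the whatsnew file):

```python
import numpy as np
import pandas as pd

# An illustrative frame (not the one used in the whatsnew file)
df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=['A', 'B', 'C'])

# A single function behaves like the equivalent reduction (here: df.sum())
totals = df.agg('sum')

# A list of functions returns one row per aggregation
summary = df.agg(['sum', 'min'])

# A dict applies different functions per column; missing
# combinations are filled with NaN
per_column = df.agg({'A': ['sum'], 'B': ['min', 'max']})
```

The same spellings are accepted by groupby, window, and resample objects, which is what the paragraph means by "a familiar API".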
@@ -107,22 +106,14 @@ aggregations. This is similiar to how groupby ``.agg()`` works. (:issue:`15015`)

 ``dtype`` keyword for data IO
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-The ``'python'`` engine for :func:`read_csv` now accepts the ``dtype`` keyword argument for specifying the types of specific columns (:issue:`14295`). See the :ref:`io docs <io.dtypes>` for more information.
+The ``'python'`` engine for :func:`read_csv`, as well as the :func:`read_fwf` function for parsing
+fixed-width text files and :func:`read_excel` for parsing Excel files, now accept the ``dtype`` keyword argument for specifying the types of specific columns (:issue:`14295`). See the :ref:`io docs <io.dtypes>` for more information.

 .. ipython:: python
    :suppress:

    from pandas.compat import StringIO

 .. ipython:: python

    data = "a,b\n1,2\n3,4"
    pd.read_csv(StringIO(data), engine='python').dtypes
    pd.read_csv(StringIO(data), engine='python', dtype={'a':'float64', 'b':'object'}).dtypes

-The ``dtype`` keyword argument is also now supported in the :func:`read_fwf` function for parsing
-fixed-width text files, and :func:`read_excel` for parsing Excel files.
-
 .. ipython:: python

    data = "a b\n1 2\n3 4"
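The fixed-width example is also cut off by the diff boundary; a self-contained sketch of what the new ``dtype`` support for ``read_fwf`` looks like follows (using the stdlib ``io.StringIO`` rather than the ``pandas.compat.StringIO`` shim the doc imports for Python 2/3 compatibility):

```python
import io
import pandas as pd

# The same fixed-width data as in the snippet above
data = "a b\n1 2\n3 4"

# Without dtype, both columns are inferred as integers
inferred = pd.read_fwf(io.StringIO(data)).dtypes

# With dtype, specific columns can be forced to other types
forced = pd.read_fwf(io.StringIO(data),
                     dtype={'a': 'float64', 'b': 'object'}).dtypes
```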
@@ -135,16 +126,16 @@ fixed-width text files, and :func:`read_excel` for parsing Excel files.
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 :func:`to_datetime` has gained a new parameter, ``origin``, to define a reference date
-from where to compute the resulting ``DatetimeIndex`` when ``unit`` is specified. (:issue:`11276`, :issue:`11745`)
+from where to compute the resulting timestamps when parsing numerical values with a specific ``unit`` specified. (:issue:`11276`, :issue:`11745`)

-Start with 1960-01-01 as the starting date
+For example, with 1960-01-01 as the starting date:

 .. ipython:: python

    pd.to_datetime([1, 2, 3], unit='D', origin=pd.Timestamp('1960-01-01'))

-The default is set at ``origin='unix'``, which defaults to ``1970-01-01 00:00:00``.
-Commonly called 'unix epoch' or POSIX time. This was the previous default, so this is a backward compatible change.
+The default is set at ``origin='unix'``, which defaults to ``1970-01-01 00:00:00``, which is
+commonly called 'unix epoch' or POSIX time. This was the previous default, so this is a backward compatible change.

 .. ipython:: python
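The ipython block closing this hunk is truncated; the contrast it illustrates can be sketched as follows:

```python
import pandas as pd

# Default origin='unix': offsets are counted from 1970-01-01 (the POSIX epoch)
default = pd.to_datetime([1, 2, 3], unit='D')

# An explicit origin shifts the reference date the offsets count from
shifted = pd.to_datetime([1, 2, 3], unit='D',
                         origin=pd.Timestamp('1960-01-01'))
```

Both calls return a ``DatetimeIndex``; only the reference date differs.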
@@ -156,7 +147,7 @@ Commonly called 'unix epoch' or POSIX time. This was the previous default, so th
 Groupby Enhancements
 ^^^^^^^^^^^^^^^^^^^^

-Strings passed to ``DataFrame.groupby()`` as the ``by`` parameter may now reference either column names or index level names.
+Strings passed to ``DataFrame.groupby()`` as the ``by`` parameter may now reference either column names or index level names. Previously, only column names could be referenced. This allows to easily group by a column and index level at the same time. (:issue:`5677`)

 .. ipython:: python
@@ -172,8 +163,6 @@ Strings passed to ``DataFrame.groupby()`` as the ``by`` parameter may now refere

    df.groupby(['second', 'A']).sum()

-Previously, only column names could be referenced. (:issue:`5677`)
-

 .. _whatsnew_0200.enhancements.compressed_urls:
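The setup for ``df`` falls outside this hunk; a self-contained reconstruction of the mixed column/index-level grouping, loosely modeled on the whatsnew example (level names ``first``/``second``, column ``A``; the values here are illustrative), might be:

```python
import pandas as pd

# A frame with a named MultiIndex; values are illustrative
index = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'baz', 'baz'], ['one', 'two', 'one', 'two']],
    names=['first', 'second'])
df = pd.DataFrame({'A': [1, 1, 2, 2], 'B': [0, 1, 2, 3]}, index=index)

# 'second' is an index level name and 'A' is a column name:
# both can now be mixed in the `by` list
res = df.groupby(['second', 'A']).sum()
```

The result is indexed by a ``('second', 'A')`` MultiIndex, with the remaining column ``B`` aggregated per group.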
@@ -203,7 +192,7 @@ support for bz2 compression in the python 2 C-engine improved (:issue:`14874`).
 Pickle file I/O now supports compression
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-:func:`read_pickle`, :meth:`DataFame.to_pickle` and :meth:`Series.to_pickle`
+:func:`read_pickle`, :meth:`DataFrame.to_pickle` and :meth:`Series.to_pickle`
 can now read from and write to compressed pickle files. Compression methods
 can be an explicit parameter or be inferred from the file extension.
 See :ref:`the docs here. <io.pickle.compression>`
@@ -221,33 +210,24 @@ Using an explicit compression type

    df.to_pickle("data.pkl.compress", compression="gzip")
    rt = pd.read_pickle("data.pkl.compress", compression="gzip")
-   rt
+   rt.head()

-Inferring compression type from the extension
-
-.. ipython:: python
-
-   df.to_pickle("data.pkl.xz", compression="infer")
-   rt = pd.read_pickle("data.pkl.xz", compression="infer")
-   rt
-
-The default is to ``infer``:
+The default is to inferring compression type from the extension (``compression='infer'``):

 .. ipython:: python

    df.to_pickle("data.pkl.gz")
    rt = pd.read_pickle("data.pkl.gz")
-   rt
+   rt.head()
    df["A"].to_pickle("s1.pkl.bz2")
    rt = pd.read_pickle("s1.pkl.bz2")
-   rt
+   rt.head()

 .. ipython:: python
    :suppress:

    import os
    os.remove("data.pkl.compress")
-   os.remove("data.pkl.xz")
    os.remove("data.pkl.gz")
    os.remove("s1.pkl.bz2")
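The two modes in this hunk (explicit compression vs. inference from the file extension) can be sketched end-to-end; the frame below is illustrative, and a temporary directory stands in for the doc's working-directory files:

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'A': [0.0, 1.0, 2.0]})

with tempfile.TemporaryDirectory() as tmp:
    # Explicit compression type, passed on both write and read
    explicit = os.path.join(tmp, "data.pkl.compress")
    df.to_pickle(explicit, compression="gzip")
    rt = pd.read_pickle(explicit, compression="gzip")

    # compression='infer' (the default) picks gzip from the .gz suffix
    inferred = os.path.join(tmp, "data.pkl.gz")
    df.to_pickle(inferred)
    rt2 = pd.read_pickle(inferred)
```

Round-tripping through either path returns a frame equal to the original.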
@@ -293,15 +273,15 @@ In previous versions, ``.groupby(..., sort=False)`` would fail with a ``ValueErr
                            ordered=True)})
    df

-Previous Behavior:
+**Previous Behavior**:

 .. code-block:: ipython

    In [3]: df[df.chromosomes != '1'].groupby('chromosomes', sort=False).sum()
    ---------------------------------------------------------------------------
    ValueError: items in new_categories are not the same as in old categories

-New Behavior:
+**New Behavior**:

 .. ipython:: python
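The "New Behavior" block is truncated by the hunk boundary; an illustrative reconstruction of the scenario (an ordered categorical column where one category is filtered away before grouping, which previously raised the ``ValueError`` shown above) could be:

```python
import pandas as pd

# Illustrative data: an ordered categorical with category '1'
# present in the dtype but filtered out of the rows before grouping
df = pd.DataFrame({
    'chromosomes': pd.Categorical(['1', '2', '3', '2'],
                                  categories=['1', '2', '3'],
                                  ordered=True),
    'value': [1, 2, 3, 4]})

# In pandas < 0.20 this raised ValueError; it now groups without error
res = df[df.chromosomes != '1'].groupby('chromosomes', sort=False).sum()
```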
@@ -327,7 +307,7 @@ the data.

    df.to_json(orient='table')

-See :ref:`IO: Table Schema for more<io.table_schema>`.
+See :ref:`IO: Table Schema for more information <io.table_schema>`.

 Additionally, the repr for ``DataFrame`` and ``Series`` can now publish
 this JSON Table schema representation of the Series or DataFrame if you are
@@ -411,6 +391,11 @@ pandas has gained an ``IntervalIndex`` with its own dtype, ``interval`` as well
 notation, specifically as a return type for the categories in :func:`cut` and :func:`qcut`. The ``IntervalIndex`` allows some unique indexing, see the
 :ref:`docs <indexing.intervallindex>`. (:issue:`7640`, :issue:`8625`)

+.. warning::
+
+   These indexing behaviors of the IntervalIndex are provisional and may change in a future version of pandas. Feedback on usage very welcome.

Review comment on the added warning line: "is very welcome" (or just "is welcome")

 Previous behavior:

 The returned categories were strings, representing Intervals
@@ -473,9 +458,8 @@ Other Enhancements
 - ``Series.str.replace()`` now accepts a callable, as replacement, which is passed to ``re.sub`` (:issue:`15055`)
 - ``Series.str.replace()`` now accepts a compiled regular expression as a pattern (:issue:`15446`)
 - ``Series.sort_index`` accepts parameters ``kind`` and ``na_position`` (:issue:`13589`, :issue:`14444`)
-- ``DataFrame`` has gained a ``nunique()`` method to count the distinct values over an axis (:issue:`14336`).
+- ``DataFrame`` and ``DataFrame.groupby()`` have gained a ``nunique()`` method to count the distinct values over an axis (:issue:`14336`, :issue:`15197`).
 - ``DataFrame`` has gained a ``melt()`` method, equivalent to ``pd.melt()``, for unpivoting from a wide to long format (:issue:`12640`).
-- ``DataFrame.groupby()`` has gained a ``.nunique()`` method to count the distinct values for all columns within each group (:issue:`14336`, :issue:`15197`).
 - ``pd.read_excel()`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`)
 - Multiple offset aliases with decimal points are now supported (e.g. ``0.5min`` is parsed as ``30s``) (:issue:`8419`)
 - ``.isnull()`` and ``.notnull()`` have been added to ``Index`` object to make them more consistent with the ``Series`` API (:issue:`15300`)
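The merged ``nunique()`` bullet above covers both the frame-level and the groupby-level method, and the adjacent bullet covers ``melt()``; a small illustrative sketch of all three (data invented for the example):

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'x', 'y', 'y'],
                   'B': [1, 1, 2, 3]})

# Distinct values per column
per_column = df.nunique()

# Distinct values per column within each group
per_group = df.groupby('A').nunique()

# melt() as a method, mirroring pd.melt()
tidy = df.melt(id_vars='A')
```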
@@ -506,9 +490,8 @@ Other Enhancements
 - ``DataFrame.to_excel()`` has a new ``freeze_panes`` parameter to turn on Freeze Panes when exporting to Excel (:issue:`15160`)
 - ``pd.read_html()`` will parse multiple header rows, creating a MutliIndex header. (:issue:`13434`).
 - HTML table output skips ``colspan`` or ``rowspan`` attribute if equal to 1. (:issue:`15403`)
-- :class:`pandas.io.formats.style.Styler`` template now has blocks for easier extension, :ref:`see the example notebook <style.ipynb#Subclassing>` (:issue:`15649`)
-- :meth:`pandas.io.formats.style.Styler.render` now accepts ``**kwargs`` to allow user-defined variables in the template (:issue:`15649`)
-- ``pd.io.api.Styler.render`` now accepts ``**kwargs`` to allow user-defined variables in the template (:issue:`15649`)
+- :class:`pandas.io.formats.style.Styler` template now has blocks for easier extension, :ref:`see the example notebook <style.ipynb#Subclassing>` (:issue:`15649`)
+- :meth:`Styler.render() <pandas.io.formats.style.Styler.render>` now accepts ``**kwargs`` to allow user-defined variables in the template (:issue:`15649`)
 - Compatibility with Jupyter notebook 5.0; MultiIndex column labels are left-aligned and MultiIndex row-labels are top-aligned (:issue:`15379`)
 - ``TimedeltaIndex`` now has a custom date-tick formatter specifically designed for nanosecond level precision (:issue:`8711`)
 - ``pd.api.types.union_categoricals`` gained the ``ignore_ordered`` argument to allow ignoring the ordered attribute of unioned categoricals (:issue:`13410`). See the :ref:`categorical union docs <categorical.union>` for more information.
@@ -519,7 +502,7 @@ Other Enhancements
 - ``pandas.io.json.json_normalize()`` gained the option ``errors='ignore'|'raise'``; the default is ``errors='raise'`` which is backward compatible. (:issue:`14583`)
 - ``pandas.io.json.json_normalize()`` with an empty ``list`` will return an empty ``DataFrame`` (:issue:`15534`)
 - ``pandas.io.json.json_normalize()`` has gained a ``sep`` option that accepts ``str`` to separate joined fields; the default is ".", which is backward compatible. (:issue:`14883`)
-- :meth:`~MultiIndex.remove_unused_levels` has been added to facilitate :ref:`removing unused levels <advanced.shown_levels>`. (:issue:`15694`)
+- :meth:`MultiIndex.remove_unused_levels` has been added to facilitate :ref:`removing unused levels <advanced.shown_levels>`. (:issue:`15694`)
 - ``pd.read_csv()`` will now raise a ``ParserError`` error whenever any parsing error occurs (:issue:`15913`, :issue:`15925`)
 - ``pd.read_csv()`` now supports the ``error_bad_lines`` and ``warn_bad_lines`` arguments for the Python parser (:issue:`15925`)
 - The ``display.show_dimensions`` option can now also be used to specify
Review comment on the reworded compression sentence: "is to inferring" -> "is to infer"