DOC: freeze old whatsnew notes part 1 #6856 #41464

Merged · 4 commits · May 14, 2021
6 changes: 0 additions & 6 deletions doc/source/whatsnew/v0.5.0.rst
@@ -6,12 +6,6 @@ Version 0.5.0 (October 24, 2011)

{{ header }}

.. ipython:: python
:suppress:

from pandas import * # noqa F401, F403


New features
~~~~~~~~~~~~

6 changes: 0 additions & 6 deletions doc/source/whatsnew/v0.6.0.rst
@@ -5,12 +5,6 @@ Version 0.6.0 (November 25, 2011)

{{ header }}

.. ipython:: python
:suppress:

from pandas import * # noqa F401, F403


New features
~~~~~~~~~~~~
- :ref:`Added <reshaping.melt>` ``melt`` function to ``pandas.core.reshape``
111 changes: 92 additions & 19 deletions doc/source/whatsnew/v0.7.0.rst
@@ -31,10 +31,22 @@ New features

- Handle differently-indexed output values in ``DataFrame.apply`` (:issue:`498`)

.. ipython:: python
.. code-block:: ipython

df = pd.DataFrame(np.random.randn(10, 4))
df.apply(lambda x: x.describe())
In [1]: df = pd.DataFrame(np.random.randn(10, 4))
In [2]: df.apply(lambda x: x.describe())
Out[2]:
0 1 2 3
count 10.000000 10.000000 10.000000 10.000000
mean 0.190912 -0.395125 -0.731920 -0.403130
std 0.730951 0.813266 1.112016 0.961912
min -0.861849 -2.104569 -1.776904 -1.469388
25% -0.411391 -0.698728 -1.501401 -1.076610
50% 0.380863 -0.228039 -1.191943 -1.004091
75% 0.658444 0.057974 -0.034326 0.461706
max 1.212112 0.577046 1.643563 1.071804

[8 rows x 4 columns]
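For reference, the frozen transcript above can still be reproduced with current pandas; a minimal runnable sketch (the random values will differ from the frozen output):

```python
import numpy as np
import pandas as pd

# Apply a function whose per-column result carries its own index; pandas
# aligns the describe() statistics (count, mean, std, ...) into rows.
df = pd.DataFrame(np.random.randn(10, 4))
summary = df.apply(lambda x: x.describe())

print(summary.shape)  # (8, 4): eight describe() statistics per column
```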

- :ref:`Add<advanced.reorderlevels>` ``reorder_levels`` method to Series and
DataFrame (:issue:`534`)
@@ -116,13 +128,31 @@
One of the potentially riskiest API changes in 0.7.0, but also one of the most
important, was a complete review of how **integer indexes** are handled with
regard to label-based indexing. Here is an example:

.. ipython:: python
.. code-block:: ipython

s = pd.Series(np.random.randn(10), index=range(0, 20, 2))
s
s[0]
s[2]
s[4]
In [3]: s = pd.Series(np.random.randn(10), index=range(0, 20, 2))
In [4]: s
Out[4]:
0 -1.294524
2 0.413738
4 0.276662
6 -0.472035
8 -0.013960
10 -0.362543
12 -0.006154
14 -0.923061
16 0.895717
18 0.805244
Length: 10, dtype: float64

In [5]: s[0]
Out[5]: -1.2945235902555294

In [6]: s[2]
Out[6]: 0.41373810535784006

In [7]: s[4]
Out[7]: 0.2766617129497566

This is all exactly identical to the behavior before. However, if you ask for a
key **not** contained in the Series, in versions 0.6.1 and prior, Series would
@@ -235,22 +265,65 @@
slice to a Series when getting and setting values via ``[]`` (i.e. the
``__getitem__`` and ``__setitem__`` methods). The behavior will be the same as
passing similar input to ``ix`` **except in the case of integer indexing**:

.. ipython:: python
.. code-block:: ipython

s = pd.Series(np.random.randn(6), index=list('acegkm'))
s
s[['m', 'a', 'c', 'e']]
s['b':'l']
s['c':'k']
In [8]: s = pd.Series(np.random.randn(6), index=list('acegkm'))

In [9]: s
Out[9]:
a -1.206412
c 2.565646
e 1.431256
g 1.340309
k -1.170299
m -0.226169
Length: 6, dtype: float64

In [10]: s[['m', 'a', 'c', 'e']]
Out[10]:
m -0.226169
a -1.206412
c 2.565646
e 1.431256
Length: 4, dtype: float64

In [11]: s['b':'l']
Out[11]:
c 2.565646
e 1.431256
g 1.340309
k -1.170299
Length: 4, dtype: float64

In [12]: s['c':'k']
Out[12]:
c 2.565646
e 1.431256
g 1.340309
k -1.170299
Length: 4, dtype: float64

In the case of integer indexes, the behavior will be exactly as before
(shadowing ``ndarray``):

.. ipython:: python
.. code-block:: ipython

s = pd.Series(np.random.randn(6), index=range(0, 12, 2))
s[[4, 0, 2]]
s[1:5]
In [13]: s = pd.Series(np.random.randn(6), index=range(0, 12, 2))

In [14]: s[[4, 0, 2]]
Out[14]:
4 0.132003
0 0.410835
2 0.813850
Length: 3, dtype: float64

In [15]: s[1:5]
Out[15]:
2 0.813850
4 0.132003
6 -0.827317
8 -0.076467
Length: 4, dtype: float64

If you wish to do indexing with sequences and slicing on an integer index with
label semantics, use ``ix``.
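``ix`` itself was later deprecated (pandas 0.20) and removed (pandas 1.0); in modern code the same label/position split is expressed with ``.loc`` and ``.iloc``. A minimal sketch mirroring the frozen example above:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(6), index=range(0, 12, 2))

# Label-based selection and slicing (.loc slices include both endpoints):
by_label = s.loc[[4, 0, 2]]
label_slice = s.loc[2:6]

# Position-based selection and slicing (.iloc slices are half-open,
# shadowing ndarray semantics):
by_position = s.iloc[[2, 0, 1]]
position_slice = s.iloc[1:5]

print(list(by_label.index))     # [4, 0, 2]
print(list(label_slice.index))  # [2, 4, 6]
```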
90 changes: 68 additions & 22 deletions doc/source/whatsnew/v0.7.3.rst
@@ -51,21 +51,37 @@ NA boolean comparison API change
Reverted some changes to how NA values (represented typically as ``NaN`` or
``None``) are handled in non-numeric Series:

.. ipython:: python
.. code-block:: ipython

series = pd.Series(["Steve", np.nan, "Joe"])
series == "Steve"
series != "Steve"
In [1]: series = pd.Series(["Steve", np.nan, "Joe"])

In [2]: series == "Steve"
Out[2]:
0 True
1 False
2 False
Length: 3, dtype: bool

In [3]: series != "Steve"
Out[3]:
0 False
1 True
2 True
Length: 3, dtype: bool

In comparisons, NA / NaN will always come through as ``False`` except with
``!=`` which is ``True``. *Be very careful* with boolean arithmetic, especially
negation, in the presence of NA data. You may wish to add an explicit NA
filter into boolean array operations if you are worried about this:

.. ipython:: python
.. code-block:: ipython

In [4]: mask = series == "Steve"

mask = series == "Steve"
series[mask & series.notnull()]
In [5]: series[mask & series.notnull()]
Out[5]:
0 Steve
Length: 1, dtype: object
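The same defensive pattern works in current pandas, where ``notna`` is the modern spelling (``notnull`` is kept as an alias):

```python
import numpy as np
import pandas as pd

series = pd.Series(["Steve", np.nan, "Joe"])

# == treats NA as False, != treats it as True ...
mask = series == "Steve"

# ... so add an explicit non-null filter when negation or NA data matters.
filtered = series[mask & series.notna()]

print(filtered.tolist())  # ['Steve']
```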

While propagating NA in comparisons may seem like the right behavior to some
users (and you could argue on purely technical grounds that this is the right
@@ -80,21 +96,51 @@ Other API changes
When calling ``apply`` on a grouped Series, the return value will also be a
Series, to be more consistent with the ``groupby`` behavior with DataFrame:

.. ipython:: python
:okwarning:

df = pd.DataFrame(
{
"A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
"B": ["one", "one", "two", "three", "two", "two", "one", "three"],
"C": np.random.randn(8),
"D": np.random.randn(8),
}
)
df
grouped = df.groupby("A")["C"]
grouped.describe()
grouped.apply(lambda x: x.sort_values()[-2:]) # top 2 values
.. code-block:: ipython

In [6]: df = pd.DataFrame(
...: {
...: "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
...: "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
...: "C": np.random.randn(8),
...: "D": np.random.randn(8),
...: }
...: )
...:

In [7]: df
Out[7]:
A B C D
0 foo one 0.469112 -0.861849
1 bar one -0.282863 -2.104569
2 foo two -1.509059 -0.494929
3 bar three -1.135632 1.071804
4 foo two 1.212112 0.721555
5 bar two -0.173215 -0.706771
6 foo one 0.119209 -1.039575
7 foo three -1.044236 0.271860

[8 rows x 4 columns]

In [8]: grouped = df.groupby("A")["C"]

In [9]: grouped.describe()
Out[9]:
count mean std min 25% 50% 75% max
A
bar 3.0 -0.530570 0.526860 -1.135632 -0.709248 -0.282863 -0.228039 -0.173215
foo 5.0 -0.150572 1.113308 -1.509059 -1.044236 0.119209 0.469112 1.212112

[2 rows x 8 columns]

In [10]: grouped.apply(lambda x: x.sort_values()[-2:]) # top 2 values
Out[10]:
A
bar 1 -0.282863
5 -0.173215
foo 0 0.469112
4 1.212112
Name: C, Length: 4, dtype: float64

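The grouped-``apply`` behavior frozen above still holds; a runnable sketch with current pandas (random values, and ``.iloc`` to make the positional slice explicit):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
        "C": np.random.randn(8),
    }
)
grouped = df.groupby("A")["C"]

# apply on a grouped Series returns a Series (the group key is prepended
# to the original index), not a DataFrame.
top2 = grouped.apply(lambda x: x.sort_values().iloc[-2:])

print(type(top2).__name__)  # Series
print(len(top2))            # two values per group -> 4
```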

.. _whatsnew_0.7.3.contributors: