
Commit 6f6d325

Merge remote-tracking branch 'upstream/master' into docfix-multiindex-set_levels
2 parents 0b51bd5 + 6498bc1 commit 6f6d325


121 files changed, +1859 -1497 lines changed

ci/azure/posix.yml

+6-7
@@ -9,17 +9,16 @@ jobs:
 strategy:
 matrix:
 ${{ if eq(parameters.name, 'macOS') }}:
-py35_macos:
-ENV_FILE: ci/deps/azure-macos-35.yaml
-CONDA_PY: "35"
+py36_macos:
+ENV_FILE: ci/deps/azure-macos-36.yaml
+CONDA_PY: "36"
 PATTERN: "not slow and not network"

 ${{ if eq(parameters.name, 'Linux') }}:
-py35_compat:
-ENV_FILE: ci/deps/azure-35-compat.yaml
-CONDA_PY: "35"
+py36_minimum_versions:
+ENV_FILE: ci/deps/azure-36-minimum_versions.yaml
+CONDA_PY: "36"
 PATTERN: "not slow and not network"
-
 py36_locale_slow_old_np:
 ENV_FILE: ci/deps/azure-36-locale.yaml
 CONDA_PY: "36"

ci/deps/azure-35-compat.yaml renamed to ci/deps/azure-36-minimum_versions.yaml

+4-7
@@ -5,26 +5,23 @@ channels:
 dependencies:
 - beautifulsoup4=4.6.0
 - bottleneck=1.2.1
+- cython>=0.29.13
 - jinja2=2.8
 - numexpr=2.6.2
 - numpy=1.13.3
 - openpyxl=2.4.8
 - pytables=3.4.2
 - python-dateutil=2.6.1
-- python=3.5.3
+- python=3.6.1
 - pytz=2017.2
 - scipy=0.19.0
 - xlrd=1.1.0
 - xlsxwriter=0.9.8
 - xlwt=1.2.0
 # universal
+- html5lib=1.0.1
 - hypothesis>=3.58.0
+- pytest=4.5.0
 - pytest-xdist
 - pytest-mock
 - pytest-azurepipelines
-- pip
-- pip:
-# for python 3.5, pytest>=4.0.2, cython>=0.29.13 is not available in conda
-- cython>=0.29.13
-- pytest==4.5.0
-- html5lib==1.0b2

ci/deps/azure-macos-35.yaml renamed to ci/deps/azure-macos-36.yaml

+1-1
@@ -14,7 +14,7 @@ dependencies:
 - openpyxl
 - pyarrow
 - pytables
-- python=3.5.*
+- python=3.6.*
 - python-dateutil==2.6.1
 - pytz
 - xarray

doc/source/development/contributing.rst

+2-25
@@ -236,7 +236,7 @@ Creating a Python environment (pip)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 If you aren't using conda for your development environment, follow these instructions.
-You'll need to have at least python3.5 installed on your system.
+You'll need to have at least Python 3.6.1 installed on your system.

 **Unix**/**Mac OS**

@@ -847,29 +847,6 @@ The limitation here is that while a human can reasonably understand that ``is_nu

 With custom types and inference this is not always possible so exceptions are made, but every effort should be exhausted to avoid ``cast`` before going down such paths.

-Syntax Requirements
-~~~~~~~~~~~~~~~~~~~
-
-Because *pandas* still supports Python 3.5, :pep:`526` does not apply and variables **must** be annotated with type comments. Specifically, this is a valid annotation within pandas:
-
-.. code-block:: python
-
-primes = [] # type: List[int]
-
-Whereas this is **NOT** allowed:
-
-.. code-block:: python
-
-primes: List[int] = [] # not supported in Python 3.5!
-
-Note that function signatures can always be annotated per :pep:`3107`:
-
-.. code-block:: python
-
-def sum_of_primes(primes: List[int] = []) -> int:
-...
-
-
 Pandas-specific Types
 ~~~~~~~~~~~~~~~~~~~~~

@@ -1296,7 +1273,7 @@ environment by::

 or, to use a specific Python interpreter,::

-asv run -e -E existing:python3.5
+asv run -e -E existing:python3.6

 This will display stderr from the benchmarks, and use your local
 ``python`` that comes from your ``$PATH``.
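
Note (illustrative, not part of the commit): the "Syntax Requirements" section removed above existed only because Python 3.5 lacks PEP 526 variable annotations. With 3.6.1 as the new minimum, both spellings are valid Python. The sketch below contrasts them; the body of `sum_of_primes` and the name `primes_old` are hypothetical, added only so the example runs.

```python
from typing import List

# Type-comment style, which the removed section required for Python 3.5.
primes_old = []  # type: List[int]

# PEP 526 variable annotation, valid syntax on Python 3.6 and later.
primes: List[int] = [2, 3, 5]


def sum_of_primes(values: List[int]) -> int:
    """Return the sum of ``values`` (hypothetical body, for illustration)."""
    return sum(values)


total = sum_of_primes(primes)
```

Whether the project adopts the annotation syntax everywhere is a separate style decision that this commit does not state.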

doc/source/development/policies.rst

+1-1
@@ -51,7 +51,7 @@ Pandas may change the behavior of experimental features at any time.
 Python Support
 ~~~~~~~~~~~~~~

-Pandas will only drop support for specific Python versions (e.g. 3.5.x, 3.6.x) in
+Pandas will only drop support for specific Python versions (e.g. 3.6.x, 3.7.x) in
 pandas **major** releases.

 .. _SemVer: https://semver.org

doc/source/getting_started/dsintro.rst

-47
@@ -564,53 +564,6 @@ to a column created earlier in the same :meth:`~DataFrame.assign`.
 In the second expression, ``x['C']`` will refer to the newly created column,
 that's equal to ``dfa['A'] + dfa['B']``.

-To write code compatible with all versions of Python, split the assignment in two.
-
-.. ipython:: python
-
-dependent = pd.DataFrame({"A": [1, 1, 1]})
-(dependent.assign(A=lambda x: x['A'] + 1)
-.assign(B=lambda x: x['A'] + 2))
-
-.. warning::
-
-Dependent assignment may subtly change the behavior of your code between
-Python 3.6 and older versions of Python.
-
-If you wish to write code that supports versions of python before and after 3.6,
-you'll need to take care when passing ``assign`` expressions that
-
-* Update an existing column
-* Refer to the newly updated column in the same ``assign``
-
-For example, we'll update column "A" and then refer to it when creating "B".
-
-.. code-block:: python
-
->>> dependent = pd.DataFrame({"A": [1, 1, 1]})
->>> dependent.assign(A=lambda x: x["A"] + 1, B=lambda x: x["A"] + 2)
-
-For Python 3.5 and earlier the expression creating ``B`` refers to the
-"old" value of ``A``, ``[1, 1, 1]``. The output is then
-
-.. code-block:: console
-
-A B
-0 2 3
-1 2 3
-2 2 3
-
-For Python 3.6 and later, the expression creating ``A`` refers to the
-"new" value of ``A``, ``[2, 2, 2]``, which results in
-
-.. code-block:: console
-
-A B
-0 2 4
-1 2 4
-2 2 4
-
-

 Indexing / selection
 ~~~~~~~~~~~~~~~~~~~~
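
Note (illustrative, not part of the commit): the warning removed above is obsolete because Python 3.6 and later preserve the order of keyword arguments (PEP 468), so a later expression can rely on a column updated earlier in the same call. The sketch below shows the mechanism with plain dictionaries; `toy_assign` and `Frame` are hypothetical stand-ins, not the pandas implementation.

```python
from typing import Callable, Dict, List

Frame = Dict[str, List[int]]


def toy_assign(frame: Frame, **kwargs: Callable[[Frame], List[int]]) -> Frame:
    """Hypothetical stand-in for DataFrame.assign, using plain dicts."""
    result = dict(frame)  # shallow copy so the input is left untouched
    # On Python 3.6+, **kwargs keeps call order (PEP 468), so each callable
    # sees the columns written by the callables that precede it.
    for name, func in kwargs.items():
        result[name] = func(result)
    return result


dependent = {"A": [1, 1, 1]}
out = toy_assign(
    dependent,
    A=lambda x: [v + 1 for v in x["A"]],
    B=lambda x: [v + 2 for v in x["A"]],
)
```

Here `B` is computed from the updated `A`, which matches the "Python 3.6 and later" output in the removed text.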

doc/source/getting_started/install.rst

+2-2
@@ -18,7 +18,7 @@ Instructions for installing from source,
 Python version support
 ----------------------

-Officially Python 3.5.3 and above, 3.6, 3.7, and 3.8.
+Officially Python 3.6.1 and above, 3.7, and 3.8.

 Installing pandas
 -----------------
@@ -140,7 +140,7 @@ Installing with ActivePython
 Installation instructions for
 `ActivePython <https://www.activestate.com/activepython>`__ can be found
 `here <https://www.activestate.com/activepython/downloads>`__. Versions
-2.7 and 3.5 include pandas.
+2.7, 3.5 and 3.6 include pandas.

 Installing using your Linux distribution's package manager.
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
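
Note (illustrative, not part of the commit): a quick way to check whether a local interpreter satisfies the new minimum stated in install.rst. The helper name `meets_minimum` and the constant `MIN_PYTHON` are hypothetical; only `sys.version_info` comes from the standard library.

```python
import sys

# Minimum version taken from the updated install.rst text (3.6.1).
MIN_PYTHON = (3, 6, 1)


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if ``version_info`` is at least ``minimum``."""
    # Compare only (major, minor, micro); tuples compare element by element.
    return tuple(version_info[:3]) >= minimum


supported = meets_minimum()
```

Passing an explicit tuple, for example `meets_minimum((3, 5, 3))`, makes it easy to see how the old minimum compares with the new one.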

doc/source/reference/window.rst

+2
@@ -34,6 +34,8 @@ Standard moving window functions
 Rolling.quantile
 Window.mean
 Window.sum
+Window.var
+Window.std

 .. _api.functions_expanding:

doc/source/user_guide/categorical.rst

+35-60
@@ -797,37 +797,52 @@ Assigning a ``Categorical`` to parts of a column of other types will use the val
 df.dtypes

 .. _categorical.merge:
+.. _categorical.concat:

-Merging
-~~~~~~~
+Merging / Concatenation
+~~~~~~~~~~~~~~~~~~~~~~~

-You can concat two ``DataFrames`` containing categorical data together,
-but the categories of these categoricals need to be the same:
+By default, combining ``Series`` or ``DataFrames`` which contain the same
+categories results in ``category`` dtype, otherwise results will depend on the
+dtype of the underlying categories. Merges that result in non-categorical
+dtypes will likely have higher memory usage. Use ``.astype`` or
+``union_categoricals`` to ensure ``category`` results.

 .. ipython:: python

-cat = pd.Series(["a", "b"], dtype="category")
-vals = [1, 2]
-df = pd.DataFrame({"cats": cat, "vals": vals})
-res = pd.concat([df, df])
-res
-res.dtypes
+from pandas.api.types import union_categoricals

-In this case the categories are not the same, and therefore an error is raised:
+# same categories
+s1 = pd.Series(['a', 'b'], dtype='category')
+s2 = pd.Series(['a', 'b', 'a'], dtype='category')
+pd.concat([s1, s2])

-.. ipython:: python
+# different categories
+s3 = pd.Series(['b', 'c'], dtype='category')
+pd.concat([s1, s3])

-df_different = df.copy()
-df_different["cats"].cat.categories = ["c", "d"]
-try:
-pd.concat([df, df_different])
-except ValueError as e:
-print("ValueError:", str(e))
+# Output dtype is inferred based on categories values
+int_cats = pd.Series([1, 2], dtype="category")
+float_cats = pd.Series([3.0, 4.0], dtype="category")
+pd.concat([int_cats, float_cats])
+
+pd.concat([s1, s3]).astype('category')
+union_categoricals([s1.array, s3.array])

-The same applies to ``df.append(df_different)``.
+The following table summarizes the results of merging ``Categoricals``:

-See also the section on :ref:`merge dtypes<merging.dtypes>` for notes about preserving merge dtypes and performance.
++-------------------+------------------------+----------------------+-----------------------------+
+| arg1              | arg2                   | identical            | result                      |
++===================+========================+======================+=============================+
+| category          | category               | True                 | category                    |
++-------------------+------------------------+----------------------+-----------------------------+
+| category (object) | category (object)      | False                | object (dtype is inferred)  |
++-------------------+------------------------+----------------------+-----------------------------+
+| category (int)    | category (float)       | False                | float (dtype is inferred)   |
++-------------------+------------------------+----------------------+-----------------------------+

+See also the section on :ref:`merge dtypes<merging.dtypes>` for notes about
+preserving merge dtypes and performance.

 .. _categorical.union:

@@ -918,46 +933,6 @@ the resulting array will always be a plain ``Categorical``:
 # "b" is coded to 0 throughout, same as c1, different from c2
 c.codes

-.. _categorical.concat:
-
-Concatenation
-~~~~~~~~~~~~~
-
-This section describes concatenations specific to ``category`` dtype. See :ref:`Concatenating objects<merging.concat>` for general description.
-
-By default, ``Series`` or ``DataFrame`` concatenation which contains the same categories
-results in ``category`` dtype, otherwise results in ``object`` dtype.
-Use ``.astype`` or ``union_categoricals`` to get ``category`` result.
-
-.. ipython:: python
-
-# same categories
-s1 = pd.Series(['a', 'b'], dtype='category')
-s2 = pd.Series(['a', 'b', 'a'], dtype='category')
-pd.concat([s1, s2])
-
-# different categories
-s3 = pd.Series(['b', 'c'], dtype='category')
-pd.concat([s1, s3])
-
-pd.concat([s1, s3]).astype('category')
-union_categoricals([s1.array, s3.array])
-
-
-Following table summarizes the results of ``Categoricals`` related concatenations.
-
-+----------+--------------------------------------------------------+----------------------------+
-| arg1     | arg2                                                   | result                     |
-+==========+========================================================+============================+
-| category | category (identical categories)                        | category                   |
-+----------+--------------------------------------------------------+----------------------------+
-| category | category (different categories, both not ordered)      | object (dtype is inferred) |
-+----------+--------------------------------------------------------+----------------------------+
-| category | category (different categories, either one is ordered) | object (dtype is inferred) |
-+----------+--------------------------------------------------------+----------------------------+
-| category | not category                                           | object (dtype is inferred) |
-+----------+--------------------------------------------------------+----------------------------+
-

 Getting data in/out
 -------------------
