Commit 75fae8d

Merge branch 'master' of https://github.com/pandas-dev/pandas into cln-arith
2 parents 0303791 + a1e5304 commit 75fae8d

114 files changed, +2753 -2620 lines changed

.pre-commit-config.yaml

+5 -2
@@ -21,10 +21,12 @@ repos:
 - file
 args: [--append-config=flake8/cython-template.cfg]
 - repo: https://github.com/PyCQA/isort
-rev: 5.2.2
+rev: 5.6.0
 hooks:
 - id: isort
 exclude: ^pandas/__init__\.py$|^pandas/core/api\.py$
+files: '.pxd$|.py$'
+types: [file]
 - repo: https://github.com/asottile/pyupgrade
 rev: v2.7.2
 hooks:
@@ -39,10 +41,11 @@ repos:
 - id: pip_to_conda
 name: Generate pip dependency from conda
 description: This hook checks if the conda environment.yml and requirements-dev.txt are equal
-language: system
+language: python
 entry: python -m scripts.generate_pip_deps_from_conda
 files: ^(environment.yml|requirements-dev.txt)$
 pass_filenames: false
+additional_dependencies: [pyyaml]
 - repo: https://github.com/asottile/yesqa
 rev: v1.2.2
 hooks:

ci/deps/travis-37-cov.yaml

+2 -2
@@ -32,15 +32,15 @@ dependencies:
 - google-cloud-bigquery>=1.27.2 # GH 36436
 - psycopg2
 - pyarrow>=0.15.0
-- pymysql=0.7.11
+- pymysql<0.10.0 # temporary pin, GH 36465
 - pytables
 - python-snappy
 - python-dateutil
 - pytz
 - s3fs>=0.4.0
 - scikit-learn
 - scipy
-- sqlalchemy=1.3.0
+- sqlalchemy
 - statsmodels
 - xarray
 - xlrd

doc/source/development/contributing.rst

+7 -7
@@ -1365,16 +1365,16 @@ environments. If you want to use virtualenv instead, write::
 The ``-E virtualenv`` option should be added to all ``asv`` commands
 that run benchmarks. The default value is defined in ``asv.conf.json``.

-Running the full test suite can take up to one hour and use up to 3GB of RAM.
-Usually it is sufficient to paste only a subset of the results into the pull
-request to show that the committed changes do not cause unexpected performance
-regressions. You can run specific benchmarks using the ``-b`` flag, which
-takes a regular expression. For example, this will only run tests from a
-``pandas/asv_bench/benchmarks/groupby.py`` file::
+Running the full benchmark suite can be an all-day process, depending on your
+hardware and its resource utilization. However, usually it is sufficient to paste
+only a subset of the results into the pull request to show that the committed changes
+do not cause unexpected performance regressions. You can run specific benchmarks
+using the ``-b`` flag, which takes a regular expression. For example, this will
+only run benchmarks from a ``pandas/asv_bench/benchmarks/groupby.py`` file::

 asv continuous -f 1.1 upstream/master HEAD -b ^groupby

-If you want to only run a specific group of tests from a file, you can do it
+If you want to only run a specific group of benchmarks from a file, you can do it
 using ``.`` as a separator. For example::

 asv continuous -f 1.1 upstream/master HEAD -b groupby.GroupByMethods
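
Note on the ``-b`` patterns in the hunk above: the sketch below shows how the two regular expressions would select benchmarks, assuming the pattern is applied as an unanchored search against names of the form ``module.Class.method``. The benchmark names and the ``select`` helper are hypothetical and only illustrate the matching; they are not taken from asv or from the pandas benchmark suite.

```python
import re

# Hypothetical benchmark names in module.Class.method form (illustrative only).
benchmark_names = [
    "groupby.GroupByMethods.time_dtype_as_group",
    "groupby.Apply.time_scalar_function_single_col",
    "frame_methods.Apply.time_apply_user_func",
    "algos.groupby_like.time_example",
]


def select(pattern, names):
    """Return the names matched by pattern, assuming an unanchored regex search."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# "^groupby" keeps only names that start with "groupby" (the groupby.py file).
by_file = select(r"^groupby", benchmark_names)

# "groupby.GroupByMethods" narrows the selection to one class in that file.
by_class = select(r"groupby.GroupByMethods", benchmark_names)

print(by_file)
print(by_class)
```

In a regular expression an unescaped ``.`` matches any character, so ``groupby.GroupByMethods`` also matches the literal dot used as the separator.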

doc/source/ecosystem.rst

+6
@@ -435,6 +435,11 @@ found in NumPy or pandas, which work well with pandas' data containers.
 Cyberpandas provides an extension type for storing arrays of IP Addresses. These
 arrays can be stored inside pandas' Series and DataFrame.

+`Pandas-Genomics`_
+~~~~~~~~~~~~~~~~~~
+
+Pandas-Genomics provides extension types and extension arrays for working with genomics data
+
 `Pint-Pandas`_
 ~~~~~~~~~~~~~~

@@ -465,6 +470,7 @@ Library Accessor Classes Description
 .. _cyberpandas: https://cyberpandas.readthedocs.io/en/latest
 .. _pdvega: https://altair-viz.github.io/pdvega/
 .. _Altair: https://altair-viz.github.io/
+.. _pandas-genomics: https://pandas-genomics.readthedocs.io/en/latest/
 .. _pandas_path: https://github.com/drivendataorg/pandas-path/
 .. _pathlib.Path: https://docs.python.org/3/library/pathlib.html
 .. _pint-pandas: https://github.com/hgrecco/pint-pandas

doc/source/getting_started/intro_tutorials/03_subset_data.rst

+5 -5
@@ -27,14 +27,14 @@ This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:

 - PassengerId: Id of every passenger.
-- Survived: This feature have value 0 and 1. 0 for not survived and 1
+- Survived: This feature has value 0 and 1. 0 for not survived and 1
 for survived.
 - Pclass: There are 3 classes: Class 1, Class 2 and Class 3.
 - Name: Name of passenger.
 - Sex: Gender of passenger.
 - Age: Age of passenger.
-- SibSp: Indication that passenger have siblings and spouse.
-- Parch: Whether a passenger is alone or have family.
+- SibSp: Indication that passengers have siblings and spouses.
+- Parch: Whether a passenger is alone or has a family.
 - Ticket: Ticket number of passenger.
 - Fare: Indicating the fare.
 - Cabin: The cabin of passenger.
@@ -199,7 +199,7 @@ selection brackets ``[]``. Only rows for which the value is ``True``
 will be selected.

 We know from before that the original Titanic ``DataFrame`` consists of
-891 rows. Let’s have a look at the amount of rows which satisfy the
+891 rows. Let’s have a look at the number of rows which satisfy the
 condition by checking the ``shape`` attribute of the resulting
 ``DataFrame`` ``above_35``:

@@ -398,7 +398,7 @@ See the user guide section on :ref:`different choices for indexing <indexing.cho
 <div class="d-flex flex-row gs-torefguide">
 <span class="badge badge-info">To user guide</span>

-A full overview about indexing is provided in the user guide pages on :ref:`indexing and selecting data <indexing>`.
+A full overview of indexing is provided in the user guide pages on :ref:`indexing and selecting data <indexing>`.

 .. raw:: html


doc/source/getting_started/intro_tutorials/04_plotting.rst

+2 -2
@@ -167,7 +167,7 @@ I want each of the columns in a separate subplot.
 @savefig 04_airqual_area_subplot.png
 axs = air_quality.plot.area(figsize=(12, 4), subplots=True)

-Separate subplots for each of the data columns is supported by the ``subplots`` argument
+Separate subplots for each of the data columns are supported by the ``subplots`` argument
 of the ``plot`` functions. The builtin options available in each of the pandas plot
 functions that are worthwhile to have a look.

@@ -214,7 +214,7 @@ I want to further customize, extend or save the resulting plot.
 </li>
 </ul>

-Each of the plot objects created by pandas are a
+Each of the plot objects created by pandas is a
 `matplotlib <https://matplotlib.org/>`__ object. As Matplotlib provides
 plenty of options to customize plots, making the link between pandas and
 Matplotlib explicit enables all the power of matplotlib to the plot.

doc/source/user_guide/indexing.rst

+18
@@ -933,6 +933,24 @@ and :ref:`Advanced Indexing <advanced>` you may select along more than one axis

 df2.loc[criterion & (df2['b'] == 'x'), 'b':'c']

+.. warning::
+
+``iloc`` supports two kinds of boolean indexing. If the indexer is a boolean ``Series``,
+an error will be raised. For instance, in the following example, ``df.iloc[s.values, 1]`` is ok.
+The boolean indexer is an array. But ``df.iloc[s, 1]`` would raise ``ValueError``.
+
+.. ipython:: python
+
+df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
+index=list('abc'),
+columns=['A', 'B'])
+s = (df['A'] > 2)
+s
+
+df.loc[s, 'B']
+
+df.iloc[s.values, 1]
+
 .. _indexing.basics.indexing_isin:

 Indexing with isin
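
Note on the ``iloc`` warning added above: the new ``ipython`` block shows the two calls that work but not the failing one. The following is a minimal sketch of all three cases using the same ``df`` and ``s`` as the hunk. The ``try``/``except`` wrapper and the variable names ``by_label``, ``by_position`` and ``raised`` are illustrative additions, not part of the commit.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                  index=list("abc"),
                  columns=["A", "B"])
s = df["A"] > 2  # boolean Series labelled 'a', 'b', 'c'

# Label-based selection accepts the boolean Series directly.
by_label = df.loc[s, "B"]

# Position-based selection accepts the underlying boolean array.
by_position = df.iloc[s.values, 1]

# Passing the Series itself to iloc is expected to raise ValueError,
# as the warning text in the hunk states.
try:
    df.iloc[s, 1]
    raised = False
except ValueError:
    raised = True

print(by_label.tolist(), by_position.tolist(), raised)
```

Converting the mask with ``s.values`` drops the index, which is why the positional indexer accepts it.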
