Commit 33e5b7e

Merge branch 'main' into issue-50395
2 parents 89ae49e + 11d856f commit 33e5b7e

259 files changed: +3838 -3243 lines changed

.github/actions/setup-conda/action.yml

+1-1
@@ -18,7 +18,7 @@ runs:
  - name: Set Arrow version in ${{ inputs.environment-file }} to ${{ inputs.pyarrow-version }}
  run: |
  grep -q ' - pyarrow' ${{ inputs.environment-file }}
- sed -i"" -e "s/ - pyarrow/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
+ sed -i"" -e "s/ - pyarrow<11/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
  cat ${{ inputs.environment-file }}
  shell: bash
  if: ${{ inputs.pyarrow-version }}
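
The `sed` pattern above changes because the environment files in this commit now list `pyarrow<11` instead of `pyarrow`. A minimal sketch of why the old pattern would no longer yield a valid pin, assuming a hypothetical environment snippet and a hypothetical version value of `9`; `pin_pyarrow` is an illustrative helper, not part of the action:

```python
def pin_pyarrow(env_text: str, version: str, pattern: str = " - pyarrow<11") -> str:
    """Mimic sed's s/PATTERN/ - pyarrow=VERSION/: replace the first match on each line."""
    pinned = []
    for line in env_text.splitlines():
        # str.replace treats the pattern literally; count=1 matches sed without the g flag.
        pinned.append(line.replace(pattern, f" - pyarrow={version}", 1))
    return "\n".join(pinned)


# Hypothetical excerpt of an environment file after this commit.
env = "dependencies:\n  - psycopg2\n  - pyarrow<11\n  - pyreadstat"

new = pin_pyarrow(env, "9")                        # updated pattern
old = pin_pyarrow(env, "9", pattern=" - pyarrow")  # pattern before this commit

print(new)
print(old)
```

With the updated pattern the line becomes `  - pyarrow=9`; with the old pattern the `<11` suffix would be left behind, giving `  - pyarrow=9<11`.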

.pre-commit-config.yaml

+3-3
@@ -92,7 +92,7 @@ repos:
  args: [--disable=all, --enable=redefined-outer-name]
  stages: [manual]
  - repo: https://github.com/PyCQA/isort
- rev: 5.11.4
+ rev: 5.12.0
  hooks:
  - id: isort
  - repo: https://github.com/asottile/pyupgrade
@@ -135,11 +135,11 @@ repos:
  types: [python]
  stages: [manual]
  additional_dependencies: &pyright_dependencies
-
+
  - id: pyright_reportGeneralTypeIssues
  # note: assumes python env is setup and activated
  name: pyright reportGeneralTypeIssues
- entry: pyright --skipunannotated -p pyright_reportGeneralTypeIssues.json
+ entry: pyright --skipunannotated -p pyright_reportGeneralTypeIssues.json --level warning
  language: node
  pass_filenames: false
  types: [python]

LICENSE

+1-1
@@ -3,7 +3,7 @@ BSD 3-Clause License
  Copyright (c) 2008-2011, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
  All rights reserved.

- Copyright (c) 2011-2022, Open source contributors.
+ Copyright (c) 2011-2023, Open source contributors.

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are met:

ci/code_checks.sh

+60-3
@@ -83,7 +83,7 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
  $BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=EX04,GL01,GL02,GL03,GL04,GL05,GL06,GL07,GL09,GL10,PR03,PR04,PR05,PR06,PR08,PR09,PR10,RT01,RT02,RT04,RT05,SA02,SA03,SA04,SS01,SS02,SS03,SS04,SS05,SS06
  RET=$(($RET + $?)) ; echo $MSG "DONE"

- MSG='Partially validate docstrings (EX01)' ; echo $MSG
+ MSG='Partially validate docstrings (EX01)' ; echo $MSG
  $BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=EX01 --ignore_functions \
  pandas.Series.index \
  pandas.Series.dtype \
@@ -187,7 +187,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
  pandas.show_versions \
  pandas.test \
  pandas.NaT \
- pandas.Timestamp.unit \
  pandas.Timestamp.as_unit \
  pandas.Timestamp.ctime \
  pandas.Timestamp.date \
@@ -574,7 +573,65 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
  pandas.DataFrame.sparse.to_coo \
  pandas.DataFrame.to_gbq \
  pandas.DataFrame.style \
- pandas.DataFrame.__dataframe__ \
+ pandas.DataFrame.__dataframe__
+ RET=$(($RET + $?)) ; echo $MSG "DONE"
+
+ MSG='Partially validate docstrings (EX02)' ; echo $MSG
+ $BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=EX02 --ignore_functions \
+ pandas.DataFrame.plot.line \
+ pandas.Index.factorize \
+ pandas.Period.strftime \
+ pandas.Series.factorize \
+ pandas.Series.floordiv \
+ pandas.Series.plot.line \
+ pandas.Series.rfloordiv \
+ pandas.Series.sparse.density \
+ pandas.Series.sparse.npoints \
+ pandas.Series.sparse.sp_values \
+ pandas.Timestamp.fromtimestamp \
+ pandas.api.types.infer_dtype \
+ pandas.api.types.is_any_real_numeric_dtype \
+ pandas.api.types.is_bool_dtype \
+ pandas.api.types.is_categorical_dtype \
+ pandas.api.types.is_complex_dtype \
+ pandas.api.types.is_datetime64_any_dtype \
+ pandas.api.types.is_datetime64_dtype \
+ pandas.api.types.is_datetime64_ns_dtype \
+ pandas.api.types.is_datetime64tz_dtype \
+ pandas.api.types.is_float_dtype \
+ pandas.api.types.is_int64_dtype \
+ pandas.api.types.is_integer_dtype \
+ pandas.api.types.is_interval_dtype \
+ pandas.api.types.is_iterator \
+ pandas.api.types.is_list_like \
+ pandas.api.types.is_named_tuple \
+ pandas.api.types.is_numeric_dtype \
+ pandas.api.types.is_object_dtype \
+ pandas.api.types.is_period_dtype \
+ pandas.api.types.is_re \
+ pandas.api.types.is_re_compilable \
+ pandas.api.types.is_signed_integer_dtype \
+ pandas.api.types.is_sparse \
+ pandas.api.types.is_string_dtype \
+ pandas.api.types.is_timedelta64_dtype \
+ pandas.api.types.is_timedelta64_ns_dtype \
+ pandas.api.types.is_unsigned_integer_dtype \
+ pandas.core.groupby.DataFrameGroupBy.take \
+ pandas.core.groupby.SeriesGroupBy.take \
+ pandas.factorize \
+ pandas.io.formats.style.Styler.concat \
+ pandas.io.formats.style.Styler.export \
+ pandas.io.formats.style.Styler.set_td_classes \
+ pandas.io.formats.style.Styler.use \
+ pandas.io.json.build_table_schema \
+ pandas.merge_ordered \
+ pandas.option_context \
+ pandas.plotting.andrews_curves \
+ pandas.plotting.autocorrelation_plot \
+ pandas.plotting.lag_plot \
+ pandas.plotting.parallel_coordinates \
+ pandas.plotting.radviz \
+ pandas.tseries.frequencies.to_offset
  RET=$(($RET + $?)) ; echo $MSG "DONE"

  fi
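
The added EX02 block follows the same pattern as the EX01 block: print a message, run the validator, then add its exit status to `RET`, so that one failing check does not stop the later ones. A rough Python sketch of that accumulation pattern, assuming placeholder commands in place of the real `validate_docstrings.py` invocations; `run_checks` is a hypothetical helper and how the script finally uses `RET` is not shown in this diff:

```python
import subprocess
import sys


def run_checks(commands):
    """Run every command and sum the exit codes, like RET=$(($RET + $?))."""
    ret = 0
    for msg, argv in commands:
        print(msg)
        # A non-zero exit code is added to the total but does not abort the loop.
        ret += subprocess.run(argv).returncode
        print(msg, "DONE")
    return ret


# Hypothetical stand-ins for the validator calls: one passes, one fails.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

total = run_checks([
    ("Partially validate docstrings (EX01)", passing),
    ("Partially validate docstrings (EX02)", failing),
])
print("accumulated exit status:", total)
```

A non-zero total signals that at least one check failed, while every check still gets a chance to run and report.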

ci/deps/actions-310.yaml

+1-1
@@ -42,7 +42,7 @@ dependencies:
  - psycopg2
  - pymysql
  - pytables
- - pyarrow
+ - pyarrow<11
  - pyreadstat
  - python-snappy
  - pyxlsb

ci/deps/actions-311.yaml

+1-1
@@ -42,7 +42,7 @@ dependencies:
  - psycopg2
  - pymysql
  # - pytables>=3.8.0 # first version that supports 3.11
- - pyarrow
+ - pyarrow<11
  - pyreadstat
  - python-snappy
  - pyxlsb

ci/deps/actions-38-downstream_compat.yaml

+1-1
@@ -40,7 +40,7 @@ dependencies:
  - openpyxl
  - odfpy
  - psycopg2
- - pyarrow
+ - pyarrow<11
  - pymysql
  - pyreadstat
  - pytables

ci/deps/actions-38-minimum_versions.yaml

+1-1
@@ -43,7 +43,7 @@ dependencies:
  - openpyxl=3.0.7
  - pandas-gbq=0.15.0
  - psycopg2=2.8.6
- - pyarrow=6.0.0
+ - pyarrow=7.0.0
  - pymysql=1.0.2
  - pyreadstat=1.1.2
  - pytables=3.6.1

ci/deps/actions-38.yaml

+1-1
@@ -40,7 +40,7 @@ dependencies:
  - odfpy
  - pandas-gbq
  - psycopg2
- - pyarrow
+ - pyarrow<11
  - pymysql
  - pyreadstat
  - pytables

ci/deps/actions-39.yaml

+1-1
@@ -41,7 +41,7 @@ dependencies:
  - pandas-gbq
  - psycopg2
  - pymysql
- - pyarrow
+ - pyarrow<11
  - pyreadstat
  - pytables
  - python-snappy

ci/deps/circle-38-arm64.yaml

+1-1
@@ -40,7 +40,7 @@ dependencies:
  - odfpy
  - pandas-gbq
  - psycopg2
- - pyarrow
+ - pyarrow<11
  - pymysql
  # Not provided on ARM
  #- pyreadstat

doc/source/development/contributing_codebase.rst

+1-1
@@ -25,7 +25,7 @@ contributing them to the project::

  The script validates the doctests, formatting in docstrings, and
  imported modules. It is possible to run the checks independently by using the
- parameters ``docstring``, ``code``, and ``doctests``
+ parameters ``docstrings``, ``code``, and ``doctests``
  (e.g. ``./ci/code_checks.sh doctests``).

  In addition, because a lot of people use our library, it is important that we

doc/source/development/internals.rst

+23-26
@@ -15,24 +15,21 @@ Indexing
  In pandas there are a few objects implemented which can serve as valid
  containers for the axis labels:

- * ``Index``: the generic "ordered set" object, an ndarray of object dtype
+ * :class:`Index`: the generic "ordered set" object, an ndarray of object dtype
  assuming nothing about its contents. The labels must be hashable (and
  likely immutable) and unique. Populates a dict of label to location in
  Cython to do ``O(1)`` lookups.
- * ``Int64Index``: a version of ``Index`` highly optimized for 64-bit integer
- data, such as time stamps
- * ``Float64Index``: a version of ``Index`` highly optimized for 64-bit float data
- * ``MultiIndex``: the standard hierarchical index object
- * ``DatetimeIndex``: An Index object with ``Timestamp`` boxed elements (impl are the int64 values)
- * ``TimedeltaIndex``: An Index object with ``Timedelta`` boxed elements (impl are the in64 values)
- * ``PeriodIndex``: An Index object with Period elements
+ * :class:`MultiIndex`: the standard hierarchical index object
+ * :class:`DatetimeIndex`: An Index object with :class:`Timestamp` boxed elements (impl are the int64 values)
+ * :class:`TimedeltaIndex`: An Index object with :class:`Timedelta` boxed elements (impl are the in64 values)
+ * :class:`PeriodIndex`: An Index object with Period elements

  There are functions that make the creation of a regular index easy:

- * ``date_range``: fixed frequency date range generated from a time rule or
+ * :func:`date_range`: fixed frequency date range generated from a time rule or
  DateOffset. An ndarray of Python datetime objects
- * ``period_range``: fixed frequency date range generated from a time rule or
- DateOffset. An ndarray of ``Period`` objects, representing timespans
+ * :func:`period_range`: fixed frequency date range generated from a time rule or
+ DateOffset. An ndarray of :class:`Period` objects, representing timespans

  The motivation for having an ``Index`` class in the first place was to enable
  different implementations of indexing. This means that it's possible for you,
@@ -43,28 +40,28 @@ From an internal implementation point of view, the relevant methods that an
  ``Index`` must define are one or more of the following (depending on how
  incompatible the new object internals are with the ``Index`` functions):

- * ``get_loc``: returns an "indexer" (an integer, or in some cases a
+ * :meth:`~Index.get_loc`: returns an "indexer" (an integer, or in some cases a
  slice object) for a label
- * ``slice_locs``: returns the "range" to slice between two labels
- * ``get_indexer``: Computes the indexing vector for reindexing / data
+ * :meth:`~Index.slice_locs`: returns the "range" to slice between two labels
+ * :meth:`~Index.get_indexer`: Computes the indexing vector for reindexing / data
  alignment purposes. See the source / docstrings for more on this
- * ``get_indexer_non_unique``: Computes the indexing vector for reindexing / data
+ * :meth:`~Index.get_indexer_non_unique`: Computes the indexing vector for reindexing / data
  alignment purposes when the index is non-unique. See the source / docstrings
  for more on this
- * ``reindex``: Does any pre-conversion of the input index then calls
+ * :meth:`~Index.reindex`: Does any pre-conversion of the input index then calls
  ``get_indexer``
- * ``union``, ``intersection``: computes the union or intersection of two
+ * :meth:`~Index.union`, :meth:`~Index.intersection`: computes the union or intersection of two
  Index objects
- * ``insert``: Inserts a new label into an Index, yielding a new object
- * ``delete``: Delete a label, yielding a new object
- * ``drop``: Deletes a set of labels
- * ``take``: Analogous to ndarray.take
+ * :meth:`~Index.insert`: Inserts a new label into an Index, yielding a new object
+ * :meth:`~Index.delete`: Delete a label, yielding a new object
+ * :meth:`~Index.drop`: Deletes a set of labels
+ * :meth:`~Index.take`: Analogous to ndarray.take

  MultiIndex
  ~~~~~~~~~~

- Internally, the ``MultiIndex`` consists of a few things: the **levels**, the
- integer **codes** (until version 0.24 named *labels*), and the level **names**:
+ Internally, the :class:`MultiIndex` consists of a few things: the **levels**, the
+ integer **codes**, and the level **names**:

  .. ipython:: python

@@ -80,13 +77,13 @@ You can probably guess that the codes determine which unique element is
  identified with that location at each layer of the index. It's important to
  note that sortedness is determined **solely** from the integer codes and does
  not check (or care) whether the levels themselves are sorted. Fortunately, the
- constructors ``from_tuples`` and ``from_arrays`` ensure that this is true, but
- if you compute the levels and codes yourself, please be careful.
+ constructors :meth:`~MultiIndex.from_tuples` and :meth:`~MultiIndex.from_arrays` ensure
+ that this is true, but if you compute the levels and codes yourself, please be careful.

  Values
  ~~~~~~

- pandas extends NumPy's type system with custom types, like ``Categorical`` or
+ pandas extends NumPy's type system with custom types, like :class:`Categorical` or
  datetimes with a timezone, so we have multiple notions of "values". For 1-D
  containers (``Index`` classes and ``Series``) we have the following convention:

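The documentation text in this hunk describes two ideas: an ``Index`` that populates a dict of label to location for ``O(1)`` lookups, and a ``MultiIndex`` stored as levels plus integer codes. The following is a toy pure-Python model of both, not pandas' Cython implementation; `ToyIndex` and `decode` are hypothetical names introduced only for illustration, and the sample labels are made up:

```python
from typing import Hashable, List, Sequence, Tuple


class ToyIndex:
    """Toy stand-in for a unique Index: labels plus a label -> location dict."""

    def __init__(self, labels: Sequence[Hashable]) -> None:
        self.labels = list(labels)
        # Build the mapping once so that each lookup is a single dict access.
        self._loc = {label: i for i, label in enumerate(self.labels)}
        if len(self._loc) != len(self.labels):
            raise ValueError("labels must be unique")

    def get_loc(self, label: Hashable) -> int:
        """Return the integer position of a label (KeyError if absent)."""
        return self._loc[label]

    def get_indexer(self, target: Sequence[Hashable]) -> List[int]:
        """Positions of the target labels, with -1 marking labels not found."""
        return [self._loc.get(label, -1) for label in target]


def decode(levels: Sequence[Sequence[Hashable]],
           codes: Sequence[Sequence[int]]) -> List[Tuple[Hashable, ...]]:
    """Rebuild MultiIndex-style tuples from per-level values and integer codes."""
    return [
        tuple(level[code] for level, code in zip(levels, row))
        for row in zip(*codes)
    ]


idx = ToyIndex(["a", "b", "c"])
positions = idx.get_indexer(["c", "x", "a"])

levels = [["bar", "foo"], ["one", "two"]]
codes = [[0, 0, 1, 1], [0, 1, 0, 1]]
tuples = decode(levels, codes)

print(positions)
print(tuples)
```

In this model each code is simply an offset into its level, which is why the ordering of the resulting tuples depends on the codes and on how the levels happen to be arranged.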
doc/source/getting_started/comparison/includes/copies.rst

-10
@@ -11,13 +11,3 @@ or overwrite the original one:
  .. code-block:: python

  df = df.sort_values("col1")
-
- .. note::
-
- You will see an ``inplace=True`` keyword argument available for some methods:
-
- .. code-block:: python
-
- df.sort_values("col1", inplace=True)
-
- Its use is discouraged. :ref:`More information. <indexing.view_versus_copy>`

doc/source/getting_started/install.rst

+1-1
@@ -441,7 +441,7 @@ PyTables 3.6.1 hdf5 HDF5-based reading
  blosc 1.21.0 hdf5 Compression for HDF5; only available on ``conda``
  zlib hdf5 Compression for HDF5
  fastparquet 0.6.3 - Parquet reading / writing (pyarrow is default)
- pyarrow 6.0.0 parquet, feather Parquet, ORC, and feather reading / writing
+ pyarrow 7.0.0 parquet, feather Parquet, ORC, and feather reading / writing
  pyreadstat 1.1.2 spss SPSS files (.sav) reading
  odfpy 1.4.1 excel Open document format (.odf, .ods, .odt) reading / writing
  ========================= ================== ================ =============================================================

doc/source/reference/arrays.rst

+1
@@ -653,6 +653,7 @@ Data type introspection
  .. autosummary::
  :toctree: api/

+ api.types.is_any_real_numeric_dtype
  api.types.is_bool_dtype
  api.types.is_categorical_dtype
  api.types.is_complex_dtype

doc/source/reference/frame.rst

+1
@@ -83,6 +83,7 @@ Binary operator functions
  .. autosummary::
  :toctree: api/

+ DataFrame.__add__
  DataFrame.add
  DataFrame.sub
  DataFrame.mul
