Commit 69d7d72

Merge branch 'main' into implementation-pdep-4
2 parents a7f20a2 + 9ac9b9c commit 69d7d72

13 files changed, +122 -74 lines changed

.github/workflows/ubuntu.yml

-2
@@ -77,7 +77,6 @@ jobs:
           - name: "Numpy Dev"
             env_file: actions-310-numpydev.yaml
             pattern: "not slow and not network and not single_cpu"
-            pandas_testing_mode: "deprecate"
             test_args: "-W error::DeprecationWarning:numpy -W error::FutureWarning:numpy"
         exclude:
           - env_file: actions-39.yaml
@@ -96,7 +95,6 @@ jobs:
       EXTRA_APT: ${{ matrix.extra_apt || '' }}
       LANG: ${{ matrix.lang || '' }}
       LC_ALL: ${{ matrix.lc_all || '' }}
-      PANDAS_TESTING_MODE: ${{ matrix.pandas_testing_mode || '' }}
       PANDAS_DATA_MANAGER: ${{ matrix.pandas_data_manager || 'block' }}
       PANDAS_COPY_ON_WRITE: ${{ matrix.pandas_copy_on_write || '0' }}
       TEST_ARGS: ${{ matrix.test_args || '' }}

README.md

+2-2
@@ -13,7 +13,7 @@
 [![Coverage](https://codecov.io/github/pandas-dev/pandas/coverage.svg?branch=main)](https://codecov.io/gh/pandas-dev/pandas)
 [![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/pandas-dev/pandas/badge)](https://api.securityscorecards.dev/projects/github.com/pandas-dev/pandas)
 [![Downloads](https://static.pepy.tech/personalized-badge/pandas?period=month&units=international_system&left_color=black&right_color=orange&left_text=PyPI%20downloads%20per%20month)](https://pepy.tech/project/pandas)
-[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/pydata/pandas)
+[![Slack](https://img.shields.io/badge/join_Slack-information-brightgreen.svg?logo=slack)](https://pandas.pydata.org/docs/dev/development/community.html?highlight=slack#community-slack)
 [![Powered by NumFOCUS](https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](https://numfocus.org)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
@@ -152,7 +152,7 @@ For usage questions, the best place to go to is [StackOverflow](https://stackove
 Further, general questions and discussions can also take place on the [pydata mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata).
 
 ## Discussion and Development
-Most development discussions take place on GitHub in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.
+Most development discussions take place on GitHub in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Slack channel](https://pandas.pydata.org/docs/dev/development/community.html?highlight=slack#community-slack) is available for quick development related questions.
 
 ## Contributing to pandas [![Open Source Helpers](https://www.codetriage.com/pandas-dev/pandas/badges/users.svg)](https://www.codetriage.com/pandas-dev/pandas)

doc/source/whatsnew/v1.5.1.rst

+1
@@ -88,6 +88,7 @@ Fixed regressions
 - Fixed regression in :meth:`Series.groupby` and :meth:`DataFrame.groupby` when the grouper is a nullable data type (e.g. :class:`Int64`) or a PyArrow-backed string array, contains null values, and ``dropna=False`` (:issue:`48794`)
 - Fixed regression in :meth:`DataFrame.to_parquet` raising when file name was specified as ``bytes`` (:issue:`48944`)
 - Fixed regression in :class:`ExcelWriter` where the ``book`` attribute could no longer be set; however setting this attribute is now deprecated and this ability will be removed in a future version of pandas (:issue:`48780`)
+- Fixed regression in :meth:`DataFrame.corrwith` when computing correlation on tied data with ``method="spearman"`` (:issue:`48826`)
 
 .. ---------------------------------------------------------------------------

doc/source/whatsnew/v2.0.0.rst

+1
@@ -253,6 +253,7 @@ Interval
 Indexing
 ^^^^^^^^
 - Bug in :meth:`DataFrame.reindex` filling with wrong values when indexing columns and index for ``uint`` dtypes (:issue:`48184`)
+- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` when right hand side is :class:`DataFrame` with :class:`MultiIndex` columns (:issue:`49121`)
 - Bug in :meth:`DataFrame.reindex` casting dtype to ``object`` when :class:`DataFrame` has single extension array column when re-indexing ``columns`` and ``index`` (:issue:`48190`)
 - Bug in :func:`~DataFrame.describe` when formatting percentiles in the resulting index showed more decimals than needed (:issue:`46362`)
 - Bug in :meth:`DataFrame.compare` does not recognize differences when comparing ``NA`` with value in nullable dtypes (:issue:`48939`)

pandas/_libs/parsers.pyx

+3
@@ -1414,6 +1414,9 @@ def _maybe_upcast(arr, use_nullable_dtypes: bool = False):
     -------
     The casted array.
     """
+    if is_extension_array_dtype(arr.dtype):
+        return arr
+
     na_value = na_values[arr.dtype]
 
     if issubclass(arr.dtype.type, np.integer):

pandas/_testing/__init__.py

-25
@@ -15,7 +15,6 @@
     Counter,
     Iterable,
 )
-import warnings
 
 import numpy as np
 
@@ -236,28 +235,6 @@
 
 EMPTY_STRING_PATTERN = re.compile("^$")
 
-# set testing_mode
-_testing_mode_warnings = (DeprecationWarning, ResourceWarning)
-
-
-def set_testing_mode() -> None:
-    # set the testing mode filters
-    testing_mode = os.environ.get("PANDAS_TESTING_MODE", "None")
-    if "deprecate" in testing_mode:
-        for category in _testing_mode_warnings:
-            warnings.simplefilter("always", category)
-
-
-def reset_testing_mode() -> None:
-    # reset the testing mode filters
-    testing_mode = os.environ.get("PANDAS_TESTING_MODE", "None")
-    if "deprecate" in testing_mode:
-        for category in _testing_mode_warnings:
-            warnings.simplefilter("ignore", category)
-
-
-set_testing_mode()
-
 
 def reset_display_options() -> None:
     """
@@ -1142,14 +1119,12 @@ def shares_memory(left, right) -> bool:
     "randbool",
     "rands",
     "reset_display_options",
-    "reset_testing_mode",
     "RNGContext",
     "round_trip_localpath",
     "round_trip_pathlib",
     "round_trip_pickle",
     "setitem",
     "set_locale",
-    "set_testing_mode",
     "set_timezone",
     "shares_memory",
     "SIGNED_INT_EA_DTYPES",

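Note on the removed testing-mode helpers: `set_testing_mode` and `reset_testing_mode` only toggled warning filters between `"always"` and `"ignore"` for `DeprecationWarning` and `ResourceWarning`. The following is a minimal standard-library sketch of what those two filter actions do; the `emit` helper and the `catch_warnings` wrappers are illustrative assumptions and are not part of pandas.

```python
import warnings


def emit():
    # Hypothetical helper: raises the same DeprecationWarning on every call.
    warnings.warn("old API", DeprecationWarning)


# "always": every occurrence is reported, even repeats from the same location.
with warnings.catch_warnings(record=True) as seen_always:
    warnings.simplefilter("always", DeprecationWarning)
    emit()
    emit()

# "ignore": matching warnings are dropped silently.
with warnings.catch_warnings(record=True) as seen_ignore:
    warnings.simplefilter("ignore", DeprecationWarning)
    emit()
    emit()

print(len(seen_always), len(seen_ignore))
```

The `catch_warnings` context managers restore the previous filters on exit, so the sketch leaves the surrounding process untouched, unlike the removed helpers, which changed the filters globally.
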
pandas/core/frame.py

+2-34
@@ -4110,7 +4110,7 @@ def _set_item_frame_value(self, key, value: DataFrame) -> None:
         if key in self.columns:
             loc = self.columns.get_loc(key)
             cols = self.columns[loc]
-            len_cols = 1 if is_scalar(cols) else len(cols)
+            len_cols = 1 if is_scalar(cols) or isinstance(cols, tuple) else len(cols)
             if len_cols != len(value.columns):
                 raise ValueError("Columns must be same length as key")
 
@@ -10577,40 +10577,8 @@ def corrwith(
         if numeric_only is lib.no_default and len(this.columns) < len(self.columns):
             com.deprecate_numeric_only_default(type(self), "corrwith")
 
-        # GH46174: when other is a Series object and axis=0, we achieve a speedup over
-        # passing .corr() to .apply() by taking the columns as ndarrays and iterating
-        # over the transposition row-wise. Then we delegate the correlation coefficient
-        # computation and null-masking to np.corrcoef and np.isnan respectively,
-        # which are much faster. We exploit the fact that the Spearman correlation
-        # of two vectors is equal to the Pearson correlation of their ranks to use
-        # substantially the same method for Pearson and Spearman,
-        # just with intermediate argsorts on the latter.
         if isinstance(other, Series):
-            if axis == 0 and method in ["pearson", "spearman"]:
-                corrs = {}
-                if numeric_only:
-                    cols = self.select_dtypes(include=np.number).columns
-                    ndf = self[cols].values.transpose()
-                else:
-                    cols = self.columns
-                    ndf = self.values.transpose()
-                k = other.values
-                if method == "pearson":
-                    for i, r in enumerate(ndf):
-                        nonnull_mask = ~np.isnan(r) & ~np.isnan(k)
-                        corrs[cols[i]] = np.corrcoef(r[nonnull_mask], k[nonnull_mask])[
-                            0, 1
-                        ]
-                else:
-                    for i, r in enumerate(ndf):
-                        nonnull_mask = ~np.isnan(r) & ~np.isnan(k)
-                        corrs[cols[i]] = np.corrcoef(
-                            r[nonnull_mask].argsort().argsort(),
-                            k[nonnull_mask].argsort().argsort(),
-                        )[0, 1]
-                return Series(corrs)
-            else:
-                return this.apply(lambda x: other.corr(x, method=method), axis=axis)
+            return this.apply(lambda x: other.corr(x, method=method), axis=axis)
 
         if numeric_only_bool:
             other = other._get_numeric_data()
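
Note on the removed fast path: the deleted comment relies on Spearman correlation being the Pearson correlation of ranks, and the deleted code derived ranks with a double `argsort`. A double argsort assigns distinct ranks to tied values, whereas Spearman needs tied values to share an averaged rank. The sketch below is a pure-Python illustration of that difference, using columns `B` and `C` from the new test; the helper names are hypothetical and none of this is pandas or NumPy code. It uses a stable sort, so the tie order here may differ from what `ndarray.argsort` would produce.

```python
from math import sqrt


def ordinal_ranks(values):
    """Ranks as a double argsort yields them: ties get distinct ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])  # argsort
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):  # inverse permutation = second argsort
        ranks[idx] = rank
    return ranks


def average_ranks(values):
    """Ranks where tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


c = [10, 4, 9, 3]  # column "C" from the test: no ties
b = [0, 1, 1, 0]   # column "B" from the test (False/True as 0/1): two ties

with_double_argsort = pearson(ordinal_ranks(c), ordinal_ranks(b))  # about -0.2
with_average_ranks = pearson(average_ranks(c), average_ranks(b))   # 0.0
print(with_double_argsort, with_average_ranks)
```

With averaged ranks the result is 0.0, which is the value the new test expects for column `C`; the double-argsort ranks give a nonzero value that depends on how the ties happen to be ordered.
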

pandas/tests/frame/indexing/test_setitem.py

+8
@@ -748,6 +748,14 @@ def test_setitem_frame_overwrite_with_ea_dtype(self, any_numeric_ea_dtype):
         )
         tm.assert_frame_equal(df, expected)
 
+    def test_setitem_frame_midx_columns(self):
+        # GH#49121
+        df = DataFrame({("a", "b"): [10]})
+        expected = df.copy()
+        col_name = ("a", "b")
+        df[col_name] = df[[col_name]]
+        tm.assert_frame_equal(df, expected)
+
 
 class TestSetitemTZAwareValues:
     @pytest.fixture
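
Note on the `_set_item_frame_value` change exercised by this test: a single `MultiIndex` column label is a tuple, so taking `len()` of it counts index levels rather than columns. The sketch below mirrors the before and after expressions for `len_cols` in plain Python. `looks_scalar` is a hypothetical, simplified stand-in for pandas' `is_scalar`, assumed here only so the example runs without pandas.

```python
def looks_scalar(obj):
    """Hypothetical, simplified stand-in for pandas' is_scalar."""
    return obj is None or isinstance(obj, (str, bytes, bool, int, float, complex))


def len_cols_before(cols):
    # Expression before the change: a tuple label falls through to len().
    return 1 if looks_scalar(cols) else len(cols)


def len_cols_after(cols):
    # Expression after the change: a tuple label counts as one column.
    return 1 if looks_scalar(cols) or isinstance(cols, tuple) else len(cols)


label = ("a", "b")            # one column label with two levels
value_columns = [("a", "b")]  # the right-hand side frame has one column

before = len_cols_before(label)  # 2, which mismatches len(value_columns)
after = len_cols_after(label)    # 1, which matches len(value_columns)
print(before, after, len(value_columns))
```

Before the change the mismatch between 2 and 1 is what triggered `ValueError("Columns must be same length as key")` in the scenario covered by the test.
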

pandas/tests/frame/methods/test_cov_corr.py

+27-1
@@ -355,7 +355,10 @@ def test_corrwith_mixed_dtypes(self, numeric_only):
             expected = Series(data=corrs, index=["a", "b"])
             tm.assert_series_equal(result, expected)
         else:
-            with pytest.raises(TypeError, match="not supported for the input types"):
+            with pytest.raises(
+                TypeError,
+                match=r"unsupported operand type\(s\) for /: 'str' and 'int'",
+            ):
                 df.corrwith(s, numeric_only=numeric_only)
 
     def test_corrwith_index_intersection(self):
@@ -406,3 +409,26 @@ def test_corrwith_kendall(self):
         result = df.corrwith(df**2, method="kendall")
         expected = Series(np.ones(len(result)))
         tm.assert_series_equal(result, expected)
+
+    @td.skip_if_no_scipy
+    def test_corrwith_spearman_with_tied_data(self):
+        # GH#48826
+        df1 = DataFrame(
+            {
+                "A": [1, np.nan, 7, 8],
+                "B": [False, True, True, False],
+                "C": [10, 4, 9, 3],
+            }
+        )
+        df2 = df1[["B", "C"]]
+        result = (df1 + 1).corrwith(df2.B, method="spearman")
+        expected = Series([0.0, 1.0, 0.0], index=["A", "B", "C"])
+        tm.assert_series_equal(result, expected)
+
+        df_bool = DataFrame(
+            {"A": [True, True, False, False], "B": [True, False, False, True]}
+        )
+        ser_bool = Series([True, True, False, True])
+        result = df_bool.corrwith(ser_bool)
+        expected = Series([0.57735, 0.57735], index=["A", "B"])
+        tm.assert_series_equal(result, expected)

pandas/tests/io/parser/dtypes/test_dtypes_basic.py

+11
@@ -466,3 +466,14 @@ def test_use_nullabla_dtypes_string(all_parsers, storage):
         }
     )
     tm.assert_frame_equal(result, expected)
+
+
+def test_use_nullable_dtypes_ea_dtype_specified(all_parsers):
+    # GH#491496
+    data = """a,b
+1,2
+"""
+    parser = all_parsers
+    result = parser.read_csv(StringIO(data), dtype="Int64", use_nullable_dtypes=True)
+    expected = DataFrame({"a": [1], "b": 2}, dtype="Int64")
+    tm.assert_frame_equal(result, expected)

pandas/tests/io/pytables/conftest.py

-10
@@ -2,18 +2,8 @@
 
 import pytest
 
-import pandas._testing as tm
-
 
 @pytest.fixture
 def setup_path():
     """Fixture for setup path"""
     return f"tmp.__{uuid.uuid4()}__.h5"
-
-
-@pytest.fixture(scope="module", autouse=True)
-def setup_mode():
-    """Reset testing mode fixture"""
-    tm.reset_testing_mode()
-    yield
-    tm.set_testing_mode()

scripts/generate_pxi.py

+33
@@ -0,0 +1,33 @@
+import argparse
+import os
+
+from Cython import Tempita
+
+
+def process_tempita(pxifile, outfile):
+    with open(pxifile) as f:
+        tmpl = f.read()
+    pyxcontent = Tempita.sub(tmpl)
+
+    with open(outfile, "w") as f:
+        f.write(pyxcontent)
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("infile", type=str, help="Path to the input file")
+    parser.add_argument("-o", "--outdir", type=str, help="Path to the output directory")
+    args = parser.parse_args()
+
+    if not args.infile.endswith(".in"):
+        raise ValueError(f"Unexpected extension: {args.infile}")
+
+    outdir_abs = os.path.join(os.getcwd(), args.outdir)
+    outfile = os.path.join(
+        outdir_abs, os.path.splitext(os.path.split(args.infile)[1])[0]
+    )
+
+    process_tempita(args.infile, outfile)
+
+
+main()
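
Note on how `main()` derives the output path: it takes the base name of the input file and strips only the trailing `.in` extension, then places the result under the output directory. The sketch below isolates that path logic so it can be run without Cython. The function name `derive_outfile`, the explicit `cwd` parameter (standing in for `os.getcwd()`), and the example paths are assumptions for illustration.

```python
import os


def derive_outfile(infile, outdir, cwd):
    """Mirror the path handling in main(); cwd stands in for os.getcwd()."""
    if not infile.endswith(".in"):
        raise ValueError(f"Unexpected extension: {infile}")
    outdir_abs = os.path.join(cwd, outdir)
    # os.path.split(...)[1] keeps the file name; splitext drops only ".in".
    return os.path.join(outdir_abs, os.path.splitext(os.path.split(infile)[1])[0])


# Illustrative paths only.
result = derive_outfile("pandas/_libs/algos_common_helper.pxi.in", "build", "/repo")
print(os.path.basename(result))  # algos_common_helper.pxi
```

Because `os.path.join` discards earlier components when a later one is absolute, passing an absolute `--outdir` would bypass the current working directory in the original script as well.
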

scripts/generate_version.py

+34
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
import argparse
2+
import os
3+
4+
import versioneer
5+
6+
7+
def write_version_info(path):
8+
if os.environ.get("MESON_DIST_ROOT"):
9+
# raise ValueError("dist root is", os.environ.get("MESON_DIST_ROOT"))
10+
path = os.path.join(os.environ.get("MESON_DIST_ROOT"), path)
11+
with open(path, "w") as file:
12+
file.write(f'__version__="{versioneer.get_version()}"\n')
13+
file.write(
14+
f'__git_version__="{versioneer.get_versions()["full-revisionid"]}"\n'
15+
)
16+
17+
18+
def main():
19+
parser = argparse.ArgumentParser()
20+
parser.add_argument(
21+
"-o", "--outfile", type=str, help="Path to write version info to"
22+
)
23+
args = parser.parse_args()
24+
25+
if not args.outfile.endswith(".py"):
26+
raise ValueError(
27+
f"Output file must be a Python file. "
28+
f"Got: {args.outfile} as filename instead"
29+
)
30+
31+
write_version_info(args.outfile)
32+
33+
34+
main()
