Commit 15a9a57

Merge branch 'pandas-dev:main' into issue2

2 parents 580773d + c711be0


48 files changed (+660, -91 lines)

.github/workflows/32-bit-linux.yml (+3)

@@ -12,6 +12,9 @@ on:
     paths-ignore:
       - "doc/**"
 
+permissions:
+  contents: read
+
 jobs:
   pytest:
     runs-on: ubuntu-latest

.github/workflows/assign.yml (+6)

@@ -3,8 +3,14 @@ on:
   issue_comment:
     types: created
 
+permissions:
+  contents: read
+
 jobs:
   issue_assign:
+    permissions:
+      issues: write
+      pull-requests: write
     runs-on: ubuntu-latest
     steps:
       - if: github.event.comment.body == 'take'

.github/workflows/asv-bot.yml (+7)

@@ -9,8 +9,15 @@ env:
   ENV_FILE: environment.yml
   COMMENT: ${{github.event.comment.body}}
 
+permissions:
+  contents: read
+
 jobs:
   autotune:
+    permissions:
+      contents: read
+      issues: write
+      pull-requests: write
     name: "Run benchmarks"
     # TODO: Support more benchmarking options later, against different branches, against self, etc
     if: startsWith(github.event.comment.body, '@github-actions benchmark')

.github/workflows/autoupdate-pre-commit-config.yml (+6)

@@ -5,8 +5,14 @@ on:
     - cron: "0 7 1 * *" # At 07:00 on 1st of every month.
   workflow_dispatch:
 
+permissions:
+  contents: read
+
 jobs:
   update-pre-commit:
+    permissions:
+      contents: write # for technote-space/create-pr-action to push code
+      pull-requests: write # for technote-space/create-pr-action to create a PR
     if: github.repository_owner == 'pandas-dev'
     name: Autoupdate pre-commit config
     runs-on: ubuntu-latest

.github/workflows/code-checks.yml (+3)

@@ -14,6 +14,9 @@ env:
   ENV_FILE: environment.yml
   PANDAS_CI: 1
 
+permissions:
+  contents: read
+
 jobs:
   pre_commit:
     name: pre-commit

.github/workflows/docbuild-and-upload.yml (+3)

@@ -14,6 +14,9 @@ env:
   ENV_FILE: environment.yml
   PANDAS_CI: 1
 
+permissions:
+  contents: read
+
 jobs:
   web_and_docs:
     name: Doc Build and Upload

.github/workflows/macos-windows.yml (+3)

@@ -18,6 +18,9 @@ env:
   PATTERN: "not slow and not db and not network and not single_cpu"
 
 
+permissions:
+  contents: read
+
 jobs:
   pytest:
     defaults:

.github/workflows/python-dev.yml (+3)

@@ -27,6 +27,9 @@ env:
   COVERAGE: true
   PYTEST_TARGET: pandas
 
+permissions:
+  contents: read
+
 jobs:
   build:
     if: false # Comment this line out to "unfreeze"

.github/workflows/sdist.yml (+3)

@@ -13,6 +13,9 @@ on:
     paths-ignore:
       - "doc/**"
 
+permissions:
+  contents: read
+
 jobs:
   build:
     if: ${{ github.event.label.name == 'Build' || contains(github.event.pull_request.labels.*.name, 'Build') || github.event_name == 'push'}}

.github/workflows/stale-pr.yml (+5)

@@ -4,8 +4,13 @@ on:
     # * is a special character in YAML so you have to quote this string
     - cron: "0 0 * * *"
 
+permissions:
+  contents: read
+
 jobs:
   stale:
+    permissions:
+      pull-requests: write
     runs-on: ubuntu-latest
     steps:
       - uses: actions/stale@v4

.github/workflows/ubuntu.yml (+3)

@@ -15,6 +15,9 @@ on:
 env:
   PANDAS_CI: 1
 
+permissions:
+  contents: read
+
 jobs:
   pytest:
     runs-on: ubuntu-latest

.pre-commit-config.yaml (+1, -1)

@@ -93,7 +93,7 @@ repos:
         types: [python]
         stages: [manual]
         additional_dependencies: &pyright_dependencies
-
+
 - repo: local
   hooks:
   - id: pyright_reportGeneralTypeIssues

(The single changed line under &pyright_dependencies is not shown in this view.)

README.md (+1, -1)

@@ -169,4 +169,4 @@ Or maybe through using pandas you have an idea of your own or are looking for so
 
 Feel free to ask questions on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas).
 
-As contributors and maintainers to this project, you are expected to abide by pandas' code of conduct. More information can be found at: [Contributor Code of Conduct](https://github.com/pandas-dev/pandas/blob/main/.github/CODE_OF_CONDUCT.md)
+As contributors and maintainers to this project, you are expected to abide by pandas' code of conduct. More information can be found at: [Contributor Code of Conduct](https://github.com/pandas-dev/.github/blob/master/CODE_OF_CONDUCT.md)

doc/source/reference/testing.rst (+4)

@@ -26,11 +26,14 @@ Exceptions and warnings
 
    errors.AbstractMethodError
    errors.AccessorRegistrationWarning
+   errors.AttributeConflictWarning
+   errors.ClosedFileError
    errors.CSSWarning
    errors.DataError
    errors.DtypeWarning
    errors.DuplicateLabelError
    errors.EmptyDataError
+   errors.IncompatibilityWarning
    errors.IndexingError
    errors.InvalidIndexError
    errors.IntCastingNaNError

@@ -44,6 +47,7 @@ Exceptions and warnings
    errors.ParserError
    errors.ParserWarning
    errors.PerformanceWarning
+   errors.PossibleDataLossError
    errors.PyperclipException
    errors.PyperclipWindowsException
    errors.SettingWithCopyError
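The four classes added to this autosummary table (AttributeConflictWarning, ClosedFileError, IncompatibilityWarning, PossibleDataLossError) are exposed under pandas.errors. A quick sanity check, assuming pandas 1.5 or later, might look like:

import pandas as pd

for name in ("AttributeConflictWarning", "ClosedFileError",
             "IncompatibilityWarning", "PossibleDataLossError"):
    cls = getattr(pd.errors, name)
    # Each newly documented name should resolve to an exception or warning class.
    print(name, "->", issubclass(cls, Exception))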

doc/source/whatsnew/v1.5.0.rst (+2)

@@ -1001,6 +1001,8 @@ Groupby/resample/rolling
 - Bug in :meth:`.DataFrameGroupBy.describe` and :meth:`.SeriesGroupBy.describe` produces inconsistent results for empty datasets (:issue:`41575`)
 - Bug in :meth:`DataFrame.resample` reduction methods when used with ``on`` would attempt to aggregate the provided column (:issue:`47079`)
 - Bug in :meth:`DataFrame.groupby` and :meth:`Series.groupby` would not respect ``dropna=False`` when the input DataFrame/Series had a NaN values in a :class:`MultiIndex` (:issue:`46783`)
+- Bug in :meth:`DataFrameGroupBy.resample` raises ``KeyError`` when getting the result from a key list which misses the resample key (:issue:`47362`)
+-
 
 Reshaping
 ^^^^^^^^^
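The new Groupby/resample/rolling entry (issue 47362) concerns selecting columns from a groupby-resample when the key list omits the resample (``on``) key. A hypothetical reproduction of that pattern, not taken from this commit, could be:

import pandas as pd

df = pd.DataFrame(
    {
        "group": ["a", "a", "b"],
        "ts": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-01"]),
        "val": [1, 2, 3],
    }
)

# Selecting only "val" (the list omits the resample key "ts") is the case the
# whatsnew entry says used to raise KeyError.
out = df.groupby("group").resample("D", on="ts")[["val"]].sum()
print(out)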

pandas/_libs/tslib.pyx (+1, -1)

@@ -152,7 +152,7 @@ def format_array_from_datetime(
     # a format based on precision
     basic_format = format is None
     if basic_format:
-        reso_obj = get_resolution(values, reso=reso)
+        reso_obj = get_resolution(values, tz=tz, reso=reso)
         show_ns = reso_obj == Resolution.RESO_NS
         show_us = reso_obj == Resolution.RESO_US
         show_ms = reso_obj == Resolution.RESO_MS
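For context, format_array_from_datetime picks a display precision from the finest resolution actually present in the data; this change makes the inference pass the timezone through as well. The resolution-based formatting itself is visible through the public API (an illustration of the precision selection only, not of the tz handling added here):

import pandas as pd

s_whole = pd.Series(pd.to_datetime(["2022-01-01 00:00:00", "2022-01-01 00:00:01"]))
s_fractional = pd.Series(pd.to_datetime(["2022-01-01 00:00:00.250", "2022-01-01 00:00:01"]))

print(s_whole)       # rendered without fractional seconds
print(s_fractional)  # rendered with millisecond precision for every element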

pandas/_libs/tslibs/conversion.pyx (+4)

@@ -144,9 +144,13 @@ cpdef inline (int64_t, int) precision_from_unit(str unit):
         NPY_DATETIMEUNIT reso = abbrev_to_npy_unit(unit)
 
     if reso == NPY_DATETIMEUNIT.NPY_FR_Y:
+        # each 400 years we have 97 leap years, for an average of 97/400=.2425
+        # extra days each year. We get 31556952 by writing
+        # 3600*24*365.2425=31556952
         m = 1_000_000_000 * 31556952
         p = 9
     elif reso == NPY_DATETIMEUNIT.NPY_FR_M:
+        # 2629746 comes from dividing the "Y" case by 12.
         m = 1_000_000_000 * 2629746
         p = 9
     elif reso == NPY_DATETIMEUNIT.NPY_FR_W:
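The constants in this hunk follow from average Gregorian-calendar lengths, as the new comments explain; a quick arithmetic check in plain Python (not part of the commit) reproduces them:

# A 400-year Gregorian cycle contains 97 leap days -> 365.2425 days per year on average.
seconds_per_year = 3600 * 24 * 365.2425    # 31556952.0
seconds_per_month = seconds_per_year / 12  # 2629746.0

assert int(seconds_per_year) == 31_556_952
assert int(seconds_per_month) == 2_629_746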

pandas/_libs/tslibs/vectorized.pyi (+1, -1)

@@ -42,5 +42,5 @@ def ints_to_pydatetime(
 def tz_convert_from_utc(
     stamps: npt.NDArray[np.int64],
     tz: tzinfo | None,
-    reso: int = ...,
+    reso: int = ...,  # NPY_DATETIMEUNIT
 ) -> npt.NDArray[np.int64]: ...

pandas/core/arrays/datetimelike.py (+10, -3)

@@ -73,7 +73,6 @@
 from pandas.util._exceptions import find_stack_level
 
 from pandas.core.dtypes.common import (
-    DT64NS_DTYPE,
     is_all_strings,
     is_categorical_dtype,
     is_datetime64_any_dtype,

@@ -1103,6 +1102,7 @@ def _add_datetimelike_scalar(self, other) -> DatetimeArray:
         self = cast("TimedeltaArray", self)
 
         from pandas.core.arrays import DatetimeArray
+        from pandas.core.arrays.datetimes import tz_to_dtype
 
         assert other is not NaT
         other = Timestamp(other)

@@ -1113,10 +1113,17 @@ def _add_datetimelike_scalar(self, other) -> DatetimeArray:
             # Preserve our resolution
             return DatetimeArray._simple_new(result, dtype=result.dtype)
 
+        if self._reso != other._reso:
+            raise NotImplementedError(
+                "Addition between TimedeltaArray and Timestamp with mis-matched "
+                "resolutions is not yet supported."
+            )
+
         i8 = self.asi8
         result = checked_add_with_arr(i8, other.value, arr_mask=self._isnan)
-        dtype = DatetimeTZDtype(tz=other.tz) if other.tz else DT64NS_DTYPE
-        return DatetimeArray(result, dtype=dtype, freq=self.freq)
+        dtype = tz_to_dtype(tz=other.tz, unit=self._unit)
+        res_values = result.view(f"M8[{self._unit}]")
+        return DatetimeArray._simple_new(res_values, dtype=dtype, freq=self.freq)
 
     @final
     def _add_datetime_arraylike(self, other) -> DatetimeArray:
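The guard added to _add_datetimelike_scalar refuses to add a Timestamp whose resolution differs from the array's, because both sides are combined as raw int64 epochs and those only line up when the units match. A standalone sketch of that idea, using hypothetical names rather than the pandas internals (_reso, _unit, checked_add_with_arr) shown above:

import numpy as np

def add_scalar_sketch(td_i8: np.ndarray, td_unit: str, ts_value: int, ts_unit: str) -> np.ndarray:
    # Integer epochs can only be summed when both operands use the same unit.
    if td_unit != ts_unit:
        raise NotImplementedError(
            "Addition between TimedeltaArray and Timestamp with mis-matched "
            "resolutions is not yet supported."
        )
    # Reinterpret the summed integers as datetime64 values in that shared unit.
    return (td_i8 + ts_value).view(f"M8[{td_unit}]")

# One-day timedelta steps (in seconds) added to 2020-01-01 00:00:00 UTC (in seconds):
print(add_scalar_sketch(np.array([0, 86_400], dtype="i8"), "s", 1_577_836_800, "s"))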

pandas/core/arrays/datetimes.py (+12, -7)

@@ -91,22 +91,23 @@
 _midnight = time(0, 0)
 
 
-def tz_to_dtype(tz):
+def tz_to_dtype(tz: tzinfo | None, unit: str = "ns"):
     """
     Return a datetime64[ns] dtype appropriate for the given timezone.
 
     Parameters
     ----------
     tz : tzinfo or None
+    unit : str, default "ns"
 
     Returns
     -------
     np.dtype or Datetime64TZDType
     """
     if tz is None:
-        return DT64NS_DTYPE
+        return np.dtype(f"M8[{unit}]")
     else:
-        return DatetimeTZDtype(tz=tz)
+        return DatetimeTZDtype(tz=tz, unit=unit)
 
 
 def _field_accessor(name: str, field: str, docstring=None):

@@ -800,7 +801,7 @@ def tz_convert(self, tz) -> DatetimeArray:
             )
 
         # No conversion since timestamps are all UTC to begin with
-        dtype = tz_to_dtype(tz)
+        dtype = tz_to_dtype(tz, unit=self._unit)
         return self._simple_new(self._ndarray, dtype=dtype, freq=self.freq)
 
     @dtl.ravel_compat

@@ -965,10 +966,14 @@ def tz_localize(self, tz, ambiguous="raise", nonexistent="raise") -> DatetimeArr
             # Convert to UTC
 
             new_dates = tzconversion.tz_localize_to_utc(
-                self.asi8, tz, ambiguous=ambiguous, nonexistent=nonexistent
+                self.asi8,
+                tz,
+                ambiguous=ambiguous,
+                nonexistent=nonexistent,
+                reso=self._reso,
             )
-        new_dates = new_dates.view(DT64NS_DTYPE)
-        dtype = tz_to_dtype(tz)
+        new_dates = new_dates.view(f"M8[{self._unit}]")
+        dtype = tz_to_dtype(tz, unit=self._unit)
 
         freq = None
         if timezones.is_utc(tz) or (len(self) == 1 and not isna(new_dates[0])):
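With the extra unit parameter, tz_to_dtype now resolves the result dtype from both the timezone and the unit of the array. A standalone sketch of that mapping, assuming only what the diff shows (DatetimeTZDtype accepting a unit argument):

import numpy as np
from pandas import DatetimeTZDtype

def tz_to_dtype_sketch(tz, unit: str = "ns"):
    # tz-naive data -> plain NumPy datetime64 dtype in the requested unit
    if tz is None:
        return np.dtype(f"M8[{unit}]")
    # tz-aware data -> pandas extension dtype carrying both tz and unit
    return DatetimeTZDtype(tz=tz, unit=unit)

print(tz_to_dtype_sketch(None, "s"))  # datetime64[s]
print(tz_to_dtype_sketch("UTC"))      # datetime64[ns, UTC]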

pandas/core/arrays/string_.py (+11, -10)

@@ -90,8 +90,11 @@ class StringDtype(StorageExtensionDtype):
 
     name = "string"
 
-    #: StringDtype.na_value uses pandas.NA
-    na_value = libmissing.NA
+    #: StringDtype().na_value uses pandas.NA
+    @property
+    def na_value(self) -> libmissing.NAType:
+        return libmissing.NA
+
     _metadata = ("storage",)
 
     def __init__(self, storage=None) -> None:

@@ -335,13 +338,11 @@ def _from_sequence(cls, scalars, *, dtype: Dtype | None = None, copy=False):
             na_values = scalars._mask
             result = scalars._data
             result = lib.ensure_string_array(result, copy=copy, convert_na_value=False)
-            result[na_values] = StringDtype.na_value
+            result[na_values] = libmissing.NA
 
         else:
-            # convert non-na-likes to str, and nan-likes to StringDtype.na_value
-            result = lib.ensure_string_array(
-                scalars, na_value=StringDtype.na_value, copy=copy
-            )
+            # convert non-na-likes to str, and nan-likes to StringDtype().na_value
+            result = lib.ensure_string_array(scalars, na_value=libmissing.NA, copy=copy)
 
         # Manually creating new array avoids the validation step in the __init__, so is
         # faster. Refactor need for validation?

@@ -396,7 +397,7 @@ def __setitem__(self, key, value):
         # validate new items
         if scalar_value:
             if isna(value):
-                value = StringDtype.na_value
+                value = libmissing.NA
             elif not isinstance(value, str):
                 raise ValueError(
                     f"Cannot set non-string value '{value}' into a StringArray."

@@ -497,7 +498,7 @@ def _cmp_method(self, other, op):
 
         if op.__name__ in ops.ARITHMETIC_BINOPS:
             result = np.empty_like(self._ndarray, dtype="object")
-            result[mask] = StringDtype.na_value
+            result[mask] = libmissing.NA
             result[valid] = op(self._ndarray[valid], other)
             return StringArray(result)
         else:

@@ -512,7 +513,7 @@ def _cmp_method(self, other, op):
     # String methods interface
     # error: Incompatible types in assignment (expression has type "NAType",
     # base class "PandasArray" defined the type as "float")
-    _str_na_value = StringDtype.na_value  # type: ignore[assignment]
+    _str_na_value = libmissing.NA  # type: ignore[assignment]
 
     def _str_map(
         self, f, na_value=None, dtype: Dtype | None = None, convert: bool = True
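Turning StringDtype.na_value into a property means the sentinel is now reached through an instance (StringDtype().na_value) rather than through the class, which is why the call sites above switch to libmissing.NA directly. A minimal, generic illustration of the difference (plain Python, unrelated to the pandas classes):

class WithClassAttr:
    na_value = "NA"              # available on the class and on instances

class WithProperty:
    @property
    def na_value(self):          # only meaningful on an instance
        return "NA"

print(WithClassAttr.na_value)        # "NA"
print(WithProperty().na_value)       # "NA"
print(type(WithProperty.na_value))   # <class 'property'>, not the sentinel itself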

pandas/core/arrays/string_arrow.py (+3, -2)

@@ -242,8 +242,9 @@ def astype(self, dtype, copy: bool = True):
     # ------------------------------------------------------------------------
     # String methods interface
 
-    # error: Cannot determine type of 'na_value'
-    _str_na_value = StringDtype.na_value  # type: ignore[has-type]
+    # error: Incompatible types in assignment (expression has type "NAType",
+    # base class "ObjectStringArrayMixin" defined the type as "float")
+    _str_na_value = libmissing.NA  # type: ignore[assignment]
 
     def _str_map(
         self, f, na_value=None, dtype: Dtype | None = None, convert: bool = True
