Commit 25a4086

Merge remote-tracking branch 'upstream/master' into astype-keep-name

2 parents: 6c0c76f + 8ac101d

35 files changed: +465 −393 lines

.github/workflows/ci.yml

Lines changed: 17 additions & 53 deletions

@@ -125,68 +125,32 @@ jobs:
     - name: Check ipython directive errors
       run: "! grep -B1 \"^<<<-------------------------------------------------------------------------$\" sphinx.log"

-    - name: Merge website and docs
-      run: |
-        mkdir -p pandas_web/docs
-        cp -r web/build/* pandas_web/
-        cp -r doc/build/html/* pandas_web/docs/
-      if: github.event_name == 'push'
-
     - name: Install Rclone
       run: sudo apt install rclone -y
       if: github.event_name == 'push'

     - name: Set up Rclone
       run: |
-        RCLONE_CONFIG_PATH=$HOME/.config/rclone/rclone.conf
-        mkdir -p `dirname $RCLONE_CONFIG_PATH`
-        echo "[ovh_cloud_pandas_web]" > $RCLONE_CONFIG_PATH
-        echo "type = swift" >> $RCLONE_CONFIG_PATH
-        echo "env_auth = false" >> $RCLONE_CONFIG_PATH
-        echo "auth_version = 3" >> $RCLONE_CONFIG_PATH
-        echo "auth = https://auth.cloud.ovh.net/v3/" >> $RCLONE_CONFIG_PATH
-        echo "endpoint_type = public" >> $RCLONE_CONFIG_PATH
-        echo "tenant_domain = default" >> $RCLONE_CONFIG_PATH
-        echo "tenant = 2977553886518025" >> $RCLONE_CONFIG_PATH
-        echo "domain = default" >> $RCLONE_CONFIG_PATH
-        echo "user = w4KGs3pmDxpd" >> $RCLONE_CONFIG_PATH
-        echo "key = ${{ secrets.ovh_object_store_key }}" >> $RCLONE_CONFIG_PATH
-        echo "region = BHS" >> $RCLONE_CONFIG_PATH
+        CONF=$HOME/.config/rclone/rclone.conf
+        mkdir -p `dirname $CONF`
+        echo "[ovh_host]" > $CONF
+        echo "type = swift" >> $CONF
+        echo "env_auth = false" >> $CONF
+        echo "auth_version = 3" >> $CONF
+        echo "auth = https://auth.cloud.ovh.net/v3/" >> $CONF
+        echo "endpoint_type = public" >> $CONF
+        echo "tenant_domain = default" >> $CONF
+        echo "tenant = 2977553886518025" >> $CONF
+        echo "domain = default" >> $CONF
+        echo "user = w4KGs3pmDxpd" >> $CONF
+        echo "key = ${{ secrets.ovh_object_store_key }}" >> $CONF
+        echo "region = BHS" >> $CONF
       if: github.event_name == 'push'

     - name: Sync web with OVH
-      run: rclone sync pandas_web ovh_cloud_pandas_web:dev
-      if: github.event_name == 'push'
-
-    - name: Create git repo to upload the built docs to GitHub pages
-      run: |
-        cd pandas_web
-        git init
-        touch .nojekyll
-        echo "dev.pandas.io" > CNAME
-        printf "User-agent: *\nDisallow: /" > robots.txt
-        git add --all .
-        git config user.email "[email protected]"
-        git config user.name "pandas-bot"
-        git commit -m "pandas web and documentation in master"
+      run: rclone sync --exclude pandas-docs/** web/build ovh_host:prod
       if: github.event_name == 'push'

-    # For this task to work, next steps are required:
-    # 1. Generate a pair of private/public keys (i.e. `ssh-keygen -t rsa -b 4096 -C "[email protected]"`)
-    # 2. Go to https://github.com/pandas-dev/pandas/settings/secrets
-    # 3. Click on "Add a new secret"
-    # 4. Name: "github_pagas_ssh_key", Value: <Content of the private ssh key>
-    # 5. The public key needs to be upladed to https://github.com/pandas-dev/pandas-dev.github.io/settings/keys
-    - name: Install GitHub pages ssh deployment key
-      uses: shimataro/ssh-key-action@v2
-      with:
-        key: ${{ secrets.github_pages_ssh_key }}
-        known_hosts: 'github.com,192.30.252.128 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ=='
-      if: github.event_name == 'push'
-
-    - name: Publish web and docs to GitHub pages
-      run: |
-        cd pandas_web
-        git remote add origin [email protected]:pandas-dev/pandas-dev.github.io.git
-        git push -f origin master || true
+    - name: Sync dev docs with OVH
+      run: rclone sync doc/build/html ovh_host:prod/pandas-docs/dev
       if: github.event_name == 'push'
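For readers following the workflow change: the chain of ``echo`` commands in the new "Set up Rclone" step assembles a config file that rclone reads from ``$HOME/.config/rclone/rclone.conf``. Derived purely from the diff above, the resulting file would look roughly like this (the ``key`` value is injected from the ``ovh_object_store_key`` repository secret):

```ini
[ovh_host]
type = swift
env_auth = false
auth_version = 3
auth = https://auth.cloud.ovh.net/v3/
endpoint_type = public
tenant_domain = default
tenant = 2977553886518025
domain = default
user = w4KGs3pmDxpd
key = <value of secrets.ovh_object_store_key>
region = BHS
```

The two sync steps then push ``web/build`` to ``ovh_host:prod`` (excluding ``pandas-docs/**``) and ``doc/build/html`` to ``ovh_host:prod/pandas-docs/dev``, replacing the removed GitHub Pages deployment.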

doc/source/development/meeting.rst

Lines changed: 1 addition & 1 deletion

@@ -25,7 +25,7 @@ This calendar shows all the developer meetings.
 You can subscribe to this calendar with the following links:

 * `iCal <https://calendar.google.com/calendar/ical/pgbn14p6poja8a1cf2dv2jhrmg%40group.calendar.google.com/public/basic.ics>`__
-* `Google calendar <https://calendar.google.com/calendar/embed?src=pgbn14p6poja8a1cf2dv2jhrmg%40group.calendar.google.com>`__
+* `Google calendar <https://calendar.google.com/calendar/r?cid=pgbn14p6poja8a1cf2dv2jhrmg@group.calendar.google.com>`__

 Additionally, we'll sometimes have one-off meetings on specific topics.
 These will be published on the same calendar.

doc/source/whatsnew/v1.0.2.rst

Lines changed: 12 additions & 11 deletions

@@ -17,16 +17,17 @@ Fixed regressions

 - Fixed regression in :meth:`DataFrame.to_excel` when ``columns`` kwarg is passed (:issue:`31677`)
 - Fixed regression in :meth:`Series.align` when ``other`` is a DataFrame and ``method`` is not None (:issue:`31785`)
-- Fixed regression in :meth:`pandas.core.groupby.RollingGroupby.apply` where the ``raw`` parameter was ignored (:issue:`31754`)
-- Fixed regression in :meth:`pandas.core.window.Rolling.corr` when using a time offset (:issue:`31789`)
-- Fixed regression in :meth:`pandas.core.groupby.DataFrameGroupBy.nunique` which was modifying the original values if ``NaN`` values were present (:issue:`31950`)
+- Fixed regression in ``groupby(..).rolling(..).apply()`` (``RollingGroupby``) where the ``raw`` parameter was ignored (:issue:`31754`)
+- Fixed regression in :meth:`rolling(..).corr() <pandas.core.window.rolling.Rolling.corr>` when using a time offset (:issue:`31789`)
+- Fixed regression in :meth:`groupby(..).nunique() <pandas.core.groupby.DataFrameGroupBy.nunique>` which was modifying the original values if ``NaN`` values were present (:issue:`31950`)
 - Fixed regression where :func:`read_pickle` raised a ``UnicodeDecodeError`` when reading a py27 pickle with :class:`MultiIndex` column (:issue:`31988`).
 - Fixed regression in :class:`DataFrame` arithmetic operations with mis-matched columns (:issue:`31623`)
-- Fixed regression in :meth:`pandas.core.groupby.GroupBy.agg` calling a user-provided function an extra time on an empty input (:issue:`31760`)
-- Joining on :class:`DatetimeIndex` or :class:`TimedeltaIndex` will preserve ``freq`` in simple cases (:issue:`32166`)
-- Fixed bug in the repr of an object-dtype :class:`Index` with bools and missing values (:issue:`32146`)
+- Fixed regression in :meth:`groupby(..).agg() <pandas.core.groupby.GroupBy.agg>` calling a user-provided function an extra time on an empty input (:issue:`31760`)
+- Fixed regression in joining on :class:`DatetimeIndex` or :class:`TimedeltaIndex` to preserve ``freq`` in simple cases (:issue:`32166`)
+- Fixed regression in the repr of an object-dtype :class:`Index` with bools and missing values (:issue:`32146`)
 - Fixed regression in :meth:`read_csv` in which the ``encoding`` option was not recognized with certain file-like objects (:issue:`31819`)
--
+- Fixed regression in :meth:`DataFrame.reindex` and :meth:`Series.reindex` when reindexing with (tz-aware) index and ``method=nearest`` (:issue:`26683`)
+

 .. ---------------------------------------------------------------------------

@@ -64,7 +65,6 @@ Bug fixes

 **Datetimelike**

-- Bug in :meth:`DataFrame.reindex` and :meth:`Series.reindex` when reindexing with a tz-aware index (:issue:`26683`)
 - Bug in :meth:`Series.astype` not copying for tz-naive and tz-aware datetime64 dtype (:issue:`32490`)
 - Bug where :func:`to_datetime` would raise when passed ``pd.NA`` (:issue:`32213`)
 - Improved error message when subtracting two :class:`Timestamp` that result in an out-of-bounds :class:`Timedelta` (:issue:`31774`)

@@ -78,16 +78,17 @@ Bug fixes

 **I/O**

 - Using ``pd.NA`` with :meth:`DataFrame.to_json` now correctly outputs a null value instead of an empty object (:issue:`31615`)
+- Bug in :meth:`pandas.json_normalize` when value in meta path is not iterable (:issue:`31507`)
 - Fixed pickling of ``pandas.NA``. Previously a new object was returned, which broke computations relying on ``NA`` being a singleton (:issue:`31847`)
 - Fixed bug in parquet roundtrip with nullable unsigned integer dtypes (:issue:`31896`).

 **Experimental dtypes**

-- Fix bug in :meth:`DataFrame.convert_dtypes` for columns that were already using the ``"string"`` dtype (:issue:`31731`).
+- Fixed bug in :meth:`DataFrame.convert_dtypes` for columns that were already using the ``"string"`` dtype (:issue:`31731`).
+- Fixed bug in :meth:`DataFrame.convert_dtypes` for series with mix of integers and strings (:issue:`32117`)
+- Fixed bug in :meth:`DataFrame.convert_dtypes` where ``BooleanDtype`` columns were converted to ``Int64`` (:issue:`32287`)
 - Fixed bug in setting values using a slice indexer with string dtype (:issue:`31772`)
 - Fixed bug where :meth:`pandas.core.groupby.GroupBy.first` and :meth:`pandas.core.groupby.GroupBy.last` would raise a ``TypeError`` when groups contained ``pd.NA`` in a column of object dtype (:issue:`32123`)
-- Fix bug in :meth:`Series.convert_dtypes` for series with mix of integers and strings (:issue:`32117`)
-- Fixed bug in :meth:`DataFrame.convert_dtypes`, where ``BooleanDtype`` columns were converted to ``Int64`` (:issue:`32287`)

 **Strings**
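The relocated ``reindex`` entry (:issue:`26683`) is easy to exercise. A minimal sketch, assuming pandas 1.0.2 or later (the dates here are illustrative):

```python
import pandas as pd

# A monotonic, tz-aware index; method="nearest" on such an index
# was the regression tracked in GH 26683.
idx = pd.date_range("2020-01-01", periods=3, freq="D", tz="UTC")
s = pd.Series([1, 2, 3], index=idx)

# 01:00 on Jan 1 is closest to the Jan 1 midnight timestamp
target = pd.DatetimeIndex(["2020-01-01 01:00"], tz="UTC")
result = s.reindex(target, method="nearest")
print(result)
```

With the fix, this returns the value from the nearest original timestamp instead of raising.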

doc/source/whatsnew/v1.1.0.rst

Lines changed: 4 additions & 2 deletions

@@ -188,7 +188,9 @@ Performance improvements

 - Performance improvement in :class:`Timedelta` constructor (:issue:`30543`)
 - Performance improvement in :class:`Timestamp` constructor (:issue:`30543`)
 - Performance improvement in flex arithmetic ops between :class:`DataFrame` and :class:`Series` with ``axis=0`` (:issue:`31296`)
--
+- The internal :meth:`Index._shallow_copy` now copies cached attributes over to the new index,
+  avoiding creating these again on the new index. This can speed up many operations
+  that depend on creating copies of existing indexes (:issue:`28584`)

 .. ---------------------------------------------------------------------------

@@ -263,7 +265,7 @@ Indexing

 - Bug in :meth:`Series.xs` incorrectly returning ``Timestamp`` instead of ``datetime64`` in some object-dtype cases (:issue:`31630`)
 - Bug in :meth:`DataFrame.iat` incorrectly returning ``Timestamp`` instead of ``datetime`` in some object-dtype cases (:issue:`32809`)
 - Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` when indexing with an integer key on a object-dtype :class:`Index` that is not all-integers (:issue:`31905`)
--
+- Bug in :meth:`DataFrame.iloc.__setitem__` on a :class:`DataFrame` with duplicate columns incorrectly setting values for all matching columns (:issue:`15686`, :issue:`22036`)

 Missing
 ^^^^^^^
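The ``DataFrame.iloc.__setitem__`` entry above (:issue:`15686`, :issue:`22036`) is backed by the new ``_iset_item`` path in this commit. A minimal sketch of the fixed behavior, assuming pandas 1.1 or later: with duplicate column labels, positional assignment should touch only the targeted position, not every column sharing the label.

```python
import pandas as pd

# Two columns share the label "a"
df = pd.DataFrame([[1, 2]], columns=["a", "a"])

# Positional assignment to column 0 only; before the fix, both
# "a" columns were overwritten.
df.iloc[:, 0] = 9
print(df)
```

After the fix the second ``"a"`` column keeps its original value.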

pandas/core/arrays/categorical.py

Lines changed: 1 addition & 1 deletion

@@ -1317,7 +1317,7 @@ def __setstate__(self, state):
             setattr(self, k, v)

     @property
-    def T(self):
+    def T(self) -> "Categorical":
         """
         Return transposed numpy array.
         """

pandas/core/arrays/sparse/array.py

Lines changed: 2 additions & 2 deletions

@@ -1296,14 +1296,14 @@ def mean(self, axis=0, *args, **kwargs):
             nsparse = self.sp_index.ngaps
             return (sp_sum + self.fill_value * nsparse) / (ct + nsparse)

-    def transpose(self, *axes):
+    def transpose(self, *axes) -> "SparseArray":
         """
         Returns the SparseArray.
         """
         return self

     @property
-    def T(self):
+    def T(self) -> "SparseArray":
         """
         Returns the SparseArray.
         """

pandas/core/computation/pytables.py

Lines changed: 2 additions & 1 deletion

@@ -17,6 +17,7 @@
 from pandas.core.computation.common import _ensure_decoded
 from pandas.core.computation.expr import BaseExprVisitor
 from pandas.core.computation.ops import UndefinedVariableError, is_term
+from pandas.core.construction import extract_array

 from pandas.io.formats.printing import pprint_thing, pprint_thing_encoded

@@ -202,7 +203,7 @@ def stringify(value):
             v = Timedelta(v, unit="s").value
             return TermValue(int(v), v, kind)
         elif meta == "category":
-            metadata = com.values_from_object(self.metadata)
+            metadata = extract_array(self.metadata, extract_numpy=True)
             result = metadata.searchsorted(v, side="left")

             # result returns 0 if v is first element or if v is not in metadata
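``extract_array(..., extract_numpy=True)`` unwraps a Series or Index to its backing array, which is why it can replace ``com.values_from_object`` here. A quick sketch (note ``pandas.core.construction`` is an internal module, so this import is subject to change between versions):

```python
import numpy as np
import pandas as pd
from pandas.core.construction import extract_array

s = pd.Series(["a", "b", "c"])

# With extract_numpy=True, a NumPy-backed Series unwraps to a
# plain ndarray, which supports searchsorted directly.
arr = extract_array(s, extract_numpy=True)
print(type(arr), arr.searchsorted("b", side="left"))
```

This mirrors the ``metadata.searchsorted(v, side="left")`` call in the diff.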

pandas/core/frame.py

Lines changed: 27 additions & 3 deletions

@@ -77,6 +77,7 @@
     ensure_platform_int,
     infer_dtype_from_object,
     is_bool_dtype,
+    is_datetime64_any_dtype,
     is_dict_like,
     is_dtype_equal,
     is_extension_array_dtype,

@@ -88,6 +89,7 @@
     is_list_like,
     is_named_tuple,
     is_object_dtype,
+    is_period_dtype,
     is_scalar,
     is_sequence,
     needs_i8_conversion,

@@ -2443,7 +2445,9 @@ def transpose(self, *args, copy: bool = False) -> "DataFrame":

         return result.__finalize__(self)

-    T = property(transpose)
+    @property
+    def T(self) -> "DataFrame":
+        return self.transpose()

     # ----------------------------------------------------------------------
     # Indexing Methods

@@ -2706,6 +2710,20 @@ def _setitem_frame(self, key, value):
         self._check_setitem_copy()
         self._where(-key, value, inplace=True)

+    def _iset_item(self, loc: int, value):
+        self._ensure_valid_index(value)
+
+        # technically _sanitize_column expects a label, not a position,
+        # but the behavior is the same as long as we pass broadcast=False
+        value = self._sanitize_column(loc, value, broadcast=False)
+        NDFrame._iset_item(self, loc, value)
+
+        # check if we are modifying a copy
+        # try to set first as we want an invalid
+        # value exception to occur first
+        if len(self):
+            self._check_setitem_copy()
+
     def _set_item(self, key, value):
         """
         Add series to DataFrame in specified column.

@@ -7789,11 +7807,13 @@ def _reduce(
         self, op, name, axis=0, skipna=True, numeric_only=None, filter_type=None, **kwds
     ):

-        dtype_is_dt = self.dtypes.apply(lambda x: x.kind == "M")
+        dtype_is_dt = self.dtypes.apply(
+            lambda x: is_datetime64_any_dtype(x) or is_period_dtype(x)
+        )
         if numeric_only is None and name in ["mean", "median"] and dtype_is_dt.any():
             warnings.warn(
                 "DataFrame.mean and DataFrame.median with numeric_only=None "
-                "will include datetime64 and datetime64tz columns in a "
+                "will include datetime64, datetime64tz, and PeriodDtype columns in a "
                 "future version.",
                 FutureWarning,
                 stacklevel=3,

@@ -7854,6 +7874,10 @@ def blk_func(values):
             assert len(res) == max(list(res.keys())) + 1, res.keys()
             out = df._constructor_sliced(res, index=range(len(res)), dtype=out_dtype)
             out.index = df.columns
+            if axis == 0 and df.dtypes.apply(needs_i8_conversion).any():
+                # FIXME: needs_i8_conversion check is kludge, not sure
+                # why it is necessary in this case and this case alone
+                out[:] = coerce_to_dtypes(out.values, df.dtypes)
             return out

         if numeric_only is None:
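The ``T = property(transpose)`` → ``@property`` rewrite above changes nothing at runtime; it exists so the getter can carry the ``-> "DataFrame"`` return annotation for type checkers. Behaviorally, ``.T`` and ``.transpose()`` remain interchangeable:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# .T is still just a property wrapper around transpose()
assert df.T.equals(df.transpose())
print(df.T.shape)  # (2, 2)
```

The explicit method form is simply easier for static analysis tools to understand than the ``property(...)`` call.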

pandas/core/generic.py

Lines changed: 6 additions & 4 deletions

@@ -967,7 +967,6 @@ def rename(
                 continue

             ax = self._get_axis(axis_no)
-            baxis = self._get_block_manager_axis(axis_no)
             f = com.get_rename_function(replacements)

             if level is not None:

@@ -984,9 +983,8 @@ def rename(
                 ]
                 raise KeyError(f"{missing_labels} not found in axis")

-            result._data = result._data.rename_axis(
-                f, axis=baxis, copy=copy, level=level
-            )
+            new_index = ax._transform_index(f, level)
+            result.set_axis(new_index, axis=axis_no, inplace=True)
             result._clear_item_cache()

         if inplace:

@@ -3579,6 +3577,10 @@ def _slice(self: FrameOrSeries, slobj: slice, axis=0) -> FrameOrSeries:
         result._set_is_copy(self, copy=is_copy)
         return result

+    def _iset_item(self, loc: int, value) -> None:
+        self._data.iset(loc, value)
+        self._clear_item_cache()
+
     def _set_item(self, key, value) -> None:
         self._data.set(key, value)
         self._clear_item_cache()
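The ``rename`` rewrite above swaps a block-manager-level ``rename_axis`` call for building the new index directly and setting it on the axis. The public behavior is unchanged, as in this sketch:

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2]})

# rename applies the function to each label to build a new column Index
renamed = df.rename(columns=str.upper)
print(renamed.columns.tolist())  # ['A', 'B']

# the original frame is untouched (rename copies by default)
print(df.columns.tolist())  # ['a', 'b']
```

Internally, the new code computes that transformed Index once and installs it with ``set_axis``, rather than delegating the mapping to the block manager.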

pandas/core/groupby/generic.py

Lines changed: 3 additions & 5 deletions

@@ -49,7 +49,7 @@
     is_scalar,
     needs_i8_conversion,
 )
-from pandas.core.dtypes.missing import _isna_ndarraylike, isna, notna
+from pandas.core.dtypes.missing import isna, notna

 from pandas.core.aggregation import (
     is_multi_agg_with_relabel,

@@ -1772,10 +1772,8 @@ def count(self):
         ids, _, ngroups = self.grouper.group_info
         mask = ids != -1

-        vals = (
-            (mask & ~_isna_ndarraylike(np.atleast_2d(blk.get_values())))
-            for blk in data.blocks
-        )
+        # TODO(2DEA): reshape would not be necessary with 2D EAs
+        vals = ((mask & ~isna(blk.values).reshape(blk.shape)) for blk in data.blocks)
         locs = (blk.mgr_locs for blk in data.blocks)

         counted = (
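The ``count`` rewrite above replaces the private ``_isna_ndarraylike`` with the public ``isna``, which operates elementwise on arrays. A minimal sketch of the mask logic on one column's values (the ``mask`` array here is a stand-in for the "row belongs to a real group" mask in the diff):

```python
import numpy as np
from pandas import isna

values = np.array([1.0, np.nan, 3.0])
mask = np.array([True, True, False])  # e.g. ids != -1 per row

# A value counts only if it is both in a group and non-missing
vals = mask & ~isna(values)
print(vals.tolist())  # [True, False, False]
```

In the real code the same expression runs per block, with a ``reshape`` because block values are 2-D while extension-array ``isna`` results may be 1-D.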

pandas/core/groupby/ops.py

Lines changed: 1 addition & 1 deletion

@@ -217,7 +217,7 @@ def indices(self):
             return self.groupings[0].indices
         else:
             codes_list = [ping.codes for ping in self.groupings]
-            keys = [com.values_from_object(ping.group_index) for ping in self.groupings]
+            keys = [ping.group_index for ping in self.groupings]
             return get_indexer_dict(codes_list, keys)

     @property
