Commit 98d2b1e

Merge branch 'main' into dep_excel
2 parents a3efa14 + a04754e commit 98d2b1e

24 files changed: +339 -166 lines changed

doc/source/reference/groupby.rst

-4
@@ -65,7 +65,6 @@ Function application
 
    DataFrameGroupBy.all
    DataFrameGroupBy.any
-   DataFrameGroupBy.backfill
    DataFrameGroupBy.bfill
    DataFrameGroupBy.corr
    DataFrameGroupBy.corrwith
@@ -94,7 +93,6 @@ Function application
    DataFrameGroupBy.nth
    DataFrameGroupBy.nunique
    DataFrameGroupBy.ohlc
-   DataFrameGroupBy.pad
    DataFrameGroupBy.pct_change
    DataFrameGroupBy.prod
    DataFrameGroupBy.quantile
@@ -120,7 +118,6 @@ Function application
 
    SeriesGroupBy.all
    SeriesGroupBy.any
-   SeriesGroupBy.backfill
    SeriesGroupBy.bfill
    SeriesGroupBy.corr
    SeriesGroupBy.count
@@ -153,7 +150,6 @@ Function application
    SeriesGroupBy.nunique
    SeriesGroupBy.unique
    SeriesGroupBy.ohlc
-   SeriesGroupBy.pad
    SeriesGroupBy.pct_change
    SeriesGroupBy.prod
    SeriesGroupBy.quantile
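
For code that still calls the removed `pad` and `backfill` groupby entries, the `ffill` and `bfill` entries that remain in the listing are the replacements named in the deprecation messages. A minimal migration sketch, assuming a small made-up frame (the column names `key` and `val` are illustrative only, not taken from the commit):

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups with gaps in "val".
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b", "a"],
        "val": [1.0, np.nan, np.nan, 4.0, np.nan],
    }
)
grouped = df.groupby("key")

# Formerly grouped.pad(): fill forward within each group.
forward = grouped.ffill()

# Formerly grouped.backfill(): fill backward within each group.
backward = grouped.bfill()

print(forward)
print(backward)
```

Filling never crosses group boundaries, so a leading gap in a group stays missing under `ffill`, and a trailing gap stays missing under `bfill`.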

doc/source/reference/resampling.rst

-2
@@ -35,9 +35,7 @@ Upsampling
    :toctree: api/
 
    Resampler.ffill
-   Resampler.backfill
    Resampler.bfill
-   Resampler.pad
    Resampler.nearest
    Resampler.fillna
    Resampler.asfreq

doc/source/whatsnew/index.rst

+1
@@ -24,6 +24,7 @@ Version 1.5
 .. toctree::
    :maxdepth: 2
 
+   v1.5.2
    v1.5.1
    v1.5.0
 

doc/source/whatsnew/v1.5.0.rst

+1-1
@@ -1246,4 +1246,4 @@ Other
 Contributors
 ~~~~~~~~~~~~
 
-.. contributors:: v1.4.4..v1.5.0|HEAD
+.. contributors:: v1.4.4..v1.5.0

doc/source/whatsnew/v1.5.1.rst

+3-3
@@ -1,6 +1,6 @@
 .. _whatsnew_151:
 
-What's new in 1.5.1 (October ??, 2022)
+What's new in 1.5.1 (October 19, 2022)
 --------------------------------------
 
 These are the changes in pandas 1.5.1. See :ref:`release` for a full changelog
@@ -111,12 +111,12 @@ Bug fixes
 Other
 ~~~~~
 - Avoid showing deprecated signatures when introspecting functions with warnings about arguments becoming keyword-only (:issue:`48692`)
--
--
 
 .. ---------------------------------------------------------------------------
 
 .. _whatsnew_151.contributors:
 
 Contributors
 ~~~~~~~~~~~~
+
+.. contributors:: v1.5.0..v1.5.1

doc/source/whatsnew/v1.5.2.rst

+41
@@ -0,0 +1,41 @@
+.. _whatsnew_152:
+
+What's new in 1.5.2 (November ??, 2022)
+---------------------------------------
+
+These are the changes in pandas 1.5.2. See :ref:`release` for a full changelog
+including other versions of pandas.
+
+{{ header }}
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_152.regressions:
+
+Fixed regressions
+~~~~~~~~~~~~~~~~~
+-
+-
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_152.bug_fixes:
+
+Bug fixes
+~~~~~~~~~
+-
+-
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_152.other:
+
+Other
+~~~~~
+-
+-
+
+.. ---------------------------------------------------------------------------
+.. _whatsnew_152.contributors:
+
+Contributors
+~~~~~~~~~~~~
+
+.. contributors:: v1.5.1..v1.5.2|HEAD

doc/source/whatsnew/v2.0.0.rst

+4-1
@@ -147,6 +147,7 @@ Removal of prior version deprecations/changes
 - Disallow passing non-round floats to :class:`Timestamp` with ``unit="M"`` or ``unit="Y"`` (:issue:`47266`)
 - Remove keywords ``convert_float`` and ``mangle_dupe_cols`` from :func:`read_excel` (:issue:`41176`)
 - Disallow passing non-keyword arguments to :func:`read_excel` except ``io`` and ``sheet_name`` (:issue:`34418`)
+- Remove :meth:`DataFrameGroupBy.pad` and :meth:`DataFrameGroupBy.backfill` (:issue:`45076`)
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_200.performance:
@@ -220,7 +221,7 @@ Conversion
 - Bug in :meth:`DataFrame.eval` incorrectly raising an ``AttributeError`` when there are negative values in function call (:issue:`46471`)
 - Bug in :meth:`Series.convert_dtypes` not converting dtype to nullable dtype when :class:`Series` contains ``NA`` and has dtype ``object`` (:issue:`48791`)
 - Bug where any :class:`ExtensionDtype` subclass with ``kind="M"`` would be interpreted as a timezone type (:issue:`34986`)
--
+- Bug in :class:`.arrays.ArrowExtensionArray` that would raise ``NotImplementedError`` when passed a sequence of strings or binary (:issue:`49172`)
 
 Strings
 ^^^^^^^
@@ -235,6 +236,7 @@ Interval
 Indexing
 ^^^^^^^^
 - Bug in :meth:`DataFrame.reindex` filling with wrong values when indexing columns and index for ``uint`` dtypes (:issue:`48184`)
+- Bug in :meth:`DataFrame.loc` coercing dtypes when setting values with a list indexer (:issue:`49159`)
 - Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` when right hand side is :class:`DataFrame` with :class:`MultiIndex` columns (:issue:`49121`)
 - Bug in :meth:`DataFrame.reindex` casting dtype to ``object`` when :class:`DataFrame` has single extension array column when re-indexing ``columns`` and ``index`` (:issue:`48190`)
 - Bug in :func:`~DataFrame.describe` when formatting percentiles in the resulting index showed more decimals than needed (:issue:`46362`)
@@ -263,6 +265,7 @@ MultiIndex
 I/O
 ^^^
 - Bug in :func:`read_sas` caused fragmentation of :class:`DataFrame` and raised :class:`.errors.PerformanceWarning` (:issue:`48595`)
+- Bug in :func:`read_csv` for a single-line csv with fewer columns than ``names`` raised :class:`.errors.ParserError` with ``engine="c"`` (:issue:`47566`)
 -
 
 Period

pandas/_libs/parsers.pyx

+3
@@ -744,6 +744,8 @@ cdef class TextReader:
         elif self.names is not None:
             # Names passed
             if self.parser.lines < 1:
+                if not self.has_usecols:
+                    self.parser.expected_fields = len(self.names)
                 self._tokenize_rows(1)
 
             header = [self.names]
@@ -756,6 +758,7 @@ cdef class TextReader:
                 # Enforce this unless usecols
                 if not self.has_usecols:
                     self.parser.expected_fields = max(field_count, len(self.names))
+
         else:
             # No header passed nor to be found in the file
             if self.parser.lines < 1:
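
The case this parser change targets, per the whatsnew entry for :issue:`47566`, is a single-line CSV with fewer fields than `names`, read with `engine="c"`. A minimal sketch of that input follows; the data string and column names are assumptions chosen for illustration. With the fix applied the call is expected to return a one-row frame instead of raising, while releases without the fix may still raise `ParserError`, so the sketch tolerates both outcomes.

```python
from io import StringIO

import pandas as pd

data = "1,2"             # a single line with two fields
names = ["a", "b", "c"]  # three column names, one more than the data has

try:
    df = pd.read_csv(StringIO(data), header=None, names=names, engine="c")
except pd.errors.ParserError:
    # Releases without this fix may raise here.
    df = None

if df is not None:
    print(df)
```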

pandas/_testing/__init__.py

+6-1
@@ -201,8 +201,11 @@
 SIGNED_INT_PYARROW_DTYPES = [pa.int8(), pa.int16(), pa.int32(), pa.int64()]
 ALL_INT_PYARROW_DTYPES = UNSIGNED_INT_PYARROW_DTYPES + SIGNED_INT_PYARROW_DTYPES
 
+# pa.float16 doesn't seem supported
+# https://github.com/apache/arrow/blob/master/python/pyarrow/src/arrow/python/helpers.cc#L86
 FLOAT_PYARROW_DTYPES = [pa.float32(), pa.float64()]
-STRING_PYARROW_DTYPES = [pa.string(), pa.utf8()]
+STRING_PYARROW_DTYPES = [pa.string()]
+BINARY_PYARROW_DTYPES = [pa.binary()]
 
 TIME_PYARROW_DTYPES = [
     pa.time32("s"),
@@ -225,6 +228,8 @@
 ALL_PYARROW_DTYPES = (
     ALL_INT_PYARROW_DTYPES
     + FLOAT_PYARROW_DTYPES
+    + STRING_PYARROW_DTYPES
+    + BINARY_PYARROW_DTYPES
     + TIME_PYARROW_DTYPES
     + DATE_PYARROW_DTYPES
     + DATETIME_PYARROW_DTYPES

pandas/core/arrays/arrow/array.py

+7-2
@@ -220,8 +220,13 @@ def _from_sequence_of_strings(
         Construct a new ExtensionArray from a sequence of strings.
         """
         pa_type = to_pyarrow_type(dtype)
-        if pa_type is None:
-            # Let pyarrow try to infer or raise
+        if (
+            pa_type is None
+            or pa.types.is_binary(pa_type)
+            or pa.types.is_string(pa_type)
+        ):
+            # pa_type is None: Let pa.array infer
+            # pa_type is string/binary: scalars already correct type
             scalars = strings
         elif pa.types.is_timestamp(pa_type):
             from pandas.core.tools.datetimes import to_datetime

pandas/core/groupby/base.py

-2
@@ -73,7 +73,6 @@ def maybe_normalize_deprecated_kernels(kernel) -> Literal["bfill", "ffill"]:
 
 transformation_kernels = frozenset(
     [
-        "backfill",
         "bfill",
         "cumcount",
         "cummax",
@@ -84,7 +83,6 @@ def maybe_normalize_deprecated_kernels(kernel) -> Literal["bfill", "ffill"]:
         "ffill",
         "fillna",
         "ngroup",
-        "pad",
         "pct_change",
         "rank",
         "shift",

pandas/core/groupby/groupby.py

-50
@@ -2936,31 +2936,6 @@ def ffill(self, limit=None):
         """
         return self._fill("ffill", limit=limit)
 
-    def pad(self, limit=None):
-        """
-        Forward fill the values.
-
-        .. deprecated:: 1.4
-            Use ffill instead.
-
-        Parameters
-        ----------
-        limit : int, optional
-            Limit of how many values to fill.
-
-        Returns
-        -------
-        Series or DataFrame
-            Object with missing values filled.
-        """
-        warnings.warn(
-            "pad is deprecated and will be removed in a future version. "
-            "Use ffill instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
-        )
-        return self.ffill(limit=limit)
-
     @final
     @Substitution(name="groupby")
     def bfill(self, limit=None):
@@ -2986,31 +2961,6 @@ def bfill(self, limit=None):
         """
         return self._fill("bfill", limit=limit)
 
-    def backfill(self, limit=None):
-        """
-        Backward fill the values.
-
-        .. deprecated:: 1.4
-            Use bfill instead.
-
-        Parameters
-        ----------
-        limit : int, optional
-            Limit of how many values to fill.
-
-        Returns
-        -------
-        Series or DataFrame
-            Object with missing values filled.
-        """
-        warnings.warn(
-            "backfill is deprecated and will be removed in a future version. "
-            "Use bfill instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
-        )
-        return self.bfill(limit=limit)
-
     @final
     @Substitution(name="groupby")
     @Substitution(see_also=_common_see_also)
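
The two deleted methods share one shape: a deprecated alias that emits a `FutureWarning` and then delegates to the canonical method. A standalone sketch of that pattern, using only the standard library, might look like the following. The `Filler` class and its list-based `ffill` are hypothetical stand-ins, and a fixed `stacklevel=2` replaces pandas' internal `find_stack_level()` helper as a simplification.

```python
import warnings


class Filler:
    """Hypothetical stand-in for a class with ffill and a deprecated alias."""

    def ffill(self, values):
        # Carry the last seen non-None value forward.
        out, last = [], None
        for v in values:
            last = v if v is not None else last
            out.append(last)
        return out

    def pad(self, values):
        # Deprecated alias: warn, then delegate to the canonical method.
        warnings.warn(
            "pad is deprecated and will be removed in a future version. "
            "Use ffill instead.",
            FutureWarning,
            stacklevel=2,  # assumption: point the warning at the direct caller
        )
        return self.ffill(values)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = Filler().pad([1, None, None, 4, None])

print(result)
print([str(w.message) for w in caught])
```

Once the deprecation period ends, removing the alias is a pure deletion, as in the hunks above, because callers were already pointed at the replacement.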

pandas/core/indexing.py

+8-4
@@ -1898,16 +1898,20 @@ def _setitem_with_indexer_2d_value(self, indexer, value):
 
         ilocs = self._ensure_iterable_column_indexer(indexer[1])
 
-        # GH#7551 Note that this coerces the dtype if we are mixed
-        value = np.array(value, dtype=object)
+        if not is_array_like(value):
+            # cast lists to array
+            value = np.array(value, dtype=object)
         if len(ilocs) != value.shape[1]:
             raise ValueError(
                 "Must have equal len keys and value when setting with an ndarray"
             )
 
         for i, loc in enumerate(ilocs):
-            # setting with a list, re-coerces
-            self._setitem_single_column(loc, value[:, i].tolist(), pi)
+            value_col = value[:, i]
+            if is_object_dtype(value_col.dtype):
+                # casting to list so that we do type inference in setitem_single_column
+                value_col = value_col.tolist()
+            self._setitem_single_column(loc, value_col, pi)
 
     def _setitem_with_indexer_frame_value(self, indexer, value: DataFrame, name: str):
         ilocs = self._ensure_iterable_column_indexer(indexer[1])
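
The per-column handling in this hunk can be illustrated outside pandas with plain NumPy. In the sketch below, `split_columns` is a hypothetical helper, and `isinstance(value, np.ndarray)` is a narrower stand-in for pandas' `is_array_like` check. Existing arrays keep their dtype, while list input is cast to an object array and each object column is handed on as a list so that a later step can infer a dtype.

```python
import numpy as np


def split_columns(value):
    """Hypothetical helper: return the columns of a 2D value one by one."""
    if not isinstance(value, np.ndarray):
        # Lists are cast to an object array so mixed types survive intact.
        value = np.array(value, dtype=object)
    columns = []
    for i in range(value.shape[1]):
        col = value[:, i]
        if col.dtype == object:
            # Object columns become lists so a later step can infer a dtype.
            col = col.tolist()
        columns.append(col)
    return columns


from_list = split_columns([[1, "x"], [2, "y"]])
from_array = split_columns(np.array([[1.5, 2.5], [3.5, 4.5]]))

print(from_list)
print([c.dtype for c in from_array])
```

The removed code always went through an object array and `tolist()`, which is the coercion the `DataFrame.loc` whatsnew entry (:issue:`49159`) describes as a bug.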

pandas/core/resample.py

-51
@@ -11,7 +11,6 @@
     final,
     no_type_check,
 )
-import warnings
 
 import numpy as np
 
@@ -50,7 +49,6 @@
     deprecate_nonkeyword_arguments,
     doc,
 )
-from pandas.util._exceptions import find_stack_level
 
 from pandas.core.dtypes.generic import (
     ABCDataFrame,
@@ -562,30 +560,6 @@ def ffill(self, limit=None):
         """
         return self._upsample("ffill", limit=limit)
 
-    def pad(self, limit=None):
-        """
-        Forward fill the values.
-
-        .. deprecated:: 1.4
-            Use ffill instead.
-
-        Parameters
-        ----------
-        limit : int, optional
-            Limit of how many values to fill.
-
-        Returns
-        -------
-        An upsampled Series.
-        """
-        warnings.warn(
-            "pad is deprecated and will be removed in a future version. "
-            "Use ffill instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
-        )
-        return self.ffill(limit=limit)
-
     def nearest(self, limit=None):
         """
         Resample by using the nearest value.
@@ -748,31 +722,6 @@ def bfill(self, limit=None):
         """
         return self._upsample("bfill", limit=limit)
 
-    def backfill(self, limit=None):
-        """
-        Backward fill the values.
-
-        .. deprecated:: 1.4
-            Use bfill instead.
-
-        Parameters
-        ----------
-        limit : int, optional
-            Limit of how many values to fill.
-
-        Returns
-        -------
-        Series, DataFrame
-            An upsampled Series or DataFrame with backward filled NaN values.
-        """
-        warnings.warn(
-            "backfill is deprecated and will be removed in a future version. "
-            "Use bfill instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
-        )
-        return self.bfill(limit=limit)
-
     def fillna(self, method, limit=None):
         """
         Fill missing values introduced by upsampling.
