DF.__setitem__ creates extension column when given extension scalar #34875

Merged: 42 commits, Jul 11, 2020
Changes from 2 commits
Commits
0ec5911
Bugfix to make DF.__setitem__ create extension column instead of obje…
justinessert Jun 19, 2020
9336955
removed bad whitespace
justinessert Jun 19, 2020
01fb076
Apply suggestions from code review
justinessert Jun 22, 2020
5c8b356
added missing :
justinessert Jun 22, 2020
2c1f640
modified cast_extension_scalar_to_array test to include an Interval type
justinessert Jun 22, 2020
d509bf4
added user-facing test for extension type bug
justinessert Jun 22, 2020
e231bb1
fixed pep8 issues
justinessert Jun 22, 2020
18ed043
added note about bug in setting series to scalar extension type
justinessert Jun 22, 2020
a6b18f4
corrected order of imports
justinessert Jun 22, 2020
cbc29be
corrected order of imports
justinessert Jun 22, 2020
2f79822
fixed black formatting errors
justinessert Jun 22, 2020
0f9178e
removed extra comma
justinessert Jun 22, 2020
bfa18fb
updated cast_scalar_to_arr to support tuple shape for extension dtype
justinessert Jun 23, 2020
e7e9a48
removed unneeded code
justinessert Jun 23, 2020
291eb2d
added coverage for datetime with timezone in extension_array test
justinessert Jun 23, 2020
3a788ed
added TODO
justinessert Jun 23, 2020
38d7ce5
correct line that was too long
justinessert Jun 23, 2020
a5e8df5
fixed dtype issue with tz test
justinessert Jun 23, 2020
5e439bd
creating distinct arrays for each column
justinessert Jun 24, 2020
6cc7959
resolving mypy error
justinessert Jun 24, 2020
7e27a6e
added docstring info and test
justinessert Jun 24, 2020
90a8570
removed unneeded import
justinessert Jun 24, 2020
39b2984
flattened else case in init
justinessert Jun 26, 2020
7a01041
refactored extension type column fix
justinessert Jun 26, 2020
03e528b
reverted docstring changes
justinessert Jun 26, 2020
7bb9553
reverted docstring changes
justinessert Jun 26, 2020
a3be9a6
removed unneeded imports
justinessert Jun 26, 2020
3a92164
reverted test changes
justinessert Jun 26, 2020
c93a847
fixed construct_1d_arraylike bug
justinessert Jun 26, 2020
966283a
reorganized if statements
justinessert Jun 30, 2020
f2aea7b
moved what's new statement to correct file
justinessert Jun 30, 2020
6495a36
created new test for period df construction
justinessert Jun 30, 2020
42e7afa
added assert_frame_equal to period_data test
justinessert Jun 30, 2020
8343df3
Using pandas array instead of df constructor for better test
justinessert Jul 7, 2020
a50a42c
changed wording
justinessert Jul 7, 2020
3452c20
Merge branch 'master' of https://github.com/justinessert/pandas
justinessert Jul 7, 2020
6f3fb51
pylint fixes
justinessert Jul 7, 2020
b95cdfc
parameterized test and added comment
justinessert Jul 8, 2020
6830fde
removed extra comma
justinessert Jul 8, 2020
6653ef8
Merge branch 'master' into master
justinessert Jul 10, 2020
c73a2de
parameterized test
justinessert Jul 10, 2020
100f334
renamed test
justinessert Jul 10, 2020
10 changes: 7 additions & 3 deletions pandas/core/dtypes/cast.py
@@ -60,6 +60,7 @@
     ExtensionDtype,
     IntervalDtype,
     PeriodDtype,
+    registry
 )
 from pandas.core.dtypes.generic import (
     ABCDataFrame,
@@ -1505,12 +1506,15 @@ def cast_scalar_to_array(shape, value, dtype: Optional[DtypeObj] = None) -> np.n

     """
     if dtype is None:
-        dtype, fill_value = infer_dtype_from_scalar(value)
+        dtype, fill_value = infer_dtype_from_scalar(value, pandas_dtype=True)
     else:
         fill_value = value

-    values = np.empty(shape, dtype=dtype)
-    values.fill(fill_value)
+    if type(dtype) in registry.dtypes:
+        values = dtype.construct_array_type()._from_sequence([value] * shape)
+    else:
+        values = np.empty(shape, dtype=dtype)
+        values.fill(fill_value)

     return values
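
The two branches in the hunk above can be sketched outside pandas internals roughly as follows. This is a minimal illustration, not the code from this PR: `scalar_to_array` is a hypothetical helper name, `is_extension_array_dtype` stands in for the `registry.dtypes` check, and the public `pd.array` stands in for the private `_from_sequence` call. The helper takes an integer length because repeating a list with `[value] * length` requires an int, which matches the `shape = 3` used in the extension-array test.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_extension_array_dtype


def scalar_to_array(length, value, dtype):
    """Hypothetical helper: repeat a scalar into a 1-D array of the given dtype."""
    if is_extension_array_dtype(dtype):
        # Extension arrays are 1-D, so build one from a repeated list.
        # pd.array is used here in place of the private _from_sequence.
        return pd.array([value] * length, dtype=dtype)
    # NumPy dtypes: allocate first, then fill with the scalar.
    values = np.empty(length, dtype=dtype)
    values.fill(value)
    return values


period = pd.Period("2011-01-01", freq="D")
period_values = scalar_to_array(3, period, pd.PeriodDtype("D"))
float_values = scalar_to_array(3, 1.5, np.dtype("float64"))
```

A tuple shape such as `(3, 2)` is deliberately not handled in this sketch.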

20 changes: 18 additions & 2 deletions pandas/tests/dtypes/cast/test_infer_dtype.py
@@ -9,6 +9,7 @@
     infer_dtype_from_scalar,
 )
 from pandas.core.dtypes.common import is_dtype_equal
+from pandas.core.dtypes.dtypes import PeriodDtype

 from pandas import (
     Categorical,
@@ -187,14 +188,29 @@ def test_infer_dtype_from_array(arr, expected, pandas_dtype):
         (1.1, np.float64),
         (Timestamp("2011-01-01"), "datetime64[ns]"),
         (Timestamp("2011-01-01", tz="US/Eastern"), object),
-        (Period("2011-01-01", freq="D"), object),
     ],
 )
-def test_cast_scalar_to_array(obj, dtype):
+def test_cast_scalar_to_numpy_array(obj, dtype):
     shape = (3, 2)

     exp = np.empty(shape, dtype=dtype)
     exp.fill(obj)

     arr = cast_scalar_to_array(shape, obj, dtype=dtype)
     tm.assert_numpy_array_equal(arr, exp)
+
+
+@pytest.mark.parametrize(
+    "obj,dtype",
+    [
+        (Period("2011-01-01", freq="D"), PeriodDtype('D')),
+        (Period("2011-01", freq="M"), PeriodDtype('M')),
+    ],
+)
+def test_cast_scalar_to_extension_array(obj, dtype):
+    shape = 3
+
+    exp = dtype.construct_array_type()._from_sequence([obj] * shape)
+
+    arr = cast_scalar_to_array(shape, obj, dtype=dtype)
+    tm.assert_extension_array_equal(arr, exp)
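
From the user's side, the behavior named in the PR title can be checked with a short script like the one below. It is a sketch that assumes a pandas release that includes this fix; the frame and the column names `x` and `p` are made up for illustration and do not come from the PR.

```python
import pandas as pd

# Hypothetical example frame; the column names are illustrative only.
df = pd.DataFrame({"x": [1, 2, 3]})

# Assign a single extension scalar to a new column via DataFrame.__setitem__.
df["p"] = pd.Period("2011-01-01", freq="D")

# With the fix, the new column should carry a period dtype
# rather than falling back to object.
column_dtype = df["p"].dtype
is_period = isinstance(column_dtype, pd.PeriodDtype)
print(column_dtype, is_period)
```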