BUG: fix construction of Series from dict with nested lists #18626
Changes from all commits
9ef97f7
f8ac3e6
c6aa204
14f0443
@@ -4048,7 +4048,8 @@ def _try_cast(arr, take_fast_path):

# GH #846
if isinstance(data, (np.ndarray, Index, Series)):

if data.ndim > 1:
raise ValueError('Data must be 1-dimensional')
if dtype is not None:
subarr = np.array(data, copy=False)

@@ -4085,7 +4086,9 @@ def _try_cast(arr, take_fast_path):
return subarr

elif isinstance(data, (list, tuple)) and len(data) > 0:
if dtype is not None:
if all(is_list_like(item) for item in data):
not sure this is the appropriate place for this.
(let me try to persuade you that other things were out of place!)
For performance, can't we first try the
Not sure this would be simple to do... but anyway, notice that
Just tested:
and the time for such checking is actually negligible compared to the time that the actual conversion to array takes. So indeed not needed for performance reasons. (If that had not been the case, I think keeping the 'normal' list -> Series path fast would take priority over a possible slowdown when converting a list of lists.)
I totally agree
By the way: if/when there will be an
I would like to move this logic to
subarr = construct_1d_object_array_from_listlike(data)
elif dtype is not None:
try:
subarr = _try_cast(data, False)
except Exception:

@@ -4107,11 +4110,12 @@ def _try_cast(arr, take_fast_path):
else:
subarr = _try_cast(data, False)

# scalar like, GH
if getattr(subarr, 'ndim', 0) == 0:
if isinstance(data, list): # pragma: no cover
subarr = np.array(data, dtype=object)
elif index is not None:
if subarr.ndim == 0 or is_scalar(data):
if subarr.ndim == 1:
# a scalar upcasted to 1-dimensional by maybe_cast_to_datetime()
value = subarr[0]
how is this branch ever hit?
dtype = subarr.dtype
else:
value = data

# figure out the dtype from the value (upcast if necessary)

@@ -4121,26 +4125,7 @@ def _try_cast(arr, take_fast_path):
# need to possibly convert the value here
value = maybe_cast_to_datetime(value, dtype)

subarr = construct_1d_arraylike_from_scalar(
value, len(index), dtype)

else:
return subarr.item()

# the result that we want
elif subarr.ndim == 1:
if index is not None:

# a 1-element ndarray
if len(subarr) != len(index) and len(subarr) == 1:
subarr = construct_1d_arraylike_from_scalar(
subarr[0], len(index), subarr.dtype)

elif subarr.ndim > 1:
if isinstance(data, np.ndarray):
raise Exception('Data must be 1-dimensional')
else:
subarr = com._asarray_tuplesafe(data, dtype=dtype)
subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)

# This is to prevent mixed-type Series getting all casted to
# NumPy string type, e.g. NaN --> '-1#IND'.
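For readers following the thread on the all(is_list_like(item) for item in data) check, here is a minimal sketch of why nested lists need their own path and why the check costs little on flat lists. The names object_array_1d and all_items_list_like are hypothetical and exist only for this illustration; the first approximates what construct_1d_object_array_from_listlike is expected to produce and is not the pandas implementation.

```python
import timeit

import numpy as np
from pandas.api.types import is_list_like


def object_array_1d(values):
    """Hypothetical helper: pack each item into a 1-D object array."""
    result = np.empty(len(values), dtype=object)
    for i, item in enumerate(values):
        result[i] = item  # element-wise assignment keeps inner lists intact
    return result


def all_items_list_like(data):
    """The check from the diff; all() stops at the first non-list-like item."""
    return all(is_list_like(item) for item in data)


nested = [[1, 2], [3, 4]]
naive = np.array(nested)          # NumPy infers a 2-D integer array
packed = object_array_1d(nested)  # 1-D array whose elements are lists

# Rough timing on a flat list (numbers vary by machine, so none are shown).
flat = list(range(100_000))
check_time = timeit.timeit(lambda: all_items_list_like(flat), number=100)
convert_time = timeit.timeit(lambda: np.array(flat), number=100)
print(f"check: {check_time:.6f}s, conversion: {convert_time:.6f}s")
```

On a flat list the generator fails on the first element, so the check returns almost immediately, which is consistent with the reviewer's observation that its cost is negligible next to the array conversion.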
@@ -3215,8 +3215,7 @@ def test_nan_stays_float(self):
assert pd.isna(idx0.get_level_values(1)).all()
# the following failed in 0.14.1
assert pd.isna(idxm.get_level_values(1)[:-1]).all()

df0 = pd.DataFrame([[1, 2]], index=idx0)
what is the issue here?
the (outer) list has a different length than the index
df0 = pd.DataFrame([[1, 2]] * 2, index=idx0)
df1 = pd.DataFrame([[3, 4]], index=idx1)
dfm = df0 - df1
assert pd.isna(df0.index.get_level_values(1)).all()
actually should error on data.ndim != 1 (e.g. a numpy scalar of ndim==0 I think hits this path maybe)

can you add tests to cover this case

Shouldn't we allow numpy scalars? (just like normal scalars work, e.g. like Series(1))

Sure we do! And this is widely tested (as opposed to the initialization with length 1 list-likes, which was working by accident, accidentally used in the tests, and I now disabled).
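The distinction drawn in this thread can be sketched as follows, assuming a recent pandas release; the index labels are invented for the example. A NumPy scalar has ndim == 0 and is broadcast over the index just like a Python scalar, while a length-1 list is treated as data whose length must match the index.

```python
import numpy as np
import pandas as pd

index = ["a", "b", "c"]  # hypothetical labels

# Python and NumPy scalars are both broadcast to the length of the index.
from_python_scalar = pd.Series(1, index=index)
from_numpy_scalar = pd.Series(np.int64(1), index=index)
numpy_scalar_ndim = np.ndim(np.int64(1))  # NumPy scalars are 0-dimensional

# A length-1 list is not a scalar, so it is not broadcast.
try:
    pd.Series([1], index=index)
    length_one_error = None
except ValueError as exc:
    length_one_error = str(exc)

print(from_numpy_scalar.tolist())
print(length_one_error)
```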