BUG: misleading error creating df from 2d array #42646

Merged
merged 6 commits into from
Jul 28, 2021
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.0.rst
@@ -265,6 +265,7 @@ Groupby/resample/rolling

Reshaping
^^^^^^^^^
- Improved error message when creating a :class:`DataFrame` column from a multi-dimensional :class:`numpy.ndarray` (:issue:`42463`)
- :func:`concat` creating :class:`MultiIndex` with duplicate level entries when concatenating a :class:`DataFrame` with duplicates in :class:`Index` and multiple keys (:issue:`42651`)
- Bug in :meth:`pandas.cut` on :class:`Series` with duplicate indices (:issue:`42185`) and non-exact :meth:`pandas.CategoricalIndex` (:issue:`42425`)
-
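From the caller's side, the release note above concerns dict-of-columns construction where a value is a 2-D array. The following is a minimal sketch, assuming NumPy and pandas are installed; the variable names are illustrative and not taken from the PR. It captures whatever `ValueError` the installed pandas raises (the exact wording depends on the version) and then shows one way to reduce the array to 1-D before building the frame.

```python
import numpy as np
import pandas as pd

# Shape (2, 1): two-dimensional, so it is not valid as a single column.
col = np.array([[1], [2]])

message = None
try:
    pd.DataFrame({"a": col, "b": col})
except ValueError as err:
    # The wording depends on the pandas version; with this PR applied it
    # is expected to name the dimensionality problem directly.
    message = str(err)

# One possible fix: flatten or slice down to 1-D before constructing.
df = pd.DataFrame({"a": col.ravel(), "b": col[:, 0]})
```

Both `ravel()` and `col[:, 0]` yield 1-D arrays of length 2 here, so the resulting frame has two rows and two columns.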
2 changes: 2 additions & 0 deletions pandas/core/internals/construction.py
@@ -615,6 +615,8 @@ def _extract_index(data) -> Index:
            elif is_list_like(val) and getattr(val, "ndim", 1) == 1:
                have_raw_arrays = True
                raw_lengths.append(len(val))
            elif isinstance(val, np.ndarray) and val.ndim > 1:
                raise ValueError("Per-column arrays must each be 1-dimensional")

        if not indexes and not raw_lengths:
            raise ValueError("If using all scalar values, you must pass an index")
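The branch order in this hunk can be illustrated without pandas. The sketch below is a simplified stand-in, not pandas' actual `_extract_index`: `FakeArray` and `collect_raw_lengths` are hypothetical names, and it duck-types on `ndim` where the real code checks `isinstance(val, np.ndarray)`. Judging from the hunk, a value with `ndim > 1` previously matched no branch, so when every column was 2-D the function fell through to the "all scalar values" error; the new branch rejects such a value first.

```python
from collections.abc import Iterable


class FakeArray:
    """Hypothetical stand-in for an ndarray: exposes only ndim and a length."""

    def __init__(self, rows, ndim):
        self.rows = rows
        self.ndim = ndim

    def __len__(self):
        return len(self.rows)

    def __iter__(self):
        return iter(self.rows)


def collect_raw_lengths(columns):
    """Simplified illustration of the branch order; not pandas' actual code."""
    raw_lengths = []
    for val in columns:
        # Plain lists have no ndim attribute; treat them as 1-D.
        ndim = getattr(val, "ndim", 1)
        list_like = isinstance(val, Iterable) and not isinstance(val, (str, bytes))
        if list_like and ndim == 1:
            raw_lengths.append(len(val))
        elif ndim > 1:
            # Fail early with a message that names the real problem.
            raise ValueError("Per-column arrays must each be 1-dimensional")
    if not raw_lengths:
        raise ValueError("If using all scalar values, you must pass an index")
    return raw_lengths
```

With two plain lists of length 2 the function returns `[2, 2]`; passing a `FakeArray` with `ndim=2` raises the per-column error instead of reaching the scalar check.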
13 changes: 13 additions & 0 deletions pandas/tests/frame/test_constructors.py
@@ -2530,6 +2530,19 @@ def test_from_2d_object_array_of_periods_or_intervals(self):
        expected = DataFrame({0: pi, 1: ii, 2: pi, 3: ii})
        tm.assert_frame_equal(df3, expected)

    @pytest.mark.parametrize(
        "col_a, col_b",
        [
            ([[1], [2]], np.array([[1], [2]])),
            (np.array([[1], [2]]), [[1], [2]]),
            (np.array([[1], [2]]), np.array([[1], [2]])),
        ],
    )
    def test_error_from_2darray(self, col_a, col_b):
        msg = "Per-column arrays must each be 1-dimensional"
        with pytest.raises(ValueError, match=msg):
            DataFrame({"a": col_a, "b": col_b})


class TestDataFrameConstructorWithDtypeCoercion:
    def test_floating_values_integer_dtype(self):