Commit 8462515

Bug: Interchange protocol implementation does not allow for empty string columns (#56788)
* Handle non-string object dtypes in DataFrame interchange protocol
* Add test
* Add 'whats new'
* Update pandas/core/interchange/column.py

  Co-authored-by: Matthew Roeschke <[email protected]>
* Update pandas/tests/interchange/test_impl.py

  Co-authored-by: Matthew Roeschke <[email protected]>
* resolve checks
* Update not needed

---------

Co-authored-by: Matthew Roeschke <[email protected]>
1 parent 1acf0d6 commit 8462515

3 files changed, +10 -2 lines changed

doc/source/whatsnew/v2.2.0.rst

+1 -1
@@ -936,6 +936,7 @@ Other
 - Bug in :func:`cut` and :func:`qcut` with ``datetime64`` dtype values with non-nanosecond units incorrectly returning nanosecond-unit bins (:issue:`56101`)
 - Bug in :func:`cut` incorrectly allowing cutting of timezone-aware datetimes with timezone-naive bins (:issue:`54964`)
 - Bug in :func:`infer_freq` and :meth:`DatetimeIndex.inferred_freq` with weekly frequencies and non-nanosecond resolutions (:issue:`55609`)
+- Bug in :func:`pd.api.interchange.from_dataframe` where it raised ``NotImplementedError`` when handling empty string columns (:issue:`56703`)
 - Bug in :meth:`DataFrame.apply` where passing ``raw=True`` ignored ``args`` passed to the applied function (:issue:`55009`)
 - Bug in :meth:`DataFrame.from_dict` which would always sort the rows of the created :class:`DataFrame`. (:issue:`55683`)
 - Bug in :meth:`DataFrame.sort_index` when passing ``axis="columns"`` and ``ignore_index=True`` raising a ``ValueError`` (:issue:`56478`)
@@ -944,7 +945,6 @@ Other
 - Bug in the error message when assigning an empty :class:`DataFrame` to a column (:issue:`55956`)
 - Bug when time-like strings were being cast to :class:`ArrowDtype` with ``pyarrow.time64`` type (:issue:`56463`)
 
-
 .. ---------------------------------------------------------------------------
 .. _whatsnew_220.contributors:

pandas/core/interchange/column.py

+1 -1
@@ -116,7 +116,7 @@ def dtype(self) -> tuple[DtypeKind, int, str, str]:
                 Endianness.NATIVE,
             )
         elif is_string_dtype(dtype):
-            if infer_dtype(self._col) == "string":
+            if infer_dtype(self._col) in ("string", "empty"):
                 return (
                     DtypeKind.STRING,
                     8,
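
To illustrate why the one-line change above is needed, the following is a minimal sketch, assuming a recent pandas install, of how `infer_dtype` classifies an object-dtype column with no values. The helper names `passes_old_check` and `passes_new_check` are hypothetical and only mirror the two conditions in the diff; they are not part of pandas.

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Two object-dtype columns: one holding strings, one holding no values at all.
string_col = pd.Series(["a", "b"], dtype=object)
empty_col = pd.Series([], dtype=object)

# infer_dtype inspects the values, so a column with no values is reported
# as "empty" rather than "string".
string_kind = infer_dtype(string_col)
empty_kind = infer_dtype(empty_col)


# Hypothetical helpers that mirror the condition before and after the commit.
def passes_old_check(col: pd.Series) -> bool:
    return infer_dtype(col) == "string"


def passes_new_check(col: pd.Series) -> bool:
    return infer_dtype(col) in ("string", "empty")


print(string_kind, empty_kind)
print(passes_old_check(empty_col), passes_new_check(empty_col))
```

Under the old condition an empty column falls through the string branch, which is consistent with the ``NotImplementedError`` described in the whatsnew entry; the new condition accepts both cases.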

pandas/tests/interchange/test_impl.py

+8
@@ -355,6 +355,14 @@ def test_interchange_from_corrected_buffer_dtypes(monkeypatch) -> None:
     pd.api.interchange.from_dataframe(df)
 
 
+def test_empty_string_column():
+    # https://github.com/pandas-dev/pandas/issues/56703
+    df = pd.DataFrame({"a": []}, dtype=str)
+    df2 = df.__dataframe__()
+    result = pd.api.interchange.from_dataframe(df2)
+    tm.assert_frame_equal(df, result)
+
+
 def test_large_string():
     # GH#56702
     pytest.importorskip("pyarrow")
