
Commit 7d772d5

BUG: make JSONTableWriter fail if no index.name and 'index' in columns
This commit is intended to fix GH #58925.

If index.name is empty, set_default_names is now applied inside __init__ so that the check for overlapping names fails. Without this, the default name is only applied during schema creation and is not reflected on the DataFrame itself, which creates an inconsistency between the data and its schema.

Also add a mention of the raised error to the `to_json` documentation, and move the description of the new logic from the IO docs to the to_json docstring.
1 parent ab433af commit 7d772d5
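
The ordering issue described in the commit message can be illustrated without pandas. The following is a minimal sketch, not the library's implementation: `fill_default_index_names` and `check_overlap` are hypothetical stand-ins for the default-naming step and the overlap check, and the "index" / "level_N" defaults are assumed from the behavior the commit message describes.

```python
from typing import List, Optional, Sequence


def fill_default_index_names(names: Sequence[Optional[str]]) -> List[str]:
    """Hypothetical stand-in for the default-naming step.

    Assumption: a single unnamed index becomes "index", and unnamed levels
    of a multi-level index become "level_0", "level_1", and so on.
    """
    if len(names) == 1:
        return [names[0] or "index"]
    return [name if name is not None else f"level_{i}" for i, name in enumerate(names)]


def check_overlap(index_names: Sequence[Optional[str]], columns: Sequence[str]) -> None:
    """Hypothetical stand-in for the writer's overlap check."""
    if set(index_names) & set(columns):
        raise ValueError("Overlapping names between the index and columns")


columns = ["index", "a"]
raw_names = [None]  # the index has no name

# Checking the raw names passes silently, because None is not a column name.
check_overlap(raw_names, columns)

# Filling in the defaults first exposes the clash with the "index" column.
try:
    check_overlap(fill_default_index_names(raw_names), columns)
except ValueError as exc:
    print(exc)  # Overlapping names between the index and columns
```

Filling in the default names before the check keeps the data and the schema in agreement: either both use the generated name, or the write is rejected.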

4 files changed

+13
-1
lines changed

doc/source/whatsnew/v3.0.0.rst

+1
@@ -559,6 +559,7 @@ MultiIndex
 I/O
 ^^^
 - Bug in :class:`DataFrame` and :class:`Series` ``repr`` of :py:class:`collections.abc.Mapping`` elements. (:issue:`57915`)
+- Bug in :meth:`.DataFrame.to_json` was producing corrupted record (data incompatible with schema) if 'index' was the name of a column and index.name was empty (which is replaced with generic 'index' internally), now it will fail on check if index.name is in columns (:issue:`58925`)
 - Bug in :meth:`DataFrame.to_dict` raises unnecessary ``UserWarning`` when columns are not unique and ``orient='tight'``. (:issue:`58281`)
 - Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
 - Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)

pandas/core/generic.py

+2-1
@@ -2387,7 +2387,8 @@ def to_json(
         index : bool or None, default None
             The index is only used when 'orient' is 'split', 'index', 'column',
             or 'table'. Of these, 'index' and 'column' do not support
-            `index=False`.
+            `index=False`. The string 'index' as a column name with empty :class:`Index`
+            or if it is 'index' will raise a ``ValueError``.
 
         indent : int, optional
             Length of whitespace used to indent each record.

pandas/io/json/_json.py

+3
@@ -59,6 +59,7 @@
 from pandas.io.json._table_schema import (
     build_table_schema,
     parse_table_schema,
+    set_default_names,
 )
 from pandas.io.parsers.readers import validate_integer
 
@@ -353,6 +354,8 @@ def __init__(
             raise ValueError(msg)
 
         self.schema = build_table_schema(obj, index=self.index)
+        if self.index:
+            obj = set_default_names(obj)
 
         # NotImplemented on a column MultiIndex
         if obj.ndim == 2 and isinstance(obj.columns, MultiIndex):

pandas/tests/io/json/test_pandas.py

+7
@@ -1610,6 +1610,13 @@ def test_to_json_from_json_columns_dtypes(self, orient):
         )
         tm.assert_frame_equal(result, expected)
 
+    def test_to_json_with_index_as_a_column_name(self):
+        df = DataFrame(data={"index": [1, 2], "a": [2, 3]})
+        with pytest.raises(
+            ValueError, match="Overlapping names between the index and columns"
+        ):
+            df.to_json(orient="table")
+
     @pytest.mark.parametrize("dtype", [True, {"b": int, "c": int}])
     def test_read_json_table_dtype_raises(self, dtype):
         # GH21345
