API/BUG: Make to_json index= arg consistent with orient arg #52143

Merged on May 11, 2023 (12 commits)
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.1.0.rst
@@ -92,12 +92,12 @@ Other enhancements
- Implemented ``__pandas_priority__`` to allow custom types to take precedence over :class:`DataFrame`, :class:`Series`, :class:`Index`, or :class:`ExtensionArray` for arithmetic operations, :ref:`see the developer guide <extending.pandas_priority>` (:issue:`48347`)
- Improve error message when having incompatible columns using :meth:`DataFrame.merge` (:issue:`51861`)
- Improve error message when setting :class:`DataFrame` with wrong number of columns through :meth:`DataFrame.isetitem` (:issue:`51701`)
- Improved error handling when using :meth:`DataFrame.to_json` with incompatible ``index`` and ``orient`` arguments (:issue:`52143`)
- Improved error message when creating a DataFrame with empty data (0 rows), no index and an incorrect number of columns. (:issue:`52084`)
- Let :meth:`DataFrame.to_feather` accept a non-default :class:`Index` and non-string column names (:issue:`51787`)
- Performance improvement in :func:`read_csv` (:issue:`52632`) with ``engine="c"``
- Performance improvement in :func:`concat` with homogeneous ``np.float64`` or ``np.float32`` dtypes (:issue:`52685`)
- Performance improvement in :meth:`DataFrame.filter` when ``items`` is given (:issue:`52941`)
-

.. ---------------------------------------------------------------------------
.. _whatsnew_210.notable_bug_fixes:
11 changes: 6 additions & 5 deletions pandas/core/generic.py
@@ -2307,7 +2307,7 @@ def to_json(
         default_handler: Callable[[Any], JSONSerializable] | None = None,
         lines: bool_t = False,
         compression: CompressionOptions = "infer",
-        index: bool_t = True,
+        index: bool_t | None = None,
         indent: int | None = None,
         storage_options: StorageOptions = None,
         mode: Literal["a", "w"] = "w",
@@ -2376,10 +2376,11 @@ def to_json(

             .. versionchanged:: 1.4.0 Zstandard support.

-        index : bool, default True
-            Whether to include the index values in the JSON string. Not
-            including the index (``index=False``) is only supported when
-            orient is 'split' or 'table'.
+        index : bool or None, default None
Member

I think it would be easier to just say that the index is only used in the split, index, columns and table orients. Of those, index and columns do not support index=False.

You are kind of doing this now, but in a way that I think is a bit more confusing. If you structure the commentary and code this way, I think it will help simplify the logic.

@dshemetov (Contributor, Author), Mar 28, 2023

Made some changes, let me know what you think!

+            The index is only used when 'orient' is 'split', 'index', 'column',
+            or 'table'. Of these, 'index' and 'column' do not support
+            `index=False`.
+
         indent : int, optional
             Length of whitespace used to indent each record.

19 changes: 14 additions & 5 deletions pandas/io/json/_json.py
@@ -100,7 +100,7 @@ def to_json(
     default_handler: Callable[[Any], JSONSerializable] | None = ...,
     lines: bool = ...,
     compression: CompressionOptions = ...,
-    index: bool = ...,
+    index: bool | None = ...,
     indent: int = ...,
     storage_options: StorageOptions = ...,
     mode: Literal["a", "w"] = ...,
@@ -120,7 +120,7 @@ def to_json(
     default_handler: Callable[[Any], JSONSerializable] | None = ...,
     lines: bool = ...,
     compression: CompressionOptions = ...,
-    index: bool = ...,
+    index: bool | None = ...,
     indent: int = ...,
     storage_options: StorageOptions = ...,
     mode: Literal["a", "w"] = ...,
@@ -139,15 +139,24 @@ def to_json(
     default_handler: Callable[[Any], JSONSerializable] | None = None,
     lines: bool = False,
     compression: CompressionOptions = "infer",
-    index: bool = True,
+    index: bool | None = None,
     indent: int = 0,
     storage_options: StorageOptions = None,
     mode: Literal["a", "w"] = "w",
 ) -> str | None:
-    if not index and orient not in ["split", "table"]:
+    if orient in ["records", "values"] and index is True:
         raise ValueError(
-            "'index=False' is only valid when 'orient' is 'split' or 'table'"
+            "'index=True' is only valid when 'orient' is 'split', 'table', "
+            "'index', or 'columns'."
         )
+    elif orient in ["index", "columns"] and index is False:
+        raise ValueError(
+            "'index=False' is only valid when 'orient' is 'split', 'table', "
+            "'records', or 'values'."
+        )
+    elif index is None:
+        # will be ignored for orient='records' and 'values'
+        index = True

     if lines and orient != "records":
         raise ValueError("'lines' keyword only valid when 'orient' is records")
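
The new branching in `to_json` can be read as a small decision table over `orient` and `index`. The following is a minimal standalone sketch of that logic, not the pandas implementation itself: `resolve_index` is a hypothetical helper name introduced here for illustration, and the error messages are copied from the `pandas/io/json/_json.py` change in this PR.

```python
from __future__ import annotations


def resolve_index(orient: str, index: bool | None) -> bool:
    """Mirror the validation added in this PR (hypothetical helper, not pandas API)."""
    if orient in ["records", "values"] and index is True:
        raise ValueError(
            "'index=True' is only valid when 'orient' is 'split', 'table', "
            "'index', or 'columns'."
        )
    elif orient in ["index", "columns"] and index is False:
        raise ValueError(
            "'index=False' is only valid when 'orient' is 'split', 'table', "
            "'records', or 'values'."
        )
    elif index is None:
        # None means "not specified"; it resolves to True, which is then
        # ignored for orient='records' and 'values'.
        return True
    return index


# Print the outcome for every orient/index combination.
for orient in ["split", "table", "index", "columns", "records", "values"]:
    for index in [None, True, False]:
        try:
            outcome = resolve_index(orient, index)
        except ValueError:
            outcome = "ValueError"
        print(f"orient={orient!r:10} index={index!s:5} -> {outcome}")
```

Note that `index=None` with `orient='records'` or `'values'` resolves to `True` without raising. Only an explicit `index=True` is rejected for those orients, which is why the default had to change from `True` to `None`.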
23 changes: 20 additions & 3 deletions pandas/tests/io/json/test_pandas.py
@@ -1472,17 +1472,34 @@ def test_index_false_to_json_table(self, data):

         assert result == expected

-    @pytest.mark.parametrize("orient", ["records", "index", "columns", "values"])
+    @pytest.mark.parametrize("orient", ["index", "columns"])
     def test_index_false_error_to_json(self, orient):
-        # GH 17394
+        # GH 17394, 25513
         # Testing error message from to_json with index=False

         df = DataFrame([[1, 2], [4, 5]], columns=["a", "b"])

-        msg = "'index=False' is only valid when 'orient' is 'split' or 'table'"
+        msg = (
+            "'index=False' is only valid when 'orient' is 'split', "
+            "'table', 'records', or 'values'"
+        )
         with pytest.raises(ValueError, match=msg):
             df.to_json(orient=orient, index=False)

+    @pytest.mark.parametrize("orient", ["records", "values"])
+    def test_index_true_error_to_json(self, orient):
+        # GH 25513
+        # Testing error message from to_json with index=True
+
+        df = DataFrame([[1, 2], [4, 5]], columns=["a", "b"])
+
+        msg = (
+            "'index=True' is only valid when 'orient' is 'split', "
+            "'table', 'index', or 'columns'"
+        )
+        with pytest.raises(ValueError, match=msg):
+            df.to_json(orient=orient, index=True)
+
     @pytest.mark.parametrize("orient", ["split", "table"])
     @pytest.mark.parametrize("index", [True, False])
     def test_index_false_from_json_to_json(self, orient, index):
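
The expected messages in these tests omit the trailing period that the raised messages carry. That works because `pytest.raises(..., match=...)` applies `re.search`, so the pattern only needs to match somewhere in the exception text. A minimal sketch of the difference, using only the standard library and strings copied from the diff:

```python
import re

# Message raised by to_json in this PR (note the trailing period).
raised = (
    "'index=False' is only valid when 'orient' is 'split', 'table', "
    "'records', or 'values'."
)

# Pattern used in the test (no trailing period).
msg = (
    "'index=False' is only valid when 'orient' is 'split', "
    "'table', 'records', or 'values'"
)

# re.search succeeds when the pattern matches anywhere in the string.
found = re.search(msg, raised) is not None
# re.fullmatch requires the entire string to match, so the period breaks it.
exact = re.fullmatch(msg, raised) is not None

print(found, exact)
```

If a message ever contained regex metacharacters such as parentheses, wrapping the pattern in `re.escape` would keep the match literal.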