API/BUG: Make to_json index= arg consistent with orient arg #52143

Merged
12 commits merged on May 11, 2023
17 changes: 12 additions & 5 deletions pandas/core/generic.py
@@ -2268,7 +2268,7 @@ def to_json(
default_handler: Callable[[Any], JSONSerializable] | None = None,
lines: bool_t = False,
compression: CompressionOptions = "infer",
- index: bool_t = True,
+ index: bool_t | None = None,
indent: int | None = None,
storage_options: StorageOptions = None,
mode: Literal["a", "w"] = "w",
@@ -2337,10 +2337,17 @@ def to_json(

.. versionchanged:: 1.4.0 Zstandard support.

- index : bool, default True
- Whether to include the index values in the JSON string. Not
- including the index (``index=False``) is only supported when
- orient is 'split' or 'table'.
+ index : bool or None, default None
Member

I think it would be easier to just say that the index is only used in the split, index, columns and table orients. Of those, the index and columns orients cannot take index=False.

You are kind of doing this now, but in a way that I think is a bit more confusing. If you structure the commentary and code this way, I think it will help simplify the logic.

Contributor Author

@dshemetov dshemetov Mar 28, 2023

Made some changes, let me know what you think!

+ Whether to include the index values in the JSON string. Different
+ defaults and options depend on the 'orient' argument:
+
+ - 'split': default True, can also be False
+ - 'records': default False, cannot be True
+ - 'index': default True, cannot be False
+ - 'columns': default True, cannot be False
+ - 'values': default False, cannot be True
+ - 'table': default True, can also be False

indent : int, optional
Length of whitespace used to indent each record.
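
The per-orient rules in the new docstring can be read as a small lookup table. The following is a minimal sketch in plain Python, assuming the hypothetical names `INDEX_RULES` and `describe_index_option` (neither exists in pandas). It only restates the docstring and does not call `to_json`:

```python
# Hypothetical lookup table restating the docstring above; not part of pandas.
# orient -> (default value of index, set of index values the orient accepts)
INDEX_RULES = {
    "split": (True, {True, False}),
    "records": (False, {False}),
    "index": (True, {True}),
    "columns": (True, {True}),
    "values": (False, {False}),
    "table": (True, {True, False}),
}


def describe_index_option(orient: str) -> str:
    """Render one docstring bullet from the table (illustrative helper)."""
    default, allowed = INDEX_RULES[orient]
    if len(allowed) == 2:
        note = f"can also be {not default}"
    else:
        note = f"cannot be {not default}"
    return f"'{orient}': default {default}, {note}"


for orient in INDEX_RULES:
    print("-", describe_index_option(orient))
```

Keeping the default and the allowed values side by side makes it easy to see that only 'split' and 'table' accept both settings.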

22 changes: 17 additions & 5 deletions pandas/io/json/_json.py
@@ -101,7 +101,7 @@ def to_json(
default_handler: Callable[[Any], JSONSerializable] | None = ...,
lines: bool = ...,
compression: CompressionOptions = ...,
- index: bool = ...,
+ index: bool | None = ...,
indent: int = ...,
storage_options: StorageOptions = ...,
mode: Literal["a", "w"] = ...,
@@ -121,7 +121,7 @@ def to_json(
default_handler: Callable[[Any], JSONSerializable] | None = ...,
lines: bool = ...,
compression: CompressionOptions = ...,
- index: bool = ...,
+ index: bool | None = ...,
indent: int = ...,
storage_options: StorageOptions = ...,
mode: Literal["a", "w"] = ...,
@@ -140,14 +140,26 @@ def to_json(
default_handler: Callable[[Any], JSONSerializable] | None = None,
lines: bool = False,
compression: CompressionOptions = "infer",
- index: bool = True,
+ index: bool | None = None,
indent: int = 0,
storage_options: StorageOptions = None,
mode: Literal["a", "w"] = "w",
) -> str | None:
- if not index and orient not in ["split", "table"]:
+ if index is None and orient in ["records", "values"]:
Member

Is this branch necessary? Or can you just change L151 to set it to True for appropriate orients?

+ index = False
+ elif index is None:
+ index = True
+
+ if not index and orient not in ["split", "table", "records", "values"]:
+ raise ValueError(
+ "'index=False' is only valid when 'orient' is 'split', 'table', "
+ "'records', or 'values'"
+ )
+
+ if index and orient in ["records", "values"]:
raise ValueError(
- "'index=False' is only valid when 'orient' is 'split' or 'table'"
+ "'index=True' is only valid when 'orient' is 'split', 'table', "
+ "'index', or 'columns'. Convert index to column for other orients."
)

if lines and orient != "records":
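
The validation added in this hunk can be exercised in isolation. The sketch below copies the branch order from the diff into a hypothetical standalone function, `resolve_index` (the real checks sit inline in `to_json`). It then compares that function with one possible reading of the reviewer's suggestion, `resolve_index_compact`. Both helper names and the shared `_validate` function are assumptions made for illustration, not the merged implementation:

```python
from __future__ import annotations

ORIENTS = ["split", "records", "index", "columns", "values", "table"]


def _validate(orient: str, index: bool) -> bool:
    """Raise for the combinations rejected in the hunk above."""
    if not index and orient not in ["split", "table", "records", "values"]:
        raise ValueError(
            "'index=False' is only valid when 'orient' is 'split', 'table', "
            "'records', or 'values'"
        )
    if index and orient in ["records", "values"]:
        raise ValueError(
            "'index=True' is only valid when 'orient' is 'split', 'table', "
            "'index', or 'columns'. Convert index to column for other orients."
        )
    return index


def resolve_index(orient: str, index: bool | None) -> bool:
    """Default resolution with the same two branches as the diff."""
    if index is None and orient in ["records", "values"]:
        index = False
    elif index is None:
        index = True
    return _validate(orient, index)


def resolve_index_compact(orient: str, index: bool | None) -> bool:
    """Assumed simplification: derive the default in a single expression."""
    if index is None:
        index = orient not in ["records", "values"]
    return _validate(orient, index)


def outcome(fn, orient: str, index: bool | None):
    """Return the resolved value, or the error message if the call raises."""
    try:
        return fn(orient, index)
    except ValueError as err:
        return str(err)


# Both variants agree on every documented orient and every index setting.
for orient in ORIENTS:
    for index in (None, True, False):
        assert outcome(resolve_index, orient, index) == outcome(
            resolve_index_compact, orient, index
        )
```

Under these assumptions the two variants differ only in how the default is derived, so collapsing the branch would not change which combinations raise.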
23 changes: 20 additions & 3 deletions pandas/tests/io/json/test_pandas.py
@@ -1476,17 +1476,34 @@ def test_index_false_to_json_table(self, data):

assert result == expected

- @pytest.mark.parametrize("orient", ["records", "index", "columns", "values"])
+ @pytest.mark.parametrize("orient", ["index", "columns"])
def test_index_false_error_to_json(self, orient):
- # GH 17394
+ # GH 17394, 25513
# Testing error message from to_json with index=False

df = DataFrame([[1, 2], [4, 5]], columns=["a", "b"])

- msg = "'index=False' is only valid when 'orient' is 'split' or 'table'"
+ msg = (
+ "'index=False' is only valid when 'orient' is 'split', "
+ "'table', 'records', or 'values'"
+ )
with pytest.raises(ValueError, match=msg):
df.to_json(orient=orient, index=False)

+ @pytest.mark.parametrize("orient", ["records", "values"])
+ def test_index_true_error_to_json(self, orient):
+ # GH 25513
+ # Testing error message from to_json with index=True
+
+ df = DataFrame([[1, 2], [4, 5]], columns=["a", "b"])
+
+ msg = (
+ "'index=True' is only valid when 'orient' is 'split', "
+ "'table', 'index', or 'columns'"
+ )
+ with pytest.raises(ValueError, match=msg):
+ df.to_json(orient=orient, index=True)

@pytest.mark.parametrize("orient", ["split", "table"])
@pytest.mark.parametrize("index", [True, False])
def test_index_false_from_json_to_json(self, orient, index):
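
Note that the `msg` in `test_index_true_error_to_json` is shorter than the message raised in `_json.py`, which also ends with "Convert index to column for other orients." The test can still pass because `pytest.raises(..., match=...)` checks the pattern with `re.search`, so a matching substring is enough. Here is a minimal sketch using only the standard library, with both strings copied from the diff above:

```python
import re

# Message raised by the new check in pandas/io/json/_json.py (from the diff).
raised = (
    "'index=True' is only valid when 'orient' is 'split', 'table', "
    "'index', or 'columns'. Convert index to column for other orients."
)

# Pattern used in test_index_true_error_to_json (from the diff).
expected = (
    "'index=True' is only valid when 'orient' is 'split', "
    "'table', 'index', or 'columns'"
)

# re.search succeeds because the pattern occurs inside the raised message.
assert re.search(expected, raised) is not None

# A full-string match would fail, since the raised message has a trailing sentence.
assert re.fullmatch(expected, raised) is None
```

The pattern here contains no regex metacharacters, so it matches literally. A message containing characters such as parentheses would need `re.escape` before being passed to `match`.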