DEPR: 'epoch' date format in to_json #57987

Merged: 12 commits, Apr 19, 2024
7 changes: 0 additions & 7 deletions doc/source/user_guide/io.rst
@@ -1949,13 +1949,6 @@ Writing in ISO date format, with microseconds:
    json = dfd.to_json(date_format="iso", date_unit="us")
    json

-Epoch timestamps, in seconds:
-
-.. ipython:: python
-
-   json = dfd.to_json(date_format="epoch", date_unit="s")
-   json
-
 Writing to a file, with a date index and a date column:

 .. ipython:: python
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.2.2.rst
@@ -30,7 +30,7 @@ Bug fixes

 Other
 ~~~~~
--
+- Deprecated using ``epoch`` date format in :meth:`DataFrame.to_json` and :meth:`Series.to_json`, use ``iso`` instead.

 .. ---------------------------------------------------------------------------
 .. _whatsnew_222.contributors:
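
For consumers that currently parse epoch integers and need to move to ISO strings, the following standard-library sketch shows how the two encodings of the same naive timestamp relate. The helper names are hypothetical and not part of pandas; the sample value is the one used in the tests further down.

```python
from datetime import datetime, timedelta

# Hypothetical helpers, not part of pandas: they only illustrate how the two
# encodings of the same naive timestamp correspond.
_UNIX_EPOCH = datetime(1970, 1, 1)


def epoch_ms_to_iso(ms: int) -> str:
    """Convert milliseconds since the Unix epoch to an ISO 8601 string."""
    moment = _UNIX_EPOCH + timedelta(milliseconds=ms)
    return moment.isoformat(timespec="milliseconds")


def iso_to_epoch_ms(text: str) -> int:
    """Convert a naive ISO 8601 string back to milliseconds since the epoch."""
    return (datetime.fromisoformat(text) - _UNIX_EPOCH) // timedelta(milliseconds=1)


iso = epoch_ms_to_iso(1577836800000)
print(iso)                   # ISO form of the epoch value
print(iso_to_epoch_ms(iso))  # round-trips to the original integer
```

The sketch assumes timezone-naive values and millisecond resolution; other units would need a different divisor.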
13 changes: 13 additions & 0 deletions pandas/core/generic.py
@@ -2328,6 +2328,11 @@ def to_json(
             'iso' = ISO8601. The default depends on the `orient`. For
             ``orient='table'``, the default is 'iso'. For all other orients,
             the default is 'epoch'.
+
+            .. deprecated:: 2.2.2
+                'epoch' date format is deprecated and will be removed in a future
+                version, please use 'iso' instead.
+
         double_precision : int, default 10
             The number of decimal places to use when encoding
             floating point values. The possible maximal value is 15.
@@ -2530,6 +2535,14 @@ def to_json(
             date_format = "iso"
         elif date_format is None:
Member

I think we need to warn for anything that is not currently iso, including when date_format is None (although the message will be different).

Member

Can you also add a warning for the date_format=None case that previously defaulted to "epoch"? In the future this should default to "iso".

Member Author

Most to_json use cases don't involve dates and aren't affected by the date_format value, so warning in those cases might be unnecessary: users would have to pass date_format='iso' for no reason just to silence it. Are you sure we should do this?

Member

Sorry, to be more specific: we need to warn when date_format=None and we actually serialize timestamp types. I agree there is no point in warning if a DataFrame has no timestamp type, but users who rely on the default epoch behavior need to be warned of the change.

Member

@WillAyd curious: how would users get the old behavior? It would be good to add that to the warning message.

Member

The old behavior as in just an integer? I think the problem with that is it was an implementation detail of pandas spilling out into the JSON serializer. Historically our timestamps were exclusively nanoseconds since the Unix epoch, but with all the work @jbrockmendel has been doing that is no longer true (and _usually_ not true).

Member

> The old behavior as in just an integer?

Yeah. Just checking whether we can still suggest a migration path for users who want to keep the old behavior.

Member

I don't think so. Especially with our auto-inference of resolutions, I don't see how it would be usable at all when round-tripping through JSON.

Member

OK sounds good
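
The rule discussed in this thread (warn on the implicit default only when timestamp values are actually serialized) could look roughly like the sketch below. It is a standalone illustration, not the code in this PR: `resolve_date_format` is a hypothetical helper, the values are plain Python objects rather than pandas columns, and the warning text is invented for the example.

```python
import warnings
from datetime import date, datetime


def resolve_date_format(values, date_format=None, orient="columns"):
    """Hypothetical helper sketching the rule proposed in the review thread."""
    if date_format is None and orient == "table":
        return "iso"
    if date_format is None:
        # Warn only if a date-like value would actually be written, so callers
        # without timestamps are not forced to pass date_format="iso".
        if any(isinstance(v, (date, datetime)) for v in values):
            warnings.warn(
                # Illustrative wording, not the message pandas emits.
                "The default date_format will change from 'epoch' to 'iso' "
                "in a future version.",
                FutureWarning,
                stacklevel=2,
            )
        return "epoch"
    return date_format
```

With this shape, `resolve_date_format([1, 2])` stays silent, while passing a list containing a `datetime` and no explicit `date_format` emits a `FutureWarning`.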

             date_format = "epoch"
+        elif date_format == "epoch":
+            # GH#57063
+            warnings.warn(
+                "'Epoch' date format is deprecated and will be removed in a future "
+                "version, please use 'iso' date format instead.",
+                FutureWarning,
+                stacklevel=find_stack_level(),
+            )

         config.is_nonnegative_int(indent)
         indent = indent or 0
11 changes: 8 additions & 3 deletions pandas/tests/io/json/test_json_table_schema.py
@@ -451,12 +451,17 @@ def test_to_json_categorical_index(self):
         assert result == expected

     def test_date_format_raises(self, df_table):
-        msg = (
+        error_msg = (
             "Trying to write with `orient='table'` and `date_format='epoch'`. Table "
             "Schema requires dates to be formatted with `date_format='iso'`"
         )
-        with pytest.raises(ValueError, match=msg):
-            df_table.to_json(orient="table", date_format="epoch")
+        warning_msg = (
+            "'Epoch' date format is deprecated and will be removed in a future "
+            "version, please use 'iso' date format instead."
+        )
+        with pytest.raises(ValueError, match=error_msg):
+            with tm.assert_produces_warning(FutureWarning, match=warning_msg):
+                df_table.to_json(orient="table", date_format="epoch")

         # others work
         df_table.to_json(orient="table", date_format="iso")
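
The updated test expects a single call to produce both a `FutureWarning` and a `ValueError`. The same check can be sketched with only the standard library, shown here against a stand-in function. `write_table` and its messages are hypothetical; the sketch assumes the warning is emitted before the error is raised, which is what the nested context managers in the test rely on.

```python
import warnings


def write_table(date_format):
    """Hypothetical stand-in: warns about 'epoch', then rejects it for table output."""
    if date_format == "epoch":
        warnings.warn(
            "'epoch' is deprecated (stand-in message)", FutureWarning, stacklevel=2
        )
        raise ValueError("table output requires date_format='iso'")
    return "ok"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning, even repeated ones
    try:
        write_table("epoch")
    except ValueError as err:
        error = err  # the warning above was already recorded at this point
    else:
        error = None

print(type(error).__name__, [w.category.__name__ for w in caught])
```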
38 changes: 34 additions & 4 deletions pandas/tests/io/json/test_pandas.py
@@ -835,15 +835,23 @@ def test_date_index_and_values(self, date_format, as_object, date_typ):
             data.append("a")

         ser = Series(data, index=data)
-        result = ser.to_json(date_format=date_format)

+        expected_warning = None
         if date_format == "epoch":
             expected = '{"1577836800000":1577836800000,"null":null}'
+            expected_warning = FutureWarning
         else:
             expected = (
                 '{"2020-01-01T00:00:00.000":"2020-01-01T00:00:00.000","null":null}'
             )

+        msg = (
+            "'Epoch' date format is deprecated and will be removed in a future "
+            "version, please use 'iso' date format instead."
+        )
+        with tm.assert_produces_warning(expected_warning, match=msg):
+            result = ser.to_json(date_format=date_format)
+
         if as_object:
             expected = expected.replace("}", ',"a":"a"}')

@@ -940,7 +948,12 @@ def test_date_unit(self, unit, datetime_frame):
         df.iloc[2, dl] = Timestamp("21460101 20:43:42")
         df.iloc[4, dl] = pd.NaT

-        json = df.to_json(date_format="epoch", date_unit=unit)
+        msg = (
+            "'Epoch' date format is deprecated and will be removed in a future "
+            "version, please use 'iso' date format instead."
+        )
+        with tm.assert_produces_warning(FutureWarning, match=msg):
+            json = df.to_json(date_format="epoch", date_unit=unit)

         # force date unit
         result = read_json(StringIO(json), date_unit=unit)
@@ -1106,17 +1119,24 @@ def test_timedelta_to_json(self, as_object, date_format, timedelta_typ):
             data.append("a")

         ser = Series(data, index=data)
+        expected_warning = None
         if date_format == "iso":
             expected = (
                 '{"P1DT0H0M0S":"P1DT0H0M0S","P2DT0H0M0S":"P2DT0H0M0S","null":null}'
             )
         else:
+            expected_warning = FutureWarning
             expected = '{"86400000":86400000,"172800000":172800000,"null":null}'

         if as_object:
             expected = expected.replace("}", ',"a":"a"}')

-        result = ser.to_json(date_format=date_format)
+        msg = (
+            "'Epoch' date format is deprecated and will be removed in a future "
+            "version, please use 'iso' date format instead."
+        )
+        with tm.assert_produces_warning(expected_warning, match=msg):
+            result = ser.to_json(date_format=date_format)
         assert result == expected

     @pytest.mark.parametrize("as_object", [True, False])
@@ -1669,7 +1689,17 @@ def test_read_json_with_very_long_file_path(self, compression):
     def test_timedelta_as_label(self, date_format, key):
         df = DataFrame([[1]], columns=[pd.Timedelta("1D")])
         expected = f'{{"{key}":{{"0":1}}}}'
-        result = df.to_json(date_format=date_format)
+
+        expected_warning = None
+        if date_format == "epoch":
+            expected_warning = FutureWarning
+
+        msg = (
+            "'Epoch' date format is deprecated and will be removed in a future "
+            "version, please use 'iso' date format instead."
+        )
+        with tm.assert_produces_warning(expected_warning, match=msg):
+            result = df.to_json(date_format=date_format)

         assert result == expected

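
The timedelta tests above pair an epoch value in milliseconds (`86400000`) with an ISO 8601 duration (`P1DT0H0M0S`). As a rough standard-library illustration of that correspondence, the sketch below formats a `datetime.timedelta` both ways. Both helpers are hypothetical, and the ISO formatter assumes non-negative durations with whole seconds.

```python
from datetime import timedelta


def timedelta_to_iso(td: timedelta) -> str:
    """Hypothetical helper: format a non-negative timedelta as PnDTnHnMnS."""
    hours, remainder = divmod(td.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"P{td.days}DT{hours}H{minutes}M{seconds}S"


def timedelta_to_epoch_ms(td: timedelta) -> int:
    """Hypothetical helper: express a timedelta as integer milliseconds."""
    return td // timedelta(milliseconds=1)


for days in (1, 2):
    td = timedelta(days=days)
    print(timedelta_to_iso(td), timedelta_to_epoch_ms(td))
```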