series.to_json + isoformat: bad serialization of naive dates (as utc stamps) #29706

Closed
zogzog opened this issue Nov 19, 2019 · 4 comments
Labels
Bug IO JSON read_json, to_json, json_normalize Timezones Timezone data dtype

Comments

@zogzog
zogzog commented Nov 19, 2019

The following test exhibits the problem:

from datetime import datetime
import pandas as pd


def test_json():
    series = pd.Series(
        [1., 2., 3.],
        index=pd.date_range(datetime(2020, 1, 1), freq='H', periods=3)
    )
    jsonseries = series.to_json(date_format='iso')
    assert jsonseries == (
        '{"2020-01-01T00:00:00.000Z":1.0,'
        '"2020-01-01T01:00:00.000Z":2.0,'
        '"2020-01-01T02:00:00.000Z":3.0}'
    )

    series2 = pd.read_json(jsonseries, typ='series', dtype=False)
    if pd.__version__.startswith('0.24'):
        assert not getattr(series2.index.dtype, 'tz', False)
        assert series.equals(series2)
    elif pd.__version__.startswith('0.25'):
        assert series2.index.dtype.tz.zone == 'UTC'
        assert not series.equals(series2)

Problem description

In pandas 0.25, the changelog entry "Bug in read_json() where date strings with Z were not converted to a UTC timezone (GH26168)" made me realize there is an issue with the ISO serialization of series timestamps.

Naive dates should not be serialized as UTC timestamps (with a trailing Z), since they carry no timezone.
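As an illustration of the behavior being asked for, here is a minimal write-side sketch that formats a naive index by hand so that no Z designator is emitted. The helper name `naive_series_to_json` is hypothetical and not part of pandas; it only relies on `Timestamp.isoformat()` and the standard `json` module, and it is not the fix that was eventually applied upstream.

```python
import json

import pandas as pd


def naive_series_to_json(series):
    """Serialize a Series with a naive DatetimeIndex without a UTC marker.

    Hypothetical helper for illustration only; not a pandas API.
    """
    if series.index.tz is not None:
        raise ValueError("expected a timezone-naive index")
    # Timestamp.isoformat() adds no offset or 'Z' for naive stamps.
    keys = [stamp.isoformat() for stamp in series.index]
    return json.dumps(dict(zip(keys, series.tolist())))


series = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2020-01-01", periods=3, freq=pd.Timedelta(hours=1)),
)
payload = naive_series_to_json(series)

# Reading back: strings without an offset parse to naive timestamps.
data = json.loads(payload)
restored = pd.Series(list(data.values()), index=pd.to_datetime(list(data)))
```

Because the keys carry no offset, a reader has no reason to attach UTC to them, which is the round-trip property the test above checks for on 0.24.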

@WillAyd
Member

WillAyd commented Nov 19, 2019

This looks like an issue with to_json adding the UTC specifier. Can you try on master? I think this may already be fixed

@WillAyd WillAyd added IO JSON read_json, to_json, json_normalize Needs Info Clarification about behavior needed to assess issue labels Nov 19, 2019
zogzog added a commit to zogzog/tshistory that referenced this issue Feb 18, 2020
Pandas for a long time wrongly serialized naive timestamps
within an utc time zone (see pandas-dev/pandas#29706).
We work around the 0.25 fix that produces correct utc stamps
from such serialized stamps.
zogzog added a commit to zogzog/tshistory_client that referenced this issue Feb 18, 2020
With pandas 0.25 we get back utc stuff.

see pandas-dev/pandas#29706
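
The commits referenced above do not show their workaround in this thread. One possible read-side approach, sketched here under the assumption that the stored stamps were naive and only gained a UTC zone on deserialization, is to drop the zone from the index after reading. The helper name `drop_utc` is hypothetical.

```python
import pandas as pd


def drop_utc(series):
    """Return the series with a naive index, discarding a UTC zone if present.

    Hypothetical helper for illustration; assumes the original stamps were
    naive and were only tagged as UTC during deserialization.
    """
    index = series.index
    if getattr(index, "tz", None) is not None:
        series = series.copy()
        # Normalize to UTC first, then strip the zone while keeping wall time.
        series.index = index.tz_convert("UTC").tz_localize(None)
    return series


aware = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range(
        "2020-01-01", periods=3, freq=pd.Timedelta(hours=1), tz="UTC"
    ),
)
naive = drop_utc(aware)
```

This only makes sense when the producer is known to have written naive stamps; applied to data that really is in UTC, it silently discards the zone.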
@zogzog
Author

zogzog commented Feb 18, 2020

Ok I'll have a look.

@simonjayhawkins
Member

> This looks like an issue with to_json adding the UTC specifier. Can you try on master? I think this may already be fixed

Unchanged on master:

>>> from datetime import datetime
>>> import pandas as pd
>>>
>>> pd.__version__
'1.1.0.dev0+1047.g927dedb87'
>>>
>>> series = pd.Series(
...     [1.0, 2.0, 3.0], index=pd.date_range(datetime(2020, 1, 1), freq="H", periods=3)
... )
>>> series.to_json(date_format="iso")
'{"2020-01-01T00:00:00.000Z":1.0,"2020-01-01T01:00:00.000Z":2.0,"2020-01-01T02:00:00.000Z":3.0}'
>>>

@simonjayhawkins simonjayhawkins added Bug and removed Needs Info Clarification about behavior needed to assess issue labels Apr 1, 2020
@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Apr 1, 2020
@mroeschke mroeschke added the Timezones Timezone data dtype label Jul 23, 2021
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@lithomas1
Member

This is fixed now, since
(I think this was the same issue as #38760)
