Plotting two time series in same Axes: Overflow/wrapping problem #22586

jgehrcke · 2018-09-04T14:25:47Z

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

pd.show_versions()

L = 5
index_wo_seconds = pd.date_range('1/1/2018', periods=L, freq='D')

ts1 = pd.Series(np.random.randn(L), index=index_wo_seconds)
ts2 = pd.Series(np.random.randn(L), index=index_wo_seconds)

ts3 = ts1.shift(1, freq='s')
ts4 = ts2.shift(1, freq='s')

# Different data, same index (with seconds)
fig = plt.figure()
ax = fig.gca()
ts3.plot(ax=ax)
ts4.plot(ax=ax)
print('\n\n Time series in first plot:')
print(ts3)
print(ts4)
plt.savefig('repro-ts3-ts4.png')

fig = plt.figure()
ax = fig.gca()
ts1.plot(ax=ax)
ts4.plot(ax=ax)
print('\n\n Time series in second plot:')
print(ts1)
print(ts4)
plt.savefig('repro-ts1-ts4.png')

print('Repr of individual value in index in ts1: %s' % (repr(ts1.index[0]), ))
print('Repr of index in ts1: %s' % (repr(ts1.index), ))
print('Repr of individual value in index in ts4: %s' % (repr(ts4.index[0]), ))
print('Repr of index in ts4: %s' % (repr(ts4.index), ))
plt.show()

Problem description

When two time series are plotted together in the same Axes object, something unexpected happens if their indices do not have the same "resolution".

First plot from repro:

Second plot from repro:

Output of repro:

$ python repro.py 

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.16.7-200.fc27.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.15.0
scipy: None
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


 Time series in first plot:
2018-01-01 00:00:01    0.120693
2018-01-02 00:00:01   -0.336999
2018-01-03 00:00:01   -0.847296
2018-01-04 00:00:01    0.110183
2018-01-05 00:00:01   -0.592696
Freq: D, dtype: float64
2018-01-01 00:00:01   -0.264522
2018-01-02 00:00:01    0.686366
2018-01-03 00:00:01    1.395553
2018-01-04 00:00:01   -1.253022
2018-01-05 00:00:01    0.946409
Freq: D, dtype: float64


 Time series in second plot:
2018-01-01    0.120693
2018-01-02   -0.336999
2018-01-03   -0.847296
2018-01-04    0.110183
2018-01-05   -0.592696
Freq: D, dtype: float64
2018-01-01 00:00:01   -0.264522
2018-01-02 00:00:01    0.686366
2018-01-03 00:00:01    1.395553
2018-01-04 00:00:01   -1.253022
2018-01-05 00:00:01    0.946409
Freq: D, dtype: float64
Repr of individual value in index in ts1: Timestamp('2018-01-01 00:00:00', freq='D')
Repr of index in ts1: DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05'],
              dtype='datetime64[ns]', freq='D')
Repr of individual value in index in ts4: Timestamp('2018-01-01 00:00:01', freq='D')
Repr of index in ts4: DatetimeIndex(['2018-01-01 00:00:01', '2018-01-02 00:00:01',
               '2018-01-03 00:00:01', '2018-01-04 00:00:01',
               '2018-01-05 00:00:01'],
              dtype='datetime64[ns]', freq='D')

In the second plot, I expect the two time series to be plotted with a time shift of one second (the two plots should basically look the same). Instead, we see something unexpected (some wrapping/overflow behavior?).
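One thing that may be worth trying here is the `x_compat=True` option of `Series.plot()`, which the pandas visualization docs describe as a way to suppress pandas' automatic tick resolution adjustment for regular-frequency time series. Whether it avoids the wrapping seen in the second plot is an assumption and has not been confirmed in this thread. A minimal sketch that reuses the names from the repro (the `Agg` backend is only there so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

L = 5
index_wo_seconds = pd.date_range("2018-01-01", periods=L, freq="D")

# ts1 sits exactly on midnight; ts4 is offset by one second, as in the repro.
ts1 = pd.Series(np.random.randn(L), index=index_wo_seconds)
ts4 = pd.Series(np.random.randn(L), index=index_wo_seconds).shift(1, freq="s")

fig, ax = plt.subplots()
# x_compat=True asks pandas to skip its time-series-specific x-axis handling
# for both series (assumption: this keeps both lines on the same date scale).
ts1.plot(ax=ax, x_compat=True)
ts4.plot(ax=ax, x_compat=True)
```

Calling `fig.savefig(...)` afterwards and comparing the result with `repro-ts1-ts4.png` would show whether the two lines now overlap as expected.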

As a side note:

I would like to understand: where in the documentation is it documented that when I print() a Series that a line such as 2018-01-01 0.120693 (with the timestamp being given without hh:mm:ss) actually means 2018-01-01 00:00:00? This behavior makes perfect sense, and this is how pandas seems to consistently behave, but I had a hard time finding this documented.
What can I do so that print(ts) actually shows the 00:00:00 instead of hiding it? When I add a second via shift() or via adding an Offset() then it shows 00:00:01. When I remove it again then it falls back to not showing 00:00:00. The underlying data type does not seem to change (as far as I can tell based on dtype='datetime64[ns]', freq='D'). So, I thought maybe it really is just a question of the print/textual representation. Then again, the current issue at hand suggests that there is a difference between both (showing hh:mm:ss vs. not showing it) in the underlying data type.
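Regarding the second question, one possible approach (a sketch, not necessarily the recommended way) is to format the index explicitly with `DatetimeIndex.strftime()` just for display. The stored values are full timestamps either way; only the textual rendering omits the time when every entry falls on midnight. The variable names below are illustrative:

```python
import pandas as pd

ts = pd.Series(
    [0.1, 0.2, 0.3],
    index=pd.date_range("2018-01-01", periods=3, freq="D"),
)

# Each index entry is a full Timestamp; its time component is midnight.
first = ts.index[0]
print(first.hour, first.minute, first.second)

# For display only: replace the index by strings that spell out hh:mm:ss.
# Note that the result has a plain string index, not a DatetimeIndex.
explicit = ts.copy()
explicit.index = ts.index.strftime("%Y-%m-%d %H:%M:%S")
print(explicit)
```

Because `explicit` no longer has a `DatetimeIndex`, it is only suitable for printing, not for further time series operations or plotting.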

Output of `pd.show_versions()`

See above, part of repro output.

alimcmaster1 · 2018-09-04T23:36:14Z

Hey, is this a pandas issue? Regarding the plotting issue you first mentioned, it seems like you are describing more of a matplotlib issue, which you should raise with them.

If it's an issue on the pandas side, can you try to reproduce it in minimal code?

Thanks

There's a lot of magic going on in how datetime64 values are actually encoded in plots. Sharing an axis across (sub)plots is brittle w.r.t. these differences. Work around this here: make sure that individual timestamps have a non-zero value for seconds by simply adding one second, shifting the whole data set by one second to the left. That prevents, I guess, an optimization from kicking in that would detect that individual timestamps hit the full hour or integer multiples of 30 or 15 minutes. Also see pandas-dev/pandas#15874 pandas-dev/pandas#15071 pandas-dev/pandas#31074 pandas-dev/pandas#29705 pandas-dev/pandas#29719 pandas-dev/pandas#18571 pandas-dev/pandas#11574 pandas-dev/pandas#22586
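The "add one second" workaround described above could look roughly like the following. This is a sketch under the assumption that `Series.shift()` with `freq='s'` is used, as in the original repro; the referenced commit may have done it differently.

```python
import pandas as pd

idx = pd.date_range("2018-01-01", periods=3, freq="D")
ts = pd.Series([1.0, 2.0, 3.0], index=idx)

# Move every timestamp by one second so that none of them falls on a
# "round" time such as midnight. The values themselves are unchanged.
shifted = ts.shift(1, freq="s")
print(shifted)
```

Applying the same shift to every series that shares the Axes should give all of them the same kind of index, which is the point of the workaround.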

mroeschke · 2021-06-22T04:50:58Z

It appears that this may be more of a matplotlib issue than a pandas issue unless we can get a simpler example. Closing as it's not clear that this is a pandas issue but happy to reopen if it is.

jbrockmendel added the Visualization plotting label Sep 29, 2018

jgehrcke mentioned this issue Dec 7, 2020

Abstract figures. A multi-plot figure as a summary. Dragons. jgehrcke/ci-analysis#4

Merged

mroeschke closed this as completed Jun 22, 2021
