import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

pd.show_versions()

L = 5
index_wo_seconds = pd.date_range('1/1/2018', periods=L, freq='D')
ts1 = pd.Series(np.random.randn(L), index=index_wo_seconds)
ts2 = pd.Series(np.random.randn(L), index=index_wo_seconds)

# Shift the index (not the data) by one second.
ts3 = ts1.shift(1, freq='s')
ts4 = ts2.shift(1, freq='s')

# First plot: different data, same index (with seconds).
fig = plt.figure()
ax = fig.gca()
ts3.plot(ax=ax)
ts4.plot(ax=ax)
print('\n\n Time series in first plot:')
print(ts3)
print(ts4)
plt.savefig('repro-ts3-ts4.png')

# Second plot: one index without seconds (ts1), one with seconds (ts4).
fig = plt.figure()
ax = fig.gca()
ts1.plot(ax=ax)
ts4.plot(ax=ax)
print('\n\n Time series in second plot:')
print(ts1)
print(ts4)
plt.savefig('repro-ts1-ts4.png')

print('Repr of individual value in index in ts1: %s' % (repr(ts1.index[0]), ))
print('Repr of index in ts1: %s' % (repr(ts1.index), ))
print('Repr of individual value in index in ts4: %s' % (repr(ts4.index[0]), ))
print('Repr of index in ts4: %s' % (repr(ts4.index), ))
plt.show()
Problem description
When two time series are plotted side by side in the same Axes object, something unexpected happens when the two time series indices do not have the same "resolution".
First plot from repro:
Second plot from repro:
Output of repro:
$ python repro.py
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.16.7-200.fc27.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.15.0
scipy: None
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Time series in first plot:
2018-01-01 00:00:01 0.120693
2018-01-02 00:00:01 -0.336999
2018-01-03 00:00:01 -0.847296
2018-01-04 00:00:01 0.110183
2018-01-05 00:00:01 -0.592696
Freq: D, dtype: float64
2018-01-01 00:00:01 -0.264522
2018-01-02 00:00:01 0.686366
2018-01-03 00:00:01 1.395553
2018-01-04 00:00:01 -1.253022
2018-01-05 00:00:01 0.946409
Freq: D, dtype: float64
Time series in second plot:
2018-01-01 0.120693
2018-01-02 -0.336999
2018-01-03 -0.847296
2018-01-04 0.110183
2018-01-05 -0.592696
Freq: D, dtype: float64
2018-01-01 00:00:01 -0.264522
2018-01-02 00:00:01 0.686366
2018-01-03 00:00:01 1.395553
2018-01-04 00:00:01 -1.253022
2018-01-05 00:00:01 0.946409
Freq: D, dtype: float64
Repr of individual value in index in ts1: Timestamp('2018-01-01 00:00:00', freq='D')
Repr of index in ts1: DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
'2018-01-05'],
dtype='datetime64[ns]', freq='D')
Repr of individual value in index in ts4: Timestamp('2018-01-01 00:00:01', freq='D')
Repr of index in ts4: DatetimeIndex(['2018-01-01 00:00:01', '2018-01-02 00:00:01',
'2018-01-03 00:00:01', '2018-01-04 00:00:01',
'2018-01-05 00:00:01'],
dtype='datetime64[ns]', freq='D')
In the second plot, I expect the two time series to be plotted with a time shift of one second (the two plots should basically look the same). Instead, we see something unexpected (some wrapping/overflow behavior?).
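To make the expectation concrete, here is a minimal sketch that draws the same kind of data through matplotlib directly (`ax.plot`) instead of `Series.plot`, so pandas' own time series plotting code is not involved. This is only an illustration of the picture I expect, not a diagnosis of the cause; the variable names `x1`, `x4` and `offset_seconds` are made up for this sketch, and the `Agg` backend is chosen only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (an assumption, so no window is needed)
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

n = 5
index_wo_seconds = pd.date_range("2018-01-01", periods=n, freq="D")
ts1 = pd.Series(np.random.randn(n), index=index_wo_seconds)
# Same daily dates as ts1, but every timestamp is one second later.
ts4 = pd.Series(np.random.randn(n), index=index_wo_seconds).shift(1, freq="s")

# Hand plain datetime objects to matplotlib instead of calling Series.plot().
fig, ax = plt.subplots()
ax.plot(ts1.index.to_pydatetime(), ts1.values, label="ts1")
ax.plot(ts4.index.to_pydatetime(), ts4.values, label="ts4")
ax.legend()
fig.autofmt_xdate()

# On matplotlib's date axis (unit: days) the x positions of the two series
# differ by one second for every point.
x1 = mdates.date2num(ts1.index.to_pydatetime())
x4 = mdates.date2num(ts4.index.to_pydatetime())
offset_seconds = (x4 - x1) * 86400
```

Calling `fig.savefig(...)` on this figure gives two lines over the same five days, which is what I expected the second plot from the repro to look like.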
As a side note:
I would like to understand: where is it documented that when I print() a Series, a line such as 2018-01-01 0.120693 (with the timestamp being given without hh:mm:ss) actually means 2018-01-01 00:00:00? This behavior makes perfect sense, and this is how pandas seems to consistently behave, but I had a hard time finding this documented.
What can I do so that print(ts) actually shows the 00:00:00 instead of hiding it? When I add a second via shift() or via adding an Offset() then it shows 00:00:01. When I remove it again then it falls back to not showing 00:00:00. The underlying data type does not seem to change (as far as I can tell based on dtype='datetime64[ns]', freq='D'). So, I thought maybe it really is just a question of the print/textual representation. Then again, the current issue at hand suggests that there is a difference between both (showing hh:mm:ss vs. not showing it) in the underlying data type.
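For the two side questions, a small sketch of what can be checked from the outside. It assumes nothing beyond `pd.date_range`, `Series.shift` and `DatetimeIndex.strftime`; the names `same_instant`, `delta`, `fmt` and `explicit` are only illustrative. It shows that the date-only label compares equal to the midnight timestamp, and one possible way to get a printout with an explicit time component by building a display copy whose index holds formatted strings.

```python
import pandas as pd

idx = pd.date_range("2018-01-01", periods=3, freq="D")
ts = pd.Series([0.1, 0.2, 0.3], index=idx)

# The label printed as "2018-01-01" compares equal to midnight of that day.
same_instant = ts.index[0] == pd.Timestamp("2018-01-01 00:00:00")

# Shifting the index by one second moves every label by exactly one second.
shifted = ts.shift(1, freq="s")
delta = shifted.index[0] - ts.index[0]

# Display copy with an explicit time component. Its index contains strings,
# so it is meant for printing only, not for further time series operations.
fmt = "%Y-%m-%d %H:%M:%S"
explicit = pd.Series(ts.values, index=ts.index.strftime(fmt))
print(explicit)
```

The original `ts` is left untouched, so the formatted copy can be used for printing while plotting and resampling keep working on the real DatetimeIndex.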
Output of pd.show_versions()
See above, part of repro output.
Hey, is this a pandas issue? Regarding the plotting issue you first mentioned, it seems like you are describing more of a matplotlib issue, which you should raise with them.
If it's an issue on the pandas side, can you try to reproduce it in minimal code?
It appears that this may be more of a matplotlib issue than a pandas issue unless we can get a simpler example. Closing as it's not clear that this is a pandas issue but happy to reopen if it is.