Plotting 128Hz timeseries crashes #20575
For me it luckily gives a MemoryError directly, without crashing... The reason for the large memory use is that it is trying to generate a huge arange:
What seems off in this case is the
But then it should generate a range with a step of
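The scale of that arange can be sketched with back-of-the-envelope arithmetic (an illustration of the reported symptom, not the actual pandas internals): if the tick locator falls back to a unit of 1 nanosecond, covering a 10-second window means materializing one entry per nanosecond.

```python
# Illustrative arithmetic only (not pandas code): an arange with a
# 1-nanosecond step across a 10-second window.
span_seconds = 10
ns_per_second = 10**9
n_steps = span_seconds * ns_per_second   # 10_000_000_000 candidate ticks
bytes_needed = n_steps * 8               # one int64 entry per tick
print(n_steps)                           # 10000000000
print(bytes_needed / 10**9)              # 80.0 GB, for only 1280 data points
```

That is consistent with the ">100GB of RAM" observation in the issue description.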
Thanks for the update, and for having a look into it!
I tried to look into this, but I didn't get that far. I believe it might be a feature of how the handling of the time index (x-labels, tick labels, etc.) is implemented. I don't think it is possible that mult in
So I believe that the implementation might be correct, but that it doesn't work in practice when it is necessary to use nanoseconds in the calculations while the actual sampling period is much larger.
@fredrik-1 Thanks for looking into it. I think you are right that it is currently just due to how it is implemented: if the
The question then is: how can this be solved?
I tried to look a little more at the code. The problem (or feature) seems to be that TimeSeries_DateLocator (I don't know from where it is called; implicitly from matplotlib?) uses a freq variable which is a simple string with only the nano information. freq is then changed into a pd.tseries.offsets.Nano object with n=1, while the pd.tseries.offsets.Nano object in the actual data has n=7812500.
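The mismatch described above can be checked directly with the public offsets API (a small illustration; 7812500 ns is the 1/128 s sampling period from the issue):

```python
import pandas as pd

# A 128 Hz sampling period expressed as a nanosecond offset: 1/128 s = 7812500 ns.
off = pd.tseries.offsets.Nano(7812500)
print(off.n)          # 7812500 -- the multiple carried by the offset object
print(off.freqstr)    # the frequency string still includes the 7812500 multiple
print(off.rule_code)  # just the base code; the multiple is dropped here
```

This is why any code path that reduces the offset to its rule_code ends up treating the data as if it were sampled every single nanosecond.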
@fredrik-1 I haven't read this issue closely, but if it helps: pandas/pandas/plotting/_converter.py, line 981 in aa3fefc
Correct. I believe it's called every time the figure is drawn (so if it's interactively modified, it's called on each draw).
I tried to debug some more. It seems that the actual frequency data (the 100 before milli, for example) is thrown away several times in the code by the use of
freq = freq.rule_code
I also found that .to_period() doesn't work on data with a "special" frequency.
import pandas as pd
import numpy as np
freq = 10
samples = freq * 1
data = np.random.random(samples)
index1 = pd.date_range(0, periods=samples, freq='{}S'.format(1/freq))
ts1 = pd.Series(data=data, index=index1, name='High freq')
index2 = pd.period_range(2000, periods=samples, freq='{}S'.format(1/freq))
ts2 = pd.Series(data=data, index=index2, name='High freq')
print(ts1.index)
print(ts2.index)
works as expected, but
ts1.to_period()
throws an error because "100L" is not in _offset_to_period_map in pandas\_libs\tslibs\offsets.pyx.
The first snippet below works. The second also runs, but the frequency in the actual index is not equal to the freq attribute of the freq object.
tsPeriod1 = ts1.to_period(freq='100L')
print(tsPeriod1.index)
tsPeriod2 = ts1.to_period(freq='L')
print(tsPeriod2.index)
Series with a DatetimeIndex or PeriodIndex whose frequency is not one of the standard frequencies (1 second, 1 millisecond, 1 day, etc.) don't seem to be supported in all functions.
Thanks for digging. Any suggestion for how to proceed?
Code Sample
Problem description
Plotting a 10-second, 128 Hz timeseries uses up all my RAM and then crashes. 100 Hz works fine; 128 Hz and 150 Hz (6666666N) crash.
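The original code sample did not survive in the issue text, but a minimal reproduction consistent with the description might look like this (the start date and variable names here are made up for illustration):

```python
import numpy as np
import pandas as pd

fs = 128                                 # sampling rate in Hz
period_ns = 10**9 // fs                  # 7812500 ns between samples
idx = pd.date_range("2018-01-01", periods=10 * fs,
                    freq=pd.Timedelta(period_ns, unit="ns"))
ts = pd.Series(np.random.random(len(idx)), index=idx)
print(len(ts))                           # 1280 points over 10 seconds
# ts.plot()  # on pandas 0.22 this exhausted memory instead of plotting
```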
Expected Output
Ideally, a plot of my data. I wouldn't expect plotting 1280 points to require > 100GB of RAM! 😄
Output of pd.show_versions()
pandas: 0.22.0
pytest: None
pip: 9.0.3
setuptools: 39.0.1
Cython: None
numpy: 1.14.0
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 6.2.1
matplotlib: 2.2.2