Series.resample('1S', how='last') on series with dtype=datetime64[ns] is very slow #7754

Closed
yonil7 opened this issue Jul 14, 2014 · 3 comments · Fixed by #10057
Labels
Datetime (Datetime data dtype), Dtype Conversions (Unexpected or buggy dtype conversions), Performance (Memory or execution speed performance), Resample (resample method)
Comments

yonil7 commented Jul 14, 2014

example test:

import time
from datetime import timedelta

import pandas as pd

# 555000U = one sample every 555,000 microseconds
intSeries = pd.Series(5, pd.date_range(start='2000-01-01', end='2000-01-05', freq='555000U'), dtype='int64')
timeSeries = intSeries.astype('datetime64[ns]')

# time the resample on the int64 series
time0 = time.time()
intSeries.resample('1S', how='last')
print('resampling the int series took %s (H:M:S)' % timedelta(seconds=time.time() - time0))

# time the same resample on the datetime64[ns] series
time0 = time.time()
timeSeries.resample('1S', how='last')
print('resampling the datetime64[ns] series took %s (H:M:S)' % timedelta(seconds=time.time() - time0))

Output shows about 14 s for the datetime64[ns] series vs about 0.05 s for the int series:
resampling the int series took 0:00:00.054000 (H:M:S)
resampling the datetime64[ns] series took 0:00:14.350000 (H:M:S)
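
For anyone re-running this comparison on a newer pandas, where the `how=` keyword has been replaced by method chaining, a rough equivalent might look like the sketch below. It is not from the thread: the `time_resample` helper is a hypothetical name, `'555ms'` is assumed as an equivalent spelling of `'555000U'`, and `resample('1s').last()` is assumed to be available in the installed version. Timings will vary by version and machine.

```python
import time

import pandas as pd


def time_resample(series):
    """Resample to 1-second bins keeping the last value; return (result, seconds)."""
    start = time.perf_counter()
    result = series.resample('1s').last()
    return result, time.perf_counter() - start


# Same data as the report: one sample every 555 ms (= 555000 microseconds).
index = pd.date_range(start='2000-01-01', end='2000-01-05', freq='555ms')
int_series = pd.Series(5, index=index, dtype='int64')
time_series = int_series.astype('datetime64[ns]')

int_result, int_seconds = time_resample(int_series)
time_result, time_seconds = time_resample(time_series)

print('int64 series:          %.3f s' % int_seconds)
print('datetime64[ns] series: %.3f s' % time_seconds)
```

If the two timings are in the same ballpark, the datetime64[ns] series is presumably no longer falling off the fast path described in this issue.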

jreback (Contributor) commented Jul 14, 2014

yep...seems to be not taking a fast path. FYI, pls always post pd.show_versions(). welcome a pull-request to fix.
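
The version report asked for here comes from `pd.show_versions()`, which is part of pandas itself. A minimal call, assuming pandas is importable in the environment where the problem was observed:

```python
import pandas as pd

# Prints the interpreter, OS, and installed dependency versions to stdout.
pd.show_versions()
```

The printed text can be pasted into the issue as-is.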

jreback added this to the 0.15.0 milestone Jul 14, 2014
yonil7 (Author) commented Jul 14, 2014

INSTALLED VERSIONS

commit: None
python: 2.7.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.14.0
nose: 1.3.0
Cython: 0.19.1
numpy: 1.8.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 1.0.0
sphinx: 1.1.3
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.4
bottleneck: 0.8.0
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: 0.5.5
lxml: 3.3.5
bs4: 4.3.1
html5lib: None
bq: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.6
pymysql: None
psycopg2: None

jreback modified the milestones: 0.15.1, 0.15.0 Sep 9, 2014
jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
jreback added a commit that referenced this issue May 12, 2015
ENH: Series.resample performance with datetime64[ns] #7754
jreback (Contributor) commented May 12, 2015

closed by #10057

jreback closed this as completed May 12, 2015
jreback modified the milestones: 0.17.0, Next Major Release May 12, 2015
jorisvandenbossche modified the milestones: 0.17.0, 0.16.2 Jun 2, 2015