Alignment ops with irregular DatetimeIndex performance problems #1046

Closed
wesm opened this issue Apr 13, 2012 · 0 comments
Labels
Bug · Datetime (Datetime data dtype) · Indexing (Related to indexing on series/frames, not to indexes themselves)
wesm commented Apr 13, 2012

Improperly boxing timestamps during the index union (500000 _dt_box calls via asobject):

%prun -s cumulative result = left + right

         1000435 function calls (1000434 primitive calls) in 4.870 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.870    4.870 <string>:1(<module>)
        1    0.000    0.000    4.870    4.870 series.py:62(wrapper)
        1    0.000    0.000    4.866    4.866 series.py:1824(align)
        1    0.000    0.000    4.856    4.856 index.py:1561(join)
        1    0.001    0.001    4.856    4.856 index.py:710(join)
        1    0.254    0.254    4.778    4.778 index.py:1542(union)
        1    0.076    0.076    4.524    4.524 index.py:418(union)
        1    0.000    0.000    2.767    2.767 index.py:1830(__iter__)
        1    0.000    0.000    2.767    2.767 index.py:1464(asobject)
        1    0.203    0.203    2.767    2.767 datetools.py:34(_dt_box_array)
   500000    0.215    0.000    2.564    0.000 datetools.py:41(<lambda>)
   500000    2.349    0.000    2.349    0.000 datetools.py:28(_dt_box)
        1    0.000    0.000    0.901    0.901 index.py:473(_wrap_union_result)
        1    0.010    0.010    0.901    0.901 index.py:1197(__new__)
       18    0.890    0.049    0.890    0.049 {numpy.core.multiarray.array}
       16    0.000    0.000    0.890    0.056 numeric.py:167(asarray)
        1    0.743    0.743    0.743    0.743 {method 'sort' of 'list' objects}
        3    0.000    0.000    0.108    0.036 index.py:597(get_indexer)
        3    0.106    0.035    0.106    0.035 {method 'get_indexer' of 'pandas._engines.DatetimeEngine' objects}
        2    0.000    0.000    0.010    0.005 series.py:1860(_reindex_indexer)
        2    0.000    0.000    0.009    0.005 common.py:170(take_1d)
        2    0.007    0.004    0.007    0.004 {pandas._tseries.take_1d_float64}
        1    0.004    0.004    0.004    0.004 {method 'nonzero' of 'numpy.ndarray' objects}
        2    0.000    0.000    0.003    0.001 index.py:1866(equals)
        2    0.003    0.001    0.003    0.001 numeric.py:1927(array_equal)
        2    0.000    0.000    0.002    0.001 common.py:666(_ensure_int32)
        2    0.002    0.001    0.002    0.001 {method 'astype' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.002    0.002 series.py:47(na_op)
        1    0.002    0.002    0.002    0.002 {operator.add}
       13    0.000    0.000    0.001    0.000 index.py:1858(dtype)
       18    0.000    0.000    0.001    0.000 _internal.py:180(_datetimestring)
       18    0.001    0.000    0.001    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
        1    0.001    0.001    0.001    0.001 {method 'take' of 'numpy.ndarray'
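
A minimal sketch of the kind of operation profiled above, for anyone who wants to reproduce it. The helper name `make_irregular_series`, the sizes, the start date, and the seeds are all assumptions chosen for illustration; the original `left` and `right` are not shown in this issue. It builds two Series whose DatetimeIndex values are random subsets of a regular range (so neither has a fixed frequency) and adds them, which forces alignment on the union of the two indexes.

```python
import numpy as np
import pandas as pd


def make_irregular_series(n_points, n_keep, seed):
    """Hypothetical helper: Series on a sorted random subset of a regular range."""
    rng = np.random.default_rng(seed)
    full = pd.date_range("2000-01-01", periods=n_points, freq="min")
    # Dropping random points leaves gaps, so the index has no fixed frequency.
    keep = np.sort(rng.choice(n_points, size=n_keep, replace=False))
    return pd.Series(rng.standard_normal(n_keep), index=full[keep])


# Sizes are illustrative assumptions, not the ones behind the profile above.
left = make_irregular_series(100_000, 50_000, seed=0)
right = make_irregular_series(100_000, 50_000, seed=1)

# Adding two Series aligns them on the union of their indexes first.
result = left + right

# The aligned index should match an explicit union, and only timestamps
# present in both inputs produce non-missing sums.
expected_index = left.index.union(right.index)
shared = left.index.intersection(right.index)
```

In an IPython session, running `%prun -s cumulative result = left + right` on these objects gives a profile comparable in shape to the one above, though the function names and timings will differ by pandas version.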
wesm added a commit that referenced this issue Apr 13, 2012
wesm closed this as completed Apr 13, 2012
wesm added a commit that referenced this issue Apr 14, 2012
* timeseries: (200 commits)
  TST: don't use deprecated DateRange
  BUG: fix buglets surfacing from merge
  RLS: set released to false, bump dev version to 0.8.0
  BUG: fix major performance issue in DatetimeIndex.union affecting join performance on irregular indexes, remedying #1046
  ENH: add to_datetime method to Index, close #208
  ENH: legacy time rule support and refactoring, better alias handling. misc tests, #1041
  ENH: to_datetime will convert array of strings and NAs to datetime64 with NaT, close #999
  ENH: more datetime64 integration in core data algorithms per #996, close #1035
  ENH: handle datetime64 in block formation from dict of arrays in DataFrame constructor, close #1037
  BUG: fix broken time_rule usage in legacy DateRange, close #1036
  BUG: name inline method something different
  ENH: initial version of convert_to_annual for pandas, #736
  BUG: convert datetime64 -> datetime.datetime for matplotlib, close #1003
  ENH: integrate cython ohlc in groupby and test, close #152
  ENH: implement Cython OHLC function for groupby #152
  ENH: use cython bin groupers, fix bug in DatetimeIndex.__getitem causing slowness, some timeseries vbenches
  ENH: enable to_datetime to be vectorized, handle NAs, close #858
  TST: interactions between array of datetime objects and DatetimeIndex, bug fixes
  TST: remove errant foo and test_datetime64.py
  TST: moved test_datetime64.py tests to test_timeseries
  ...