PERF: typing and cdefs for tslibs.resolution #21452

jbrockmendel · 2018-06-12T19:28:29Z

Moving things lower-level will help improve performance (due in part to better Cython compilation).

WillAyd · 2018-06-12T20:32:01Z

pandas/_libs/tslibs/resolution.pyx

    return us % mult == 0


-def _maybe_add_count(base, count):
+cdef inline str _maybe_add_count(str base, count):


Can we add an int64_t type for count here?

I'll double-check, but I'm pretty sure past-me had a reason for leaving that one out. Maybe there are cases when None is passed.

WillAyd · 2018-06-12T20:33:03Z

pandas/_libs/tslibs/resolution.pyx

    """
    Not sure if I can avoid the state machine here
    """
+    cdef public:
+        index


Do we need types here?

Not really viable since we can't specify Index as a type. We just have to make the name public for wheels to turn.

just specify them as object to make this explicit
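For readers less familiar with cdef classes: their attributes have to be declared up front, and only those declared public (or readonly) are reachable from Python code. The block below is a rough pure-Python analogy using `__slots__` and a property, not the code from this PR; the class name `FrequencyInferrerSketch` and the `_values` attribute are hypothetical.

```python
class FrequencyInferrerSketch:
    """Hypothetical stand-in for a cdef class; not the pandas implementation."""

    # Like cdef attribute declarations, __slots__ fixes the set of attributes.
    __slots__ = ("index", "_values")

    def __init__(self, index):
        self.index = index          # loosely analogous to `cdef public object index`
        self._values = list(index)  # internal state

    @property
    def values(self):
        # Loosely analogous to a `cdef readonly` attribute: readable, not assignable.
        return tuple(self._values)


inferrer = FrequencyInferrerSketch([3, 1, 2])
inferrer.index = [4, 5]  # allowed: declared and writable

try:
    inferrer.undeclared = 1  # never declared, so assignment fails
except AttributeError:
    undeclared_rejected = True
else:
    undeclared_rejected = False

try:
    inferrer.values = ()  # property without a setter
except AttributeError:
    readonly_rejected = True
else:
    readonly_rejected = False
```

Typing the attribute as object keeps it a plain Python reference, which fits the point above that Index itself cannot be used as a Cython type here.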

WillAyd · 2018-06-12T20:33:31Z

pandas/_libs/tslibs/resolution.pyx

+            bint calendar_start = True
+            bint business_start = True
+            bint cal
+            int32_t[:] years


These are memoryviews, right? Any reason you are choosing these instead of the ndarray declarations throughout the rest of the code base?

My understanding is that memoryviews are encouraged because they are lighter weight.

@jbrockmendel : That is indeed correct. I agree with this design choice.

Might be something to look into for other parts of the code.

In general, one of the main problems with memoryviews is with read-only arrays (and for that reason we do/should not use them solely in certain algos), but I don't think that's relevant in this case

you can make cdefs readonly when it makes sense FYI
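To illustrate the read-only concern raised above, here is a small sketch using Python's built-in `memoryview` as a stand-in for Cython typed memoryviews. It is only an analogy: the variable names are made up, and the exact error Cython raises for a read-only buffer is not shown here.

```python
from array import array

# A writable buffer of C ints, standing in for an int32 ndarray.
writable = array("i", [2016, 2017, 2018])

# An immutable copy of the same bytes, standing in for a read-only array.
frozen = bytes(writable)

# A view over the writable buffer shares memory: no copy is made.
w = memoryview(writable)
w[0] = 1999  # writes through to `writable`

# A view over the immutable buffer is flagged read-only.
r = memoryview(frozen).cast("i")

try:
    r[0] = 1999
except TypeError:
    write_failed = True
else:
    write_failed = False
```

The view is cheap because it only wraps the existing buffer, but code that expects a writable buffer cannot accept read-only input, which is the limitation mentioned in the thread.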

WillAyd · 2018-06-12T20:33:52Z

pandas/_libs/tslibs/resolution.pyx

    return us % mult == 0


-def _maybe_add_count(base, count):
+cdef inline str _maybe_add_count(str base, count):
    if count != 1:
        return '{count}{base}'.format(count=int(count), base=base)


If above comment is correct shouldn't need the int cast here

Looks like you're right. Just pushed a commit that added the suggested typing and removed the int() cast. (and hopefully fixed the test errors)
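A pure-Python sketch of why the cast becomes redundant once count is known to be an integer. The function names are hypothetical, and the branch for count == 1 (returning base unchanged) is an assumption, since the diff above does not show it.

```python
def maybe_add_count_with_cast(base, count):
    # Untyped version: count could be a float, so it is cast before formatting.
    if count != 1:
        return "{count}{base}".format(count=int(count), base=base)
    return base  # assumption: base is returned unchanged when count == 1


def maybe_add_count_typed(base: str, count: int) -> str:
    # Once count is guaranteed to be an integer, the cast is a no-op.
    if count != 1:
        return "{count}{base}".format(count=count, base=base)
    return base  # same assumption as above


with_cast = maybe_add_count_with_cast("T", 5.0)
typed = maybe_add_count_typed("T", 5)

# Without the cast, a float count would leak into the string.
float_leak = "{count}{base}".format(count=5.0, base="T")
```

Both functions give the same string for integer input; the cast only matters when a float such as 5.0 is passed.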

codecov · 2018-06-13T06:29:56Z

Codecov Report

Merging #21452 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #21452      +/-   ##
==========================================
+ Coverage   91.89%    91.9%   +<.01%     
==========================================
  Files         153      153              
  Lines       49600    49606       +6     
==========================================
+ Hits        45580    45589       +9     
+ Misses       4020     4017       -3

Flag	Coverage Δ
#multiple	`90.3% <ø> (ø)`	⬆️
#single	`41.89% <ø> (+0.02%)`	⬆️

Impacted Files	Coverage Δ
pandas/core/indexes/datetimes.py	`95.8% <0%> (ø)`	⬆️
pandas/core/indexes/base.py	`96.62% <0%> (ø)`	⬆️
pandas/core/indexes/category.py	`97.06% <0%> (+0.03%)`	⬆️
pandas/tseries/offsets.py	`97.24% <0%> (+0.24%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ab668b0...bf67f17.

gfyoung · 2018-06-13T06:34:48Z

@jbrockmendel : I feel like the performance help is relatively obvious given that you are moving things underneath the Python level. Is there any asv that could illustrate this by any chance (not a blocker, but just curious)? If so, that might be worthwhile to include in a whatsnew.

jreback

minor comments. just run asv's to confirm no regressions. of course if improvements then shout it out!
ping on green after updating for comments.

jreback · 2018-06-13T10:34:24Z

pandas/_libs/tslibs/resolution.pyx

    """
    Not sure if I can avoid the state machine here
    """
+    cdef public:
+        index


just specify them as object to make this explicit

jreback · 2018-06-13T10:35:12Z

pandas/_libs/tslibs/resolution.pyx

+            bint calendar_start = True
+            bint business_start = True
+            bint cal
+            int32_t[:] years


you can make cdefs readonly when it makes sense FYI

jbrockmendel · 2018-06-13T23:26:28Z

%timeit results show an insignificant speedup:

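import pandas as pd

now = pd.Timestamp.now()
# 100,000 timestamps spaced 19 seconds apart
dti = pd.DatetimeIndex([now + pd.Timedelta(seconds=19*n) for n in range(10**5)])
dti.inferred_freq  # creates _cache attr

def infer():
    # clear the cached result so each call recomputes the frequency
    dti._cache.clear()
    return dti.inferred_freq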

Status quo results for %timeit infer(), in microseconds:
958, 957, 961, 964, 956, 959, 964

PR:
962, 958, 954, 953, 954, 956, 957

Note that in 0.23.0 it looks like these come back closer to 832. No idea what changed in the interim.
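As a quick sanity check on "insignificant", the timings quoted above can be summarized with the standard library. This is only a sketch over the seven numbers reported for each case; the variable names are made up, and it is not a substitute for an asv run.

```python
from statistics import mean, stdev

# Timings copied from the comment above, in the units reported there.
status_quo = [958, 957, 961, 964, 956, 959, 964]
pr = [962, 958, 954, 953, 954, 956, 957]

mean_before = mean(status_quo)
mean_after = mean(pr)

# Absolute and relative difference between the two means.
gain = mean_before - mean_after
relative_gain = gain / mean_before

# Run-to-run spread of the baseline, for comparison with the gain.
noise = stdev(status_quo)

print(f"before: {mean_before:.1f}, after: {mean_after:.1f}")
print(f"gain: {gain:.1f} ({relative_gain:.2%}), baseline stdev: {noise:.1f}")
```

The difference between the means is well under one percent and about the same size as the spread within the baseline runs, which is consistent with calling the speedup insignificant.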

jreback · 2018-06-14T10:12:05Z

thanks!

jbrockmendel added 2 commits June 11, 2018 22:01

cythonize parts of resolution

c77371f

add more types

84d9114

WillAyd reviewed Jun 12, 2018

cdef TimedeltaFrequencyInferrer to fix test errors

d2dc109

gfyoung added the Datetime, Internals, and Performance labels Jun 13, 2018

gfyoung changed the title ~~typing and cdefs for tslibs.resolution~~ PERF: typing and cdefs for tslibs.resolution Jun 13, 2018

jreback requested changes Jun 13, 2018

jreback added this to the 0.24.0 milestone Jun 13, 2018

type as objects explicitly

bf67f17

jreback approved these changes Jun 14, 2018

jreback merged commit 6028a5b into pandas-dev:master Jun 14, 2018

david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018

PERF: typing and cdefs for tslibs.resolution (pandas-dev#21452)

60408a6

jbrockmendel deleted the cyres branch June 22, 2018 03:27

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

PERF: typing and cdefs for tslibs.resolution (pandas-dev#21452)

5bec9bd
