Skip to content

PERF: cython optimizations #23477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Nov 6, 2018
Merged

PERF: cython optimizations #23477

merged 9 commits into from
Nov 6, 2018

Conversation

jbrockmendel
Copy link
Member

cython was emitting some warnings about untyped array lookups in tslibs.conversion; added types to those

arr.fill(val) is a python call, while arr[:] is a C call, so replaced all of those

modernized an old non-python for-loop, some other small cleanups along the way.

We're running out of available optimizations/cleanups in these files. One thing I'd be interested in changing (but I'm not confident enough to do) is removing uses of double and double_t, which I think are equivalent to float64_t. But I'm not sure if that is architecture or platform dependent or something.

@codecov
Copy link

codecov bot commented Nov 4, 2018

Codecov Report

Merging #23477 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23477   +/-   ##
=======================================
  Coverage   92.23%   92.23%           
=======================================
  Files         161      161           
  Lines       51197    51197           
=======================================
  Hits        47220    47220           
  Misses       3977     3977
Flag Coverage Δ
#multiple 90.61% <ø> (ø) ⬆️
#single 42.27% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aaaac86...a7b9aa0. Read the comment docs.

@gfyoung gfyoung added Performance Memory or execution speed performance Internals Related to non-user accessible pandas implementation labels Nov 4, 2018
@@ -177,7 +177,7 @@ def round_nsint64(values, mode, freq):

# if/elif above should catch all rounding modes defined in enum 'RoundTo':
# if flow of control arrives here, it is a bug
assert False, "round_nsint64 called with an unrecognized rounding mode"
raise ValueError("round_nsint64 called with an unrecognized rounding mode")
Copy link
Member

@gfyoung gfyoung Nov 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is a def function, is this error tested in any way (or is this still internal)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make this an AssertionError

@gfyoung
Copy link
Member

gfyoung commented Nov 4, 2018

We're running out of available optimizations/cleanups in these files.

@jbrockmendel : Gosh, I wonder whose "FAULT" that is 😉

@jreback jreback added this to the 0.24.0 milestone Nov 4, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment.

@@ -177,7 +177,7 @@ def round_nsint64(values, mode, freq):

# if/elif above should catch all rounding modes defined in enum 'RoundTo':
# if flow of control arrives here, it is a bug
assert False, "round_nsint64 called with an unrecognized rounding mode"
raise ValueError("round_nsint64 called with an unrecognized rounding mode")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make this an AssertionError

@jbrockmendel
Copy link
Member Author

let's make this an AssertionError

Happy to do so, though now that I look closer, isn't this redundant with the check at the beginning of the function?

@jbrockmendel
Copy link
Member Author

BTW any idea about the double_t thing?

@jreback
Copy link
Contributor

jreback commented Nov 4, 2018

what is the problem with double_t ?
this is just float64

@jbrockmendel
Copy link
Member Author

what is the problem with double_t ? this is just float64

Yah that's the thing I want to double-check before changing all the places where we currently use double or double_t to float64_t

@jreback jreback merged commit 6a5c34c into pandas-dev:master Nov 6, 2018
@jreback
Copy link
Contributor

jreback commented Nov 6, 2018

thanks!

@jbrockmendel jbrockmendel deleted the libmore4 branch November 6, 2018 03:25
JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants