use fused types for parts of algos_common_helper #22452

Merged: 3 commits, Sep 18, 2018

Conversation

jbrockmendel
Member

Broken off of #22432, which was a proof of concept.

@gfyoung added the Refactor and Internals labels Aug 21, 2018
Member

@gfyoung left a comment

return indexer


pad_float64 = pad["float64_t"]
Member

Do we need to explicitly use indexing here or is calling also an option? The latter may help reduce the amount of code required (maybe worth exploring in separate PR):

http://docs.cython.org/en/latest/src/userguide/fusedtypes.html#calling

Member Author

Do we need to explicitly use indexing here or is calling also an option?

I think you're right that the calling can be simplified, but this way keeps the changes self-contained.

Member

Yeah, for sure - not saying to do it here, but food for thought for subsequent changes or new developments.

return False, False, True

if algos_t is not object:
    with nogil:
Member

Since this is just copy / paste of the below block, what do you think about creating a blank context manager that doesn't really do anything.

if algos_t is not object:
    cm = nogil
else:
    cm = dummy_context

And then either implementation can share the context manager:

with cm:
    ...

Would reduce the copy / paste and make it so we don't miss updates to say one part of the fused type in the future that makes for hard-to-find implementation differences. Could be generalizable in a few instances in Cython.

Member Author

That would be great if it can be made to work; a lot of the tempita code takes this form.

I don't think cython treats with nogil as an actual contextmanager. @scoder any thoughts on this?

Contributor

I agree, making this simpler would motivate this PR

@scoder Aug 23, 2018

Well, nogil is really special in Cython and not an actual context manager. I could imagine something like with nogil.only_if(some_bool_condition), which would then duplicate the with-block to generate a True and False version of it and select the right one at runtime. Or even at compile time, if the condition is a compile time constant, such as a type check on fused types (release GIL for native C types, keep for object types). The latter would definitely be possible, whereas duplicating the code block could be messy and is definitely more work.

Note that nogil is also a decorator, so nogil(something) is already taken.

Member

@scoder thanks for the info! What would be the next steps to get the functionality you've described? Compile-time evaluation would certainly be ideal

  1. Read and discuss cython/cython#2579, Allow (compile-time) conditional "with nogil" blocks
  2. Find someone to implement it

:)

Member

Looks like this will be released in Cython 3.0 - could have a few uses in our codebase for sure. Thanks!

@codecov

codecov bot commented Aug 22, 2018

Codecov Report

Merging #22452 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #22452   +/-   ##
=======================================
  Coverage   92.05%   92.05%           
=======================================
  Files         169      169           
  Lines       50733    50733           
=======================================
  Hits        46702    46702           
  Misses       4031     4031
Flag Coverage Δ
#multiple 90.46% <ø> (ø) ⬆️
#single 42.24% <ø> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 68273a7...749989f.

return maybe_convert_objects(result)

{{endfor}}

#----------------------------------------------------------------------
Contributor

what's left here? possible to fully remove this file?

Member Author

AFAICT diff_2d_dtype is not straightforward to convert to fused types

@jreback
Contributor

jreback commented Aug 22, 2018

perf check?

@jbrockmendel
Member Author

perf check?

Still need to run, will update later in the day.

@jbrockmendel
Member Author

I agree, making this simpler would motivate this PR

With this option it would be a slam dunk. Without it I still think this is preferable, but it's a closer call.

@jbrockmendel
Member Author

asvs look on the good side of noise:

taskset 6 python2 -m asv continuous -f 1.1 -E virtualenv master HEAD -b monotonic -b pad -b fill

    before     after       ratio
  [25e6a21a] [749989fa]
+   41.98ms    71.36ms      1.70  strings.Methods.time_pad
+  421.04ns   495.46ns      1.18  categoricals.IsMonotonic.time_categorical_index_is_monotonic_increasing
+   36.21μs    42.47μs      1.17  categoricals.IsMonotonic.time_categorical_series_is_monotonic_increasing
-   55.63μs    50.47μs      0.91  categoricals.IsMonotonic.time_categorical_series_is_monotonic_decreasing
-  454.94ns   342.00ns      0.75  categoricals.IsMonotonic.time_categorical_index_is_monotonic_decreasing

    before     after       ratio
  [25e6a21a] [749989fa]
+   39.64μs    48.52μs      1.22  categoricals.IsMonotonic.time_categorical_series_is_monotonic_increasing
+  307.82ns   369.50ns      1.20  categoricals.IsMonotonic.time_categorical_index_is_monotonic_decreasing
-    3.54ms     3.10ms      0.88  replace.FillNa.time_fillna(False)
-   53.65μs    44.85μs      0.84  categoricals.IsMonotonic.time_categorical_series_is_monotonic_decreasing
-   95.93ms    79.04ms      0.82  join_merge.Align.time_series_align_left_monotonic
-    1.96ms     1.49ms      0.76  replace.FillNa.time_fillna(True)
-  455.37ns   309.91ns      0.68  categoricals.IsMonotonic.time_categorical_index_is_monotonic_increasing
-   72.43ms    41.78ms      0.58  strings.Methods.time_pad

    before     after       ratio
  [25e6a21a] [749989fa]
+    1.50ms     1.87ms      1.24  replace.FillNa.time_fillna(True)
-  520.84ns   463.43ns      0.89  categoricals.IsMonotonic.time_categorical_index_is_monotonic_decreasing
-    7.20ms     6.15ms      0.86  frame_methods.Fillna.time_frame_fillna(False, 'pad')
-   61.60μs    43.60μs      0.71  categoricals.IsMonotonic.time_categorical_series_is_monotonic_decreasing
-  474.23ns   323.60ns      0.68  categoricals.IsMonotonic.time_categorical_index_is_monotonic_increasing
-   65.39ms    39.47ms      0.60  strings.Methods.time_pad
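For reading the three runs above, the ratio column is `after / before`. The sketch below recomputes it for `strings.Methods.time_pad` using the timings from the tables. The `classify` helper is hypothetical and assumes a simple threshold on the ratio with the `-f 1.1` factor; asv's own decision logic may be more involved.

```python
def classify(before, after, factor=1.1):
    """Return (ratio, verdict) for one benchmark pair, assuming a plain threshold."""
    ratio = after / before
    if ratio > factor:
        verdict = "slower"
    elif ratio < 1 / factor:
        verdict = "faster"
    else:
        verdict = "within threshold"
    return round(ratio, 2), verdict


# (before, after) timings in ms for strings.Methods.time_pad, one pair per run
time_pad_runs = [(41.98, 71.36), (72.43, 41.78), (65.39, 39.47)]

for before, after in time_pad_runs:
    print(classify(before, after))
```

The same benchmark lands on opposite sides of the threshold between the first run and the other two, which is why single-run differences of this size are treated as noise.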

@jbrockmendel jbrockmendel mentioned this pull request Aug 29, 2018
4 tasks
@jbrockmendel
Member Author

Looks like rebasing is gonna be a hassle. Before I jump into it, is the not-yet-implemented cython feature a deal-breaker? I prefer the implementation in this PR, but I'm not clear on what the consensus is.

@WillAyd
Member

WillAyd commented Sep 7, 2018

I don’t consider it a deal breaker, because I’ve seen that same copy / paste in the templates before, and it may still be there in a few instances. Just my $.02

@jreback jreback added this to the 0.24.0 milestone Sep 18, 2018
@jreback jreback merged commit 8ff8f90 into pandas-dev:master Sep 18, 2018
@jreback
Contributor

jreback commented Sep 18, 2018

thanks!

@jbrockmendel jbrockmendel deleted the temp1 branch September 18, 2018 13:52
aeltanawy pushed a commit to aeltanawy/pandas that referenced this pull request Sep 20, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Labels
Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code
5 participants