Interestingly, I ran this notebook on my laptop and compared the two versions using my fast_sample_posterior_predictive (see #3597), which is vectorized. It also slows down with pm.Data, but its overall time cost is not as bad:
Without pm.Data:
1.66 s ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) -- sample_posterior_predictive
1.54 ms ± 91.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) -- fast spp
With pm.Data:
31.5 s ± 1.19 s per loop (mean ± std. dev. of 7 runs, 1 loop each) -- sample_posterior_predictive
13.1 ms ± 412 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) -- fast spp
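To put the numbers above side by side, here is a quick sketch of the slowdown factors. It assumes the two unlabeled timings under "With pm.Data" are in the same order as the labeled ones (sample_posterior_predictive first, fast spp second); the variable names are just for illustration.

```python
# Mean timings copied from the measurements above, converted to seconds.
# Assumption: the "With pm.Data" timings are listed in the same order
# as the "Without pm.Data" ones.
spp_plain = 1.66       # sample_posterior_predictive, without pm.Data
spp_data = 31.5        # sample_posterior_predictive, with pm.Data
fast_plain = 1.54e-3   # fast spp, without pm.Data
fast_data = 13.1e-3    # fast spp, with pm.Data

# Slowdown factor introduced by pm.Data for each sampling function.
spp_slowdown = spp_data / spp_plain
fast_slowdown = fast_data / fast_plain

print(f"sample_posterior_predictive slowdown: {spp_slowdown:.1f}x")
print(f"fast spp slowdown: {fast_slowdown:.1f}x")
```

So both versions slow down with pm.Data, but the vectorized one by a smaller factor and from a much smaller baseline.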
I don't know theano very well. Am I right in believing that this means it does not do constant folding?
I suspect that is expected: many optimizations are possible only if the length of an array is known, and Theano can't anticipate when a shared array will change its length. In PyMC3, however, we do know that it will stay constant during inference. I'm not sure whether there is a way to exploit that. We could ask the Theano developers.
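For anyone else who doesn't know Theano well, here is a toy sketch of what "constant folding" means and why a shared value would block it. This is not Theano code and not how its optimizer is implemented; `Constant`, `Shared`, `Add`, `Mul`, `fold`, and `evaluate` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Constant:
    """A value fixed when the graph is built."""
    value: float


@dataclass
class Shared:
    """A value that may be changed after the graph is built."""
    value: float


@dataclass
class Add:
    left: "Node"
    right: "Node"


@dataclass
class Mul:
    left: "Node"
    right: "Node"


Node = Union[Constant, Shared, Add, Mul]


def evaluate(node: Node) -> float:
    """Compute the current value of a graph."""
    if isinstance(node, (Constant, Shared)):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    return left + right if isinstance(node, Add) else left * right


def fold(node: Node) -> Node:
    """Replace every subgraph made only of constants by a single Constant."""
    if isinstance(node, (Constant, Shared)):
        return node
    left, right = fold(node.left), fold(node.right)
    if isinstance(left, Constant) and isinstance(right, Constant):
        # Both inputs are fixed, so the result can be computed once, up front.
        return Constant(evaluate(type(node)(left, right)))
    # At least one input may change later, so the operation has to stay.
    return type(node)(left, right)


# With constants only, the whole graph collapses to a single value.
const_graph = Add(Mul(Constant(2.0), Constant(3.0)), Constant(1.0))
folded_const = fold(const_graph)

# With a shared value, nothing can be precomputed.
x = Shared(2.0)
shared_graph = Add(Mul(x, Constant(3.0)), Constant(1.0))
folded_shared = fold(shared_graph)

print(folded_const)
print(folded_shared)
```

In the toy model the shared graph has to be re-evaluated on every call because `x.value` could have changed in between, which is the flexibility pm.Data is there to provide.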
Description of your problem
Hi,
I noticed a pretty significant performance hit when using Theano shared variables. Please correct me if I'm doing something wrong. If this is a bug, I am happy to dig into it a bit more if someone can point me in the right direction.
Please provide a minimal, self-contained, and reproducible example.
This returns:
1.66 s ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
When I do the same thing but use Theano shared variables, I see the performance hit:
This results in:
31.7 s ± 498 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
My notebook for this is here
Versions and main components