Sample vp for Approximations, deprecate old ADVI #2027


Merged
merged 28 commits, Apr 22, 2017

Conversation


@ferrine ferrine commented Apr 12, 2017

Following #2026

@@ -108,6 +108,9 @@ def advi(vars=None, start=None, model=None, n=5000, accurate_elbo=False,
and Blei, D. M. (2016). Automatic Differentiation Variational
Inference. arXiv preprint arXiv:1603.00788.
"""
import warnings
warnings.warn('Old ADVI interface is deprecated and be removed in future, use pm.ADVI instead',
Member

and will be


ferrine commented Apr 13, 2017

Seems like lots of refactoring will be needed for init_nuts

@@ -108,6 +108,10 @@ def advi(vars=None, start=None, model=None, n=5000, accurate_elbo=False,
and Blei, D. M. (2016). Automatic Differentiation Variational
Inference. arXiv preprint arXiv:1603.00788.
"""
import warnings
warnings.warn('Old ADVI interface is deprecated and will '
'be removed in future, use pm.ADVI instead',
Member

Shouldn't we tell people to use pm.fit() instead?

Member Author

Maybe
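
(For reference, a minimal sketch of the replacement interface discussed above, assuming the pm.fit/pm.ADVI API this PR introduces; illustrative, not a verbatim excerpt:)

import numpy as np
import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal('mu', mu=0., sd=1.)
    pm.Normal('obs', mu=mu, sd=1., observed=np.array([0.1, -0.3, 0.2]))

    # new-style call replacing the deprecated pm.variational.advi(...)
    approx = pm.fit(n=10000, method='advi')

trace = approx.sample(500)  # draw from the fitted approximation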

raise NotImplementedError


class CheckLossConvergence(Callback):
Member

Very elegant.

We should also add the classic ADVI convergence criterion: https://github.com/pymc-devs/pymc3/blob/master/pymc3/variational/advi.py#L180
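
(The linked criterion, roughly: stop when the relative change of the window-averaged ELBO falls below a tolerance. A hypothetical sketch in the callback style of this PR; the class and argument names are illustrative:)

import numpy as np
from pymc3.variational.callbacks import Callback

class CheckELBOConvergence(Callback):
    def __init__(self, every=100, tolerance=1e-3):
        self.every = every
        self.tolerance = tolerance
        self.prev = None

    def __call__(self, approx, loss_hist, i):
        # only check every `every` iterations, once a full window exists
        if i % self.every or len(loss_hist) < self.every:
            return
        current = np.mean(loss_hist[-self.every:])
        if self.prev is not None:
            rel = abs(current - self.prev) / (abs(self.prev) + 1e-10)
            if rel < self.tolerance:
                raise StopIteration('Convergence achieved at %d' % i)
        self.prev = current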

Member Author

Okay


twiecki commented Apr 14, 2017

@ferrine This starts to look pretty good. Why is init_nuts trickier than what you already have here?

@fonnesbeck

Do we want to do the sample_vp renaming here?


ferrine commented Apr 14, 2017

@fonnesbeck The only option is to rename it to sample_approx; that is long, and its short name is weird.


ferrine commented Apr 14, 2017

@twiecki why do we need these strange tests for random states?


ferrine commented Apr 14, 2017

@twiecki init_nuts became a problem because of the start point used to fine-tune ADVI from MAP; I did not support that case.
I also did not have a constructor for an Inference from its Approximation. It is done badly here, but I didn't find a prettier solution.
So the refactoring adds more constructors and the start point. I also want to try an SVGD init.


fonnesbeck commented Apr 14, 2017

@ferrine sample_approx isn't too long. I think being adequately descriptive and intuitive is more important than the number of characters, in general. I thought we wanted to generalize _vp away from just variational posteriors?


ferrine commented Apr 14, 2017

@fonnesbeck good point. So the method on Approximation will be renamed to just sample, so that it neither states the obvious nor misinforms.

@fonnesbeck

@ferrine that's a good idea, I agree.


ferrine commented Apr 14, 2017

Coming back to the tests: they are failing. What are these tests for? https://github.com/pymc-devs/pymc3/blob/master/pymc3/tests/test_sampling.py#L37


twiecki commented Apr 14, 2017

That test was added when we fixed a bug where every parallel sampler generated the same samples because they all used the same seed.
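
(A toy illustration, in plain numpy rather than PyMC3 internals, of the bug that test guards against: workers that reuse one inherited seed produce identical "parallel" chains, while per-chain seeds do not:)

import numpy as np

def draw(seed):
    rng = np.random.RandomState(seed)
    return rng.standard_normal(3)

same = [draw(42) for _ in range(2)]          # identical across workers
different = [draw(s) for s in (42, 43)]      # distinct per-chain seeds
assert np.allclose(same[0], same[1])
assert not np.allclose(different[0], different[1])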


ferrine commented Apr 14, 2017

Hmm, interesting


ferrine commented Apr 14, 2017

@twiecki I can't find where I use the global numpy RNG in my code :(


twiecki commented Apr 14, 2017

@ferrine Does that test fail locally too?


ferrine commented Apr 14, 2017

Yes. As a sanity check I used the following code:

import numpy as np
import pymc3 as pm

# compare the global numpy RNG state before and after fitting and
# sampling; if anything leaked into the global RNG, the states differ
s1 = np.random.get_state()
approx = pm.fit(
    n=10, method='advi', model=nnet,  # nnet: the model under test
    callbacks=[pm.callbacks.CheckLossConvergence(every=1, window_size=2)]
)
start = approx.sample(draws=4)
cov = approx.cov.eval()
s2 = np.random.get_state()
(s1[0] == s2[0]), (s1[1] == s2[1]).all(), (s1[2] == s2[2]), (s1[3] == s2[3])
--------------
(True, True, True, True)


twiecki commented Apr 14, 2017

Good thing we have that test.

@ferrine ferrine mentioned this pull request Apr 19, 2017

ferrine commented Apr 20, 2017

I've added the Inference module to the docs; how can I check that it will be displayed?

@fonnesbeck

Will there be a progress bar for sample_vp?


ferrine commented Apr 20, 2017

Do you mean sample_approx? There is no need: sampling is vectorized and done only once for the whole sample size.
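
(A rough sketch of why: a MeanField-style approximation produces the whole sample in one vectorized call, so there is no per-draw loop to report progress on. Stand-in parameters, plain numpy:)

import numpy as np

mu, sd = np.zeros(3), np.ones(3)   # stand-in variational parameters
draws = 1000
# all draws in a single vectorized operation, not draw by draw
z = mu + sd * np.random.standard_normal((draws, mu.size))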


ferrine commented Apr 21, 2017

What about merging this?

mapping {model_variable -> local_variable}
local_rv : dict[var->tuple]
Experimental for Empirical Distribution
mapping {model_variable -> local_variable (:math:`\\mu`, math:`\\rho`)}
Member Author

typo: `math:` should be `:math:`

@fonnesbeck fonnesbeck merged commit 2e3caa1 into pymc-devs:master Apr 22, 2017
@fonnesbeck

Looks great, thanks!

v_params = pm.variational.advi(n=n_init, start=start,
random_seed=random_seed)
cov = np.power(model.dict_to_array(v_params.stds), 2)
approx = pm.MeanField(model=model, start=start)
Member

I'm getting meaningful differences with this change on this model: http://twiecki.github.io/blog/2017/03/14/random-walk-deep-net/

It does not abort when converged and sampling from this initialization is very slow.

Member Author

Do you see any way to improve this?

Member Author

@twiecki It was suggested that I try a running-minimum stopping criterion. It will be interesting to see how it affects performance.
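
(One way such a running-minimum criterion might look; a hypothetical sketch in the callback style of this PR, not the eventual implementation. It stops when the loss has not improved on its running minimum for `patience` consecutive checks:)

import numpy as np
from pymc3.variational.callbacks import Callback

class RunningMinimumStop(Callback):
    def __init__(self, every=100, patience=10):
        self.every = every
        self.patience = patience
        self.best = np.inf      # running minimum of the loss
        self.stale = 0          # checks since the minimum last improved

    def __call__(self, approx, loss_hist, i):
        if i % self.every:
            return
        current = loss_hist[-1]
        if current < self.best:
            self.best, self.stale = current, 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                raise StopIteration('No improvement in %d checks' % self.stale)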
