Metropolis fails to converge on simple model (PyMC 3) #358


Closed
fonnesbeck opened this issue Oct 9, 2013 · 14 comments

@fonnesbeck (Member)

I have attempted to replicate a very simple partial pooling model from Gelman and Hill in PyMC 3, but I find that the Metropolis step method fails to converge even after a large number of iterations. NUTS seems to handle it fine, so there appears to be an issue with Metropolis, perhaps related to adaptation?

Here is a link to a zip file containing the model and data.

@fonnesbeck (Member, Author)

I have verified that the equivalent model in PyMC 2 converges to reasonable values under Metropolis (using Gelman and Hill's result as the gold standard), so something appears to be up with the PyMC 3 implementation.

Edited: figured out the NUTS issue

@jsalvatier (Member)

Any update?

@ghost assigned fonnesbeck Dec 4, 2013
@fonnesbeck (Member, Author)

No, still a mystery.

@bjedwards (Contributor)

Does the model still work? I get an AsTensorError on line 49, when trying to call y_hat = a[county]. I am running Theano off the GitHub dev branch.

@aflaxman (Contributor)

aflaxman commented Dec 8, 2013

@bjedwards it was giving me this error also, and adding county = county.astype(int) before setting up the model made the error go away. I can confirm that Metropolis is not converging.
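A minimal reproduction of that workaround (the float values below are toy stand-ins for the county column read from the CSV):

```python
import numpy as np

# Columns read from a CSV often come back as floats, and Theano will not
# accept a float array as an index, hence the AsTensorError on a[county].
county = np.array([0., 1., 1., 2., 2., 2.])

# Cast to an integer dtype before building the model.
county = county.astype(int)

print(county.dtype.kind)  # prints 'i' (signed integer)
```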

[trace plot: Metropolis still not converging]

@fonnesbeck do you have a PyMC 2 version for comparison?

@aflaxman (Contributor)

Given long enough, it seems to work for me. I think I know what the issue is, but I'll know for sure once I see what PyMC2 is doing, per @fonnesbeck's comparison:

[trace plot: traces eventually settling, given enough iterations]

@fonnesbeck (Member, Author)

Sorry. Will have a go at this tonight.

@aflaxman (Contributor)

No worries. I suspected that the PyMC2 defaults would actually correspond more closely to a CompoundStep of Metropolis steps in PyMC3. However, that doesn't converge any faster:

with partial_pooling:
    steps = []
    steps.append(pm.Metropolis(vars=[mu_a], tune_interval=1000))
    steps.append(pm.Metropolis(vars=[sigma_a], tune_interval=1000))
    steps.append(pm.Metropolis(vars=[a], tune_interval=1000))
    steps.append(pm.Metropolis(vars=[sigma_y], tune_interval=1000))

    partial_pooling_samples = pm.sample(10000, pm.CompoundStep(steps))

[trace plot: CompoundStep traces, still converging slowly]

@fonnesbeck (Member, Author)

The following implementation in PyMC2 seems to work, using AdaptiveMetropolis for sampling the random intercepts:

from pymc import Normal, Uniform, Lambda
import numpy as np

def partial_pooling():
    # county (integer index per measurement) and log_radon come from
    # the Gelman & Hill radon data set

    # Priors
    mu_a = Normal('mu_a', mu=0., tau=0.0001, value=0)
    sigma_a = Uniform('sigma_a', lower=0, upper=100, value=1.)
    tau_a = sigma_a**-2

    # Random intercepts
    a = Normal('a', mu=mu_a, tau=tau_a, value=np.zeros(len(set(county))))

    # Model error
    sigma_y = Uniform('sigma_y', lower=0, upper=100)
    tau_y = sigma_y**-2

    # Expected value
    y_hat = Lambda('y_hat', lambda a=a: a[county])

    # Data likelihood
    y_like = Normal('y_like', mu=y_hat, tau=tau_y, value=log_radon, observed=True)

    return locals()

Going to go back and try again in PyMC3.

@twiecki (Member)

twiecki commented Aug 17, 2014

This works fine for me on this branch: #587. Please reopen if the problem persists.

@twiecki twiecki closed this as completed Aug 17, 2014
@lwahedi

lwahedi commented Mar 13, 2017

I ran into some trouble with this and wanted to post a note for later readers who are also having trouble. I made toy data for a fairly simple partial pooling model based on the one discussed here. NUTS converges on the true values in not that many iterations (a few hundred), but each iteration is very slow (I'm also having serious stalling issues with NUTS on my actual data). Metropolis, by contrast: for the first ten thousand or so iterations it settles over 0; at around 40k it is finally centered over the true value; at 100k it is still a flat-topped curve whose peak isn't over the true value, with credible intervals that still cross 0, and this is with a pretty large effect size in the toy data. At 150k it starts to tighten, but still isn't really there yet for all the parameters. Even so, 150k Metropolis iterations run faster in wall-clock time than a couple hundred NUTS iterations.
TL;DR: Metropolis needs a few hundred thousand iterations to converge on this model.
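For intuition on why random-walk Metropolis needs so many iterations here, a NumPy-only toy (not PyMC code): a strongly correlated 2-D Gaussian stands in for the correlated posterior over intercepts and hyperparameters, and an isotropic proposal must take tiny steps to get accepted, so the chain crawls along the ridge.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy target: zero-mean 2-D Gaussian with correlation 0.99, a stand-in
# for the strongly correlated hierarchical posterior in this model.
rho = 0.99
prec = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))

def logp(x):
    return -0.5 * x @ prec @ x

x = np.zeros(2)
n_iter = 20000
accepted = 0
samples = np.empty((n_iter, 2))
for i in range(n_iter):
    proposal = x + rng.normal(scale=0.5, size=2)  # isotropic random walk
    if np.log(rng.uniform()) < logp(proposal) - logp(x):
        x = proposal
        accepted += 1
    samples[i] = x

accept_rate = accepted / n_iter
# Low acceptance plus tiny effective steps -> the running mean of the
# chain approaches the true value (0, 0) only very slowly.
```

Shrinking the proposal scale raises acceptance but shortens each move, which is the trade-off Metropolis tuning is fighting; NUTS sidesteps it by using gradients to move along the ridge.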

@twiecki (Member)

twiecki commented Mar 13, 2017

Thanks @lwahedi. Have you used ADVI initialization? I would also set tune=100 for NUTS. You can also investigate the sampler stats to identify difficult regions: http://pymc-devs.github.io/pymc3/notebooks/Diagnosing_biased_Inference_with_Divergences.html

@fonnesbeck (Member, Author)

@lwahedi If you just run trace = sample(2000) (for example), PyMC will automatically select NUTS and initialize it with ADVI.

@lwahedi

lwahedi commented Mar 14, 2017

Thanks for the tips. ADVI does make this particular model run much faster. It still stalls when I add a probit transformation on the outcome, and I've now learned that Metropolis crashes the kernel at somewhere between 30k and 50k iterations on the extended model too, so that wasn't the solution I'd hoped it would be. Either way, this is off topic for this issue; I have a StackExchange question with more detail here. It may be a specification issue; I'm not sure what's going on.
