Re run marginalised gaussian notebook #4

MarcoGorelli · 2020-12-16T18:16:20Z

Use InferenceData, and use pm.transforms.ordered to address non-identifiability.
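
For context, here is a minimal sketch of the non-identifiability in question, written with the standard library only rather than PyMC3. A Gaussian mixture's likelihood does not change when the component labels are swapped, so the labelings are indistinguishable unless something like an ordering constraint on the means picks one. The observations, weights, and the helper name mixture_loglike are illustrative assumptions, not taken from the notebook.

```python
import math


def mixture_loglike(data, weights, means, sigma=1.0):
    """Log-likelihood of data under a Gaussian mixture with a shared sigma."""
    norm = sigma * math.sqrt(2 * math.pi)
    total = 0.0
    for x in data:
        # Marginalise over the component label: weighted sum of densities.
        density = sum(
            w * math.exp(-0.5 * ((x - m) / sigma) ** 2) / norm
            for w, m in zip(weights, means)
        )
        total += math.log(density)
    return total


data = [-2.1, -1.9, 0.2, 1.8, 2.3]  # made-up observations

# Same mixture, with the two component labels swapped.
ll_original = mixture_loglike(data, weights=[0.3, 0.7], means=[-2.0, 2.0])
ll_swapped = mixture_loglike(data, weights=[0.7, 0.3], means=[2.0, -2.0])

labels_interchangeable = math.isclose(ll_original, ll_swapped)
print(labels_interchangeable)
```

Requiring the means to be increasing rules out the swapped labeling, which is the role an ordered transform plays in the model.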

review-notebook-app · 2020-12-16T18:16:24Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

AlexAndorra · 2020-12-19T10:57:35Z

Is this ready for review?

MarcoGorelli · 2020-12-19T12:11:24Z

@AlexAndorra yup!

AlexAndorra

Just a small change to make it even better, and then it'll be ready for merge 😉

review-notebook-app · 2020-12-19T12:35:44Z

AlexAndorra commented on 2020-12-19T12:35:44Z
----------------------------------------------------------------

Even better, use az.from_pymc3_predictions(ppc_trace, idata_orig=trace, inplace=True) to integrate the PPC samples into the original InferenceData returned by pm.sample ;)

See https://arviz-devs.github.io/arviz/api/generated/arviz.from_pymc3_predictions.html#arviz.from_pymc3_predictions and https://docs.pymc.io/notebooks/multilevel_modeling.html (cell 51)

MarcoGorelli commented on 2020-12-19T13:22:52Z
----------------------------------------------------------------

Sure, thanks! Would the idea then be to use az.plot_posterior(trace, group='predictions')? Because if so, I'm getting lots of different subplots, which I think is less clear than plot_ppc - or is there a way to make a plot_ppc-like plot using this integrated InferenceData returned by from_pymc3_predictions?

AlexAndorra commented on 2020-12-19T18:31:45Z
----------------------------------------------------------------

Mmmh no, you should be able to just do az.plot_ppc(trace);

MarcoGorelli commented on 2020-12-19T19:26:31Z
----------------------------------------------------------------

I think I'm misunderstanding - for a small reproducible example, I tried

with pm.Model() as mcve:
    x = pm.Normal('x')
    y = pm.Normal('y', mu=x, observed=[1,2,3,4,5])
    trace = pm.sample(return_inferencedata=True)
    pp = pm.sample_posterior_predictive(trace=trace)
    trace = az.from_pymc3_predictions(pp, idata_orig=trace)
az.plot_ppc(trace)

and got

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )
TypeError: data argument must have the group "posterior_predictive" for ppcplot

AlexAndorra commented on 2020-12-19T19:54:36Z
----------------------------------------------------------------

Interesting... Does the following work?

pp = pm.sample_posterior_predictive(trace.posterior)
trace = az.from_pymc3_predictions(pp, idata_orig=trace)

(notice the .posterior on the first line)

Otherwise, does this work?

pp = pm.sample_posterior_predictive(trace=trace)
az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)

And finally you can try:

pp = pm.sample_posterior_predictive(trace.posterior)
az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)

I'm curious about which combination will work, and if that may actually reveal a bug

MarcoGorelli commented on 2020-12-20T08:04:10Z
----------------------------------------------------------------

None of them work for me. To reproduce, start with

import pymc3 as pm
import arviz as az

with pm.Model() as mcve:
    x = pm.Normal('x')
    y = pm.Normal('y', mu=x, observed=[1,2,3,4,5])
    trace = pm.sample(return_inferencedata=True)

Then, for the first code snippet you provided:

with mcve:
    pp = pm.sample_posterior_predictive(trace.posterior)
    trace = az.from_pymc3_predictions(pp, idata_orig=trace)
az.plot_ppc(trace)

100.00% [4000/4000 00:01<00:00]

TypeError                                 Traceback (most recent call last)
<ipython-input-2-6596b0316e80> in <module>
      2     pp = pm.sample_posterior_predictive(trace.posterior)
      3     trace = az.from_pymc3_predictions(pp, idata_orig=trace)
----> 4 az.plot_ppc(trace)

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )
TypeError: data argument must have the group "posterior_predictive" for ppcplot

For the second one:

with mcve:
    pp = pm.sample_posterior_predictive(trace=trace)
    az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
az.plot_ppc(trace);

100.00% [4000/4000 00:01<00:00]

TypeError                                 Traceback (most recent call last)
<ipython-input-6-d6bca9a9645d> in <module>
      2     pp = pm.sample_posterior_predictive(trace=trace)
      3     az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
----> 4 az.plot_ppc(trace);

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )
TypeError: data argument must have the group "posterior_predictive" for ppcplot

For the third one:

with mcve:
    pp = pm.sample_posterior_predictive(trace.posterior)
    az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
az.plot_ppc(trace)

100.00% [4000/4000 00:01<00:00]

TypeError                                 Traceback (most recent call last)
<ipython-input-8-2203cd440914> in <module>
      2     pp = pm.sample_posterior_predictive(trace.posterior)
      3     az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
----> 4 az.plot_ppc(trace)

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )
TypeError: data argument must have the group "posterior_predictive" for ppcplot

AlexAndorra commented on 2020-12-20T14:24:36Z
----------------------------------------------------------------

Ok I think I got it!

from_pymc3_predictions creates the groups predictions and predictions_constant_data because these are out-of-sample predictions.

plot_ppc needs the group posterior_predictive because it handles in-sample predictions -- all this is quite logical, I could have guessed it earlier :D

So, in your minimal example, you just have to do:

with mcve:
    pp = pm.sample_posterior_predictive(trace, keep_size=True)
trace.add_groups(posterior_predictive=pp)

Now, if you display trace you should see the tab for the posterior_predictive group ;)

And now az.plot_ppc(trace); should work!
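
To illustrate the group mismatch, here is a stdlib-only sketch of the check visible in the traceback earlier in this thread (ppcplot.py, lines 182-186). SimpleNamespace is only a stand-in for InferenceData, and missing_ppc_groups is a hypothetical helper, not part of ArviZ.

```python
from types import SimpleNamespace


def missing_ppc_groups(data, group="posterior"):
    """Return the groups that the plot_ppc check would report as missing."""
    required = ("{}_predictive".format(group), "observed_data")
    return [name for name in required if not hasattr(data, name)]


# Stand-in for the object after from_pymc3_predictions: it carries a
# "predictions" group but no "posterior_predictive" group.
with_predictions = SimpleNamespace(posterior={}, observed_data={}, predictions={})

# Stand-in for the object after add_groups(posterior_predictive=...).
with_ppc = SimpleNamespace(posterior={}, observed_data={}, posterior_predictive={})

print(missing_ppc_groups(with_predictions))  # ['posterior_predictive']
print(missing_ppc_groups(with_ppc))  # []
```

The check looks groups up by name, so samples stored under predictions are not seen by plot_ppc even though they were drawn the same way.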

MarcoGorelli commented on 2020-12-20T21:08:04Z
----------------------------------------------------------------

Works now, thank you so much!

AlexAndorra commented on 2020-12-20T21:56:44Z
----------------------------------------------------------------

Thanks for persevering :D

review-notebook-app · 2020-12-19T12:35:45Z

AlexAndorra commented on 2020-12-19T12:35:44Z
----------------------------------------------------------------

And then you don't need the casting to InferenceData here ;)

MarcoGorelli force-pushed the re-run-marginalised-gaussian branch from 15f077d to d15dadc Compare December 17, 2020 17:34

rerun marginalised gaussian mixture model

a58cdb8

MarcoGorelli force-pushed the re-run-marginalised-gaussian branch from d15dadc to a58cdb8 Compare December 19, 2020 12:10

AlexAndorra requested changes Dec 19, 2020

View reviewed changes

use az.add_group

b91d20b

AlexAndorra approved these changes Dec 20, 2020

View reviewed changes

AlexAndorra merged commit a093bcc into pymc-devs:main Dec 20, 2020

MarcoGorelli deleted the re-run-marginalised-gaussian branch December 21, 2020 08:04

gbrunkhorst added a commit to gbrunkhorst/pymc-examples that referenced this pull request Jan 18, 2023

Revision pymc-devs#4 based on comments

8fd895c
