Re run marginalised gaussian notebook #4

Merged: 2 commits merged into pymc-devs:main from re-run-marginalised-gaussian on Dec 20, 2020

Conversation

MarcoGorelli
Contributor

@MarcoGorelli MarcoGorelli commented Dec 16, 2020

Use inferencedata, use pm.transforms.ordered to address non-identifiability


@MarcoGorelli MarcoGorelli force-pushed the re-run-marginalised-gaussian branch from 15f077d to d15dadc Compare December 17, 2020 17:34
@AlexAndorra
Collaborator

Is this ready for review?

@MarcoGorelli MarcoGorelli force-pushed the re-run-marginalised-gaussian branch from d15dadc to a58cdb8 Compare December 19, 2020 12:10
@MarcoGorelli
Contributor Author

@AlexAndorra yup!

Collaborator

@AlexAndorra AlexAndorra left a comment

Just a small change to make it even better, and then it'll be ready for merge 😉

@review-notebook-app
review-notebook-app bot commented Dec 19, 2020

AlexAndorra commented on 2020-12-19T12:35:44Z
----------------------------------------------------------------

Even better, use az.from_pymc3_predictions(ppc_trace, idata_orig=trace, inplace=True) to integrate the PPC samples into the original InferenceData returned by pm.sample ;)

See https://arviz-devs.github.io/arviz/api/generated/arviz.from_pymc3_predictions.html#arviz.from_pymc3_predictions and https://docs.pymc.io/notebooks/multilevel_modeling.html (cell 51)


MarcoGorelli commented on 2020-12-19T13:22:52Z
----------------------------------------------------------------

Sure, thanks! Would the idea then be to use az.plot_posterior(trace, group='predictions')? Because if so, I'm getting lots of different subplots, which I think is less clear than plot_ppc - or is there a way to make a plot_ppc-like plot using this integrated InferenceData returned by from_pymc3_predictions?

AlexAndorra commented on 2020-12-19T18:31:45Z
----------------------------------------------------------------

Mmmh no, you should be able to just do az.plot_ppc(trace); 

MarcoGorelli commented on 2020-12-19T19:26:31Z
----------------------------------------------------------------

I think I'm misunderstanding - for a small reproducible example, I tried

with pm.Model() as mcve:
    x = pm.Normal('x')
    y = pm.Normal('y', mu=x, observed=[1,2,3,4,5])
    trace = pm.sample(return_inferencedata=True)
    pp = pm.sample_posterior_predictive(trace=trace)
    trace = az.from_pymc3_predictions(pp, idata_orig=trace)
az.plot_ppc(trace)

and got

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )

TypeError: data argument must have the group "posterior_predictive" for ppcplot

AlexAndorra commented on 2020-12-19T19:54:36Z
----------------------------------------------------------------

Interesting... Does the following work?

pp = pm.sample_posterior_predictive(trace.posterior)
trace = az.from_pymc3_predictions(pp, idata_orig=trace)

(notice the .posterior on the first line)

Otherwise, does this work?

pp = pm.sample_posterior_predictive(trace=trace)
az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)

And finally you can try:

pp = pm.sample_posterior_predictive(trace.posterior)
az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)

I'm curious about which combination will work, and if that may actually reveal a bug

MarcoGorelli commented on 2020-12-20T08:04:10Z
----------------------------------------------------------------

None of them work for me. To reproduce, start with

import pymc3 as pm
import arviz as az

with pm.Model() as mcve:
    x = pm.Normal('x')
    y = pm.Normal('y', mu=x, observed=[1,2,3,4,5])
    trace = pm.sample(return_inferencedata=True)

Then, for the first code snippet you provided:

with mcve:
    pp = pm.sample_posterior_predictive(trace.posterior)
    trace = az.from_pymc3_predictions(pp, idata_orig=trace)
az.plot_ppc(trace)

100.00% [4000/4000 00:01<00:00]

TypeError                                 Traceback (most recent call last)
<ipython-input-2-6596b0316e80> in <module>
      2     pp = pm.sample_posterior_predictive(trace.posterior)
      3     trace = az.from_pymc3_predictions(pp, idata_orig=trace)
----> 4 az.plot_ppc(trace)

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )

TypeError: data argument must have the group "posterior_predictive" for ppcplot

For the second one:

with mcve:
    pp = pm.sample_posterior_predictive(trace=trace)
    az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
az.plot_ppc(trace);

100.00% [4000/4000 00:01<00:00]

TypeError                                 Traceback (most recent call last)
<ipython-input-6-d6bca9a9645d> in <module>
      2     pp = pm.sample_posterior_predictive(trace=trace)
      3     az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
----> 4 az.plot_ppc(trace);

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )

TypeError: data argument must have the group "posterior_predictive" for ppcplot

For the third one:

with mcve:
    pp = pm.sample_posterior_predictive(trace.posterior)
    az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
az.plot_ppc(trace)

100.00% [4000/4000 00:01<00:00]

TypeError                                 Traceback (most recent call last)
<ipython-input-8-2203cd440914> in <module>
      2     pp = pm.sample_posterior_predictive(trace.posterior)
      3     az.from_pymc3_predictions(pp, idata_orig=trace, inplace=True)
----> 4 az.plot_ppc(trace)

~/miniconda3/envs/pymc3-dev-py38/lib/python3.8/site-packages/arviz/plots/ppcplot.py in plot_ppc(data, kind, alpha, mean, color, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
    182     for groups in ("{}_predictive".format(group), "observed_data"):
    183         if not hasattr(data, groups):
--> 184             raise TypeError(
    185                 'data argument must have the group "{group}" for ppcplot'.format(group=groups)
    186             )

TypeError: data argument must have the group "posterior_predictive" for ppcplot

AlexAndorra commented on 2020-12-20T14:24:36Z
----------------------------------------------------------------

Ok I think I got it!

from_pymc3_predictions creates the groups predictions and predictions_constant_data because these are out-of-sample predictions.

plot_ppc needs the group posterior_predictive because it handles in-sample predictions -- all this is quite logical, I could have guessed it earlier :D

So, in your minimal example, you just have to do:

with mcve:
    pp = pm.sample_posterior_predictive(trace, keep_size=True)
trace.add_groups(posterior_predictive=pp)

Now, if you display trace you should see the tab for the posterior_predictive group ;)

And now az.plot_ppc(trace); should work!
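
For readers who want to see the group check in isolation, here is a minimal sketch that does not need PyMC3 at all. It assumes the legacy ArviZ 0.x InferenceData API (az.from_dict and InferenceData.add_groups); the random arrays are made-up stand-ins for what pm.sample and pm.sample_posterior_predictive(trace, keep_size=True) would return, and the names n_chains, n_draws and had_pp_before are only illustrative.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)
n_chains, n_draws = 2, 50
observed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Stand-in for the InferenceData returned by pm.sample(return_inferencedata=True).
# Posterior arrays are shaped (chain, draw).
idata = az.from_dict(
    posterior={"x": rng.normal(size=(n_chains, n_draws))},
    observed_data={"y": observed},
)

# plot_ppc performs this same hasattr check; it fails before the group is added.
had_pp_before = hasattr(idata, "posterior_predictive")

# Stand-in for pm.sample_posterior_predictive(trace, keep_size=True):
# a dict of arrays shaped (chain, draw, *observation_shape).
pp = {"y": rng.normal(size=(n_chains, n_draws, observed.size))}

# Attach the in-sample predictions under the group name plot_ppc looks for.
idata.add_groups(posterior_predictive=pp)

print(had_pp_before, hasattr(idata, "posterior_predictive"))
```

With both posterior_predictive and observed_data present, az.plot_ppc(idata) should pass the group check shown in the traceback. The keep_size=True part matters because add_groups expects the leading chain and draw dimensions.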

MarcoGorelli commented on 2020-12-20T21:08:04Z
----------------------------------------------------------------

Works now, thank you so much!

AlexAndorra commented on 2020-12-20T21:56:44Z
----------------------------------------------------------------

Thanks for persevering :D

@review-notebook-app

AlexAndorra commented on 2020-12-19T12:35:44Z
----------------------------------------------------------------

And then you don't need the casting to InferenceData here ;)


@AlexAndorra AlexAndorra merged commit a093bcc into pymc-devs:main Dec 20, 2020
@MarcoGorelli MarcoGorelli deleted the re-run-marginalised-gaussian branch December 21, 2020 08:04
gbrunkhorst added a commit to gbrunkhorst/pymc-examples that referenced this pull request Jan 18, 2023