inferencedata.log_likelihood is summing observations #5236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

ricardoV94 opened this issue Dec 3, 2021 · 2 comments · Fixed by #5245

Comments


ricardoV94 commented Dec 3, 2021

When talking with @lucianopaz I realized we completely broke log_likelihood computation in V4.

```python
import pymc as pm

with pm.Model() as m:
    y = pm.Normal("y")
    x = pm.Normal("x", y, 1, observed=[5, 2])
    idata = pm.sample(tune=5, draws=5, chains=2)

print(idata.log_likelihood['x'].values.shape)
# (2, 5, 1)
```

Whereas in V3:

```python
import pymc3 as pm

with pm.Model() as m:
    y = pm.Normal("y")
    x = pm.Normal("x", y, 1, observed=[5, 2])
    idata = pm.sample(tune=5, draws=5, chains=2, return_inferencedata=True)

print(idata.log_likelihood['x'].values.shape)
# (2, 5, 2)
```
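As a plain NumPy illustration of the difference between the two shapes (not PyMC code): the per-observation log-likelihood has shape `(chains, draws, observations) = (2, 5, 2)`, and summing across the observation axis with `keepdims` collapses it to the buggy `(2, 5, 1)`:

```python
import numpy as np

# Stand-in for the elemwise log-likelihood: (chains, draws, observations)
elemwise = np.zeros((2, 5, 2))

# Summing over observations (keepdims=True) yields the reported bad shape
summed = elemwise.sum(axis=-1, keepdims=True)
print(summed.shape)  # (2, 5, 1)
```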

This happened because model.logpt now returns the summed logp by default, whereas before it returned the vectorized (elemwise) logp by default. The change was made in 0a172c8.

Although that is a saner default, we have to reintroduce an easy helper, logp_elemwiset (which I think is pretty much broken right now as well), that calls logpt with sum=False.

Also, in this case we might want to just return the logprob terms as the dictionary items returned by aeppl.factorized_joint_logprob and let the end user decide how they want to combine them. That dictionary maps {value variable: logp term}. The default of calling at.add on all variables when sum=False is seldom useful (that's why we switched the default), due to potential unwanted broadcasting across variables with different dimensions.
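A toy NumPy illustration (not PyMC code) of why adding elemwise logp terms of different shapes is dangerous: broadcasting silently succeeds and produces a combined array of an unintended shape instead of raising an error.

```python
import numpy as np

logp_a = np.zeros(2)        # elemwise logp for a variable with 2 observations
logp_b = np.zeros((3, 1))   # elemwise logp for a variable with shape (3, 1)

combined = logp_a + logp_b  # broadcasts instead of raising
print(combined.shape)       # (3, 2) -- not a meaningful joint logp
```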

One extra advantage of returning the dictionary items is that we don't need to create nearly duplicated graphs for each observed variable when computing the log-likelihood here:

```python
cached = [(var, self.model.fn(logpt(var))) for var in self.model.observed_RVs]
```

We can request the logp terms for any number of observed variables at the same time, and then compile a single function that has each variable's logp term as an output but otherwise shares the common nodes, saving on compilation, computation, and memory footprint when a model has more than one observed variable.
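A rough sketch of the idea, with plain Python/NumPy standing in for compiled Aesara functions (all names here are hypothetical): one function returns every observed variable's elemwise logp, and the shared parent term is evaluated once rather than once per duplicated graph.

```python
import numpy as np

def normal_logp(value, mu, sigma):
    """Elemwise log-density of a Normal, written out by hand."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((value - mu) / sigma) ** 2

def make_loglike_fn(obs_x, obs_z):
    """Hypothetical single function with one output per observed variable.

    The shared node (`mu`, i.e. the value of `y`) is computed once and
    reused by both logp outputs, instead of being rebuilt in two
    nearly duplicated graphs.
    """
    def loglike(point):
        mu = point["y"]  # common node shared by both logp terms
        return {
            "x": normal_logp(obs_x, mu, 1.0),
            "z": normal_logp(obs_z, mu, 1.0),
        }
    return loglike

fn = make_loglike_fn(np.array([5.0, 2.0]), np.array([0.0, 1.0, 2.0]))
out = fn({"y": 0.3})
print(out["x"].shape, out["z"].shape)  # (2,) (3,)
```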

For instance, this nested loop would no longer be needed:

pymc/pymc/backends/arviz.py, lines 276 to 282 at fe2d101:

```python
for var, log_like_fun in cached:
    for k, chain in enumerate(trace.chains):
        log_like_chain = [
            self.log_likelihood_vals_point(point, var, log_like_fun)
            for point in trace.points([chain])
        ]
        log_likelihood_dict.insert(var.name, np.stack(log_like_chain), k)
```

CC @OriolAbril

@ricardoV94 ricardoV94 added the bug label Dec 3, 2021
@ricardoV94 ricardoV94 added this to the v4.0.0-beta1 (vNext) milestone Dec 3, 2021
@ricardoV94 ricardoV94 added v4 trace-backend Traces and ArviZ stuff labels Dec 3, 2021
@ricardoV94 ricardoV94 changed the title inferencedata.log_likelihood is accumulated across observations inferencedata.log_likelihood is summing observations Dec 3, 2021

OriolAbril commented Dec 4, 2021

> this nested loop would no longer be needed

That was originally the goal in #4489. My Aesara knowledge was (and still is) very limited, so after a while of being stuck @brandonwillard took over and kept the nested loops. It seems like the description already outlines a clear path forward, but he might also have extra insight on this.

@brandonwillard (Contributor) commented:

> That was originally the goal in #4489. My Aesara knowledge was (and still is) very limited so after a while of being stuck @brandonwillard took over and kept the nested loops. It seems like the description already outlines a clear path forward but he might also have extra insight on this.

A lot has changed since my original v4 branch, so I can't imagine that many/any considerations based on #4489 will be relevant now. Regardless, if the description above is correct, it would appear as though the problem is due to other changes that now require an update to this logic.

Aside from that, @ricardoV94 seems to be proposing some potential paths for improvement. If so, they involve design decisions that need to be considered carefully by the people responsible for making them.

If there are any questions about how the basic machinery works, don't hesitate to ask; otherwise, I don't know how else I can help here.
