Possible bug in LKJ ?? #873

Closed · agoldin opened this issue Nov 12, 2015 · 12 comments

agoldin commented Nov 12, 2015

I tried to use the LKJ distribution to sample a correlation (and then covariance) matrix for an MvNormal, and if I sample long enough I always run into a problem. It happens regularly with ~16 columns and does not happen with 4; I am not completely sure where the cutoff is.

import pymc3 as pm,numpy as np
from theano import tensor as T
import scipy as sp

def triIndex(N):
    n_elem = N * (N - 1) // 2  # number of strictly-upper-triangle elements
    tri_index = np.zeros([N, N], dtype=int)
    tri_index[np.triu_indices(N, k=1)] = np.arange(n_elem)
    tri_index[np.triu_indices(N, k=1)[::-1]] = np.arange(n_elem)
    return tri_index    


N = 100
Nf = 16

mu_actual = sp.stats.uniform.rvs(-5, 10, size=Nf)
cov_actual_sqrt = sp.stats.uniform.rvs(0, 1, size=(Nf, Nf))
cov_actual = np.dot(cov_actual_sqrt.T, cov_actual_sqrt)

x = sp.stats.multivariate_normal.rvs(mu_actual, cov_actual, size=N)

#cov_actual

np.random.seed(3264602) 
tri = triIndex(Nf)
with pm.Model() as model:
    sigma = pm.Lognormal('sigma', np.zeros(Nf), np.ones(Nf), shape=Nf)

    C_triu = pm.LKJCorr('C_triu', 1, Nf)

    # expand the flattened upper triangle into a full correlation matrix
    C = C_triu[tri]
    C = T.fill_diagonal(C, 1)

    sigma_diag = pm.Deterministic('sigma_mat', T.nlinalg.diag(sigma))
    cov = pm.Deterministic('cov', T.nlinalg.matrix_dot(sigma_diag, C, sigma_diag))
    tau = pm.Deterministic('tau', T.nlinalg.matrix_inverse(cov))
    mu = pm.MvNormal('mu', 0, tau, shape=Nf)

    x_ = pm.MvNormal('x', mu, tau, observed=x)

    s = pm.Metropolis()
    trace = pm.sample(1000, s)
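To make the index trick above concrete, here is a pure-numpy sketch (not pymc3 code) of what `triIndex` plus `fill_diagonal` does: the flattened upper triangle is expanded into a full symmetric matrix with a unit diagonal.

```python
import numpy as np

def triIndex(N):
    # same helper as above: a symmetric matrix of indices into the
    # flattened strictly-upper triangle
    n_elem = N * (N - 1) // 2
    tri_index = np.zeros([N, N], dtype=int)
    tri_index[np.triu_indices(N, k=1)] = np.arange(n_elem)
    tri_index[np.triu_indices(N, k=1)[::-1]] = np.arange(n_elem)
    return tri_index

flat = np.array([0.1, 0.2, 0.3])   # upper triangle of a 3x3 matrix
C = flat[triIndex(3)]              # fancy indexing makes a fresh array
np.fill_diagonal(C, 1.0)           # numpy analogue of T.fill_diagonal
# C is now symmetric with ones on the diagonal
```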

Symptoms of the problem:

from matplotlib import pyplot as plt
lp = [model.logp(trace[i]) for i in range(0,200)]
plt.plot(lp)

and you will see the logp trace turn nearly vertical.

tr16 = triIndex(Nf)
cr = trace[-1]['C_triu'][tr16]
np.fill_diagonal(cr,1)

np.linalg.eig(cr)[0]

array([-0.09574   , -0.01150316,  0.22189297,  1.98412337,  1.87778831,
    0.47466721,  0.57414119,  1.71681892,  1.58021031,  1.45534505,
    1.33115075,  1.20560476,  0.79869473,  1.0230756 ,  0.91071542,
    0.95301458])

Some eigenvalues of the reconstructed correlation matrix are negative, which, I suspect, is not correct.
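The eigenvalue inspection above can be packaged as a quick validity check. This is a minimal numpy sketch (not part of pymc3): a valid correlation matrix must be symmetric, have a unit diagonal, and be positive semidefinite.

```python
import numpy as np

def is_valid_corr(C, tol=1e-12):
    """Check symmetry, unit diagonal, and nonnegative eigenvalues."""
    C = np.asarray(C)
    return bool(
        np.allclose(C, C.T)
        and np.allclose(np.diag(C), 1.0)
        and np.all(np.linalg.eigvalsh(C) >= -tol)
    )

good = np.array([[1.0, 0.5], [0.5, 1.0]])
bad = np.array([[1.0, 1.2], [1.2, 1.0]])   # eigenvalues 2.2 and -0.2
print(is_valid_corr(good), is_valid_corr(bad))   # True False
```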

model.logp() for LKJ happily calculates logp for all steps in the trace, except that for the last values the logp is ~1e156, and certainly not -np.inf as an invalid matrix should give.

theano.__version__ = '0.7.0.dev-RELEASE'
Mac OS, CUDA 7.5

$nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_15:14:46_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17

Am I doing something wrong?


agoldin commented Nov 12, 2015

Looks like adding

all(gt(eigh(X)[0], 0))

to the return bound(...) call (just like in Wishart) solves the problem.
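The effect of that guard can be illustrated with a plain-numpy analogue of pymc3's `bound()` helper (a sketch, not the actual Theano code): the log-probability is only returned when every condition holds, and is -inf otherwise, so samples with non-positive-definite matrices are rejected.

```python
import numpy as np

def bound(logp, *conditions):
    """Numpy analogue of pymc3's dist_math.bound: return logp when all
    conditions hold elementwise, -inf otherwise."""
    ok = all(np.all(c) for c in conditions)
    return logp if ok else -np.inf

X = np.array([[1.0, 1.2], [1.2, 1.0]])   # not positive definite
print(bound(0.0, np.linalg.eigvalsh(X) > 0))   # -inf
```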

agoldin changed the title from "Possible bug in LJK ??" to "Possible bug in LKJ ??" Nov 12, 2015

agoldin commented Nov 13, 2015

However, when I try to run pm.guess_scaling(), I get this error:

/Users/alexey/anaconda/lib/python2.7/site-packages/theano/gradient.pyc in access_term_cache(node)
1087 str(g_shape))
1088
-> 1089 input_grads = node.op.grad(inputs, new_output_grads)
1090
1091 if input_grads is None:

AttributeError: 'EighGrad' object has no attribute 'grad'

which may or may not be related to my fix.


twiecki commented Nov 13, 2015

CC @kiudee


kiudee commented Nov 13, 2015

I just wanted to drop in to say that I will only be able to look at the problem next week.
It is certainly possible that we need to ensure that the matrix is positive semidefinite.
It could be useful to look at the implementation in Stan.

They apply the following checks to the matrix:
https://github.com/stan-dev/math/blob/master/stan/math/prim/mat/err/check_corr_matrix.hpp


agoldin commented Nov 13, 2015

Looks like they are doing a Cholesky decomposition and checking the diagonal; that is probably more efficient than computing eigenvalues.
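In numpy terms, the Cholesky-based test looks like this (a sketch of the idea, not Stan's or pymc3's code): `np.linalg.cholesky` raises `LinAlgError` on any matrix that is not positive definite, and a single factorization is cheaper than a full eigendecomposition.

```python
import numpy as np

def is_pos_def(C):
    """Positive-definiteness test via Cholesky factorization."""
    try:
        np.linalg.cholesky(C)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_pos_def(np.eye(3)))                           # True
print(is_pos_def(np.array([[1.0, 1.2], [1.2, 1.0]])))  # False
```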


twiecki commented Nov 15, 2015

I implemented a Cholesky decomposition for the Wishart in #701; that might help.


twiecki commented Dec 22, 2015

@agoldin Any interest in taking a crack at this?


agoldin commented Jan 22, 2016

I apologize, I did not see the email about your comment and I was away.

I do not understand the internals of pymc3 very well. As I wrote before, adding

all(gt(eigh(X)[0], 0))

to the return bound(...) call (just like in Wishart) solves the problem. However, it breaks the derivatives of the likelihood. I need a positive-definiteness check that does not break the gradients.

If you point me where to dig, I might try. No guarantees I will succeed though.

Thanks!


twiecki commented Jan 25, 2016

I'm not sure why the gradient for eigh doesn't work; I think it should exist. Perhaps asking the Theano devs might help?


agoldin commented Jan 26, 2016

I'll try.


agoldin commented Feb 17, 2016

I think I fixed it. I'll just figure out how to do a pull request (give me a couple of days, I am fairly busy otherwise right now).


twiecki commented Feb 17, 2016

👍

fonnesbeck pushed a commit that referenced this issue Feb 23, 2016:
"check MCMC strays into neverland with negative eigenvalues for covariance matrix. This is not good at all (see #873)."
twiecki closed this as completed Apr 12, 2016