DisconnectedType Error when computing gradient of stacklists #379


Closed
jangevaare opened this issue Nov 6, 2013 · 7 comments

@jangevaare

Hello. I have a model here that runs fine with MH sampling but encounters errors with the NUTS and standard HMC sampling methods. This example is probably unusual in its use of theano.tensor.stacklists, which may somehow relate to the error. My use of a constant in ex_tau may also be relevant.

Here's the model:

import pymc as pm
import numpy as np
from theano.tensor import stacklists, sqrt, constant

# True covariance and mean of the data-generating distribution
cov = np.array([[2, 1], [1, 3]])
mean = [2, 7]
tau_choleskyroot = np.linalg.cholesky(np.linalg.inv(cov))

# Simulate N observations and flatten them into a single vector
N = 1
z_data = np.ndarray.flatten(np.random.multivariate_normal(mean, cov, N))

def ex_tau(a11_sqd, a12, a22_sqd):
    # Precision matrix parameterized as A.T A, with A upper triangular
    ex_A = stacklists([[sqrt(a11_sqd), a12],
                       [constant(0, dtype='float64'), sqrt(a22_sqd)]])
    return ex_A.T.dot(ex_A)

with pm.Model() as model:
    a11_sqd = pm.Gamma('a11_sqd', alpha=2, beta=2)
    a22_sqd = pm.Gamma('a22_sqd', alpha=2, beta=2)
    a12 = pm.Normal('a12', mu=0, tau=1)
    z = pm.MvNormal('z', mu=mean, Tau=ex_tau(a11_sqd, a12, a22_sqd),
                    shape=2, observed=z_data)
    start = pm.find_MAP()

with model:
    step = pm.NUTS()
    trace = pm.sample(3000, step, start)
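
For reference, ex_tau parameterizes the precision matrix as A.T A with A upper triangular, so any positive a11_sqd and a22_sqd yield a symmetric positive-definite tau. A quick NumPy sketch (hypothetical parameter values, purely to illustrate the construction):

import numpy as np

# Hypothetical values, purely illustrative
a11_sqd, a12, a22_sqd = 1.5, 0.3, 2.0
A = np.array([[np.sqrt(a11_sqd), a12],
              [0.0, np.sqrt(a22_sqd)]])
tau = A.T.dot(A)
print(tau)                     # symmetric by construction
print(np.linalg.eigvals(tau))  # strictly positive, so tau is positive definite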

Thank you for taking a look at this! This error seems similar to the one dealt with in #336, but I am definitely using the latest versions of Theano (and pymc). Here are the full details of the error when using NUTS:

(edited since I accidentally included the error twice)

---------------------------------------------------------------------------
AsTensorError                             Traceback (most recent call last)
<ipython-input-1-c16baa5bbcef> in <module>()
     23 
     24 with model:
---> 25     step = pm.NUTS()
     26     trace = pm.sample(3000, step, start)

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/step_methods/nuts.pyc in __init__(self, vars, scaling, step_scale, is_cov, state, Emax, target_accept, gamma, k, t0, model)
     57 
     58         if isinstance(scaling, dict):
---> 59             scaling = guess_scaling(Point(scaling, model=model), model=model)
     60 
     61 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/tuning/scaling.pyc in guess_scaling(point, model)
     77 def guess_scaling(point, model=None):
     78     model = modelcontext(model)
---> 79     h = find_hessian_diag(point, model=model)
     80     return adjust_scaling(h)
     81 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/tuning/scaling.pyc in find_hessian_diag(point, vars, model)
     72     """
     73     model=modelcontext(model)
---> 74     H = compilef(hessian_diag(model.logp, vars))
     75     return H(Point(point, model=model))
     76 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/memoize.pyc in memoizer(*args, **kwargs)
     12 
     13         if key not in cache:
---> 14             cache[key] = obj(*args, **kwargs)
     15 
     16         return cache[key]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in hessian_diag(f, vars)
     92         vars = cont_inputs(f)
     93 
---> 94     return -t.concatenate([hessian_diag1(f, v) for v in vars], axis=0)
     95 
     96 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in hessian_diag1(f, v)
     84         return gradient1(g[i], v)[i]
     85 
---> 86     return theano.map(hess_ii, idx)[0]
     87 
     88 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/scan_module/scan_views.pyc in map(fn, sequences, non_sequences, truncate_gradient, go_backwards, mode, name)
     66                      go_backwards=go_backwards,
     67                      mode=mode,
---> 68                      name=name)
     69 
     70 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/scan_module/scan.pyc in scan(fn, sequences, outputs_info, non_sequences, n_steps, truncate_gradient, go_backwards, mode, name, profile)
    730     # and outputs that needs to be separated
    731 
--> 732     condition, outputs, updates = scan_utils.get_updates_and_outputs(fn(*args))
    733     if condition is not None:
    734         as_while = True

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in hess_ii(i)
     82 
     83     def hess_ii(i):
---> 84         return gradient1(g[i], v)[i]
     85 
     86     return theano.map(hess_ii, idx)[0]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in gradient1(f, v)
     41 def gradient1(f, v):
     42     """flat gradient of f wrt v"""
---> 43     return t.flatten(t.grad(f, v, disconnected_inputs='warn'))
     44 
     45 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected)
    526 
    527     rval = _populate_grad_dict(var_to_app_to_idx,
--> 528             grad_dict, wrt, cost_name)
    529 
    530     for i in xrange(len(rval)):

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
   1101         return grad_dict[var]
   1102 
-> 1103     rval = [access_grad_cache(elem) for elem in wrt]
   1104 
   1105     return rval

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_grad_cache(var)
   1061                     for idx in node_to_idx[node]:
   1062 
-> 1063                         term = access_term_cache(node)[idx]
   1064 
   1065                         if not isinstance(term, gof.Variable):

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_term_cache(node)
    783             inputs = node.inputs
    784 
--> 785             output_grads = [access_grad_cache(var) for var in node.outputs]
    786 
    787             # list of bools indicating if each output is connected to the cost

[... the access_grad_cache / access_term_cache pair of frames above repeats many times as Theano recurses back through the graph; repeated frames elided ...]
/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_grad_cache(var)
   1061                     for idx in node_to_idx[node]:
   1062 
-> 1063                         term = access_term_cache(node)[idx]
   1064 
   1065                         if not isinstance(term, gof.Variable):

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_term_cache(node)
    922                                 str(g_shape))
    923 
--> 924                 input_grads = node.op.grad(inputs, new_output_grads)
    925 
    926                 if input_grads is None:

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/basic.pyc in grad(self, inputs, g_outputs)
   3208         """Join the gradients along the axis that was used to split x."""
   3209         _, axis, n = inputs
-> 3210         return [join(axis, *g_outputs),
   3211                 grad_undefined(self, 1, axis),
   3212                 grad_undefined(self, 2, n)]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gof/op.pyc in __call__(self, *inputs, **kwargs)
    397         """
    398         return_list = kwargs.pop('return_list', False)
--> 399         node = self.make_node(*inputs, **kwargs)
    400         if self.add_stack_trace_on_call:
    401             self.add_tag_trace(node)

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/basic.pyc in make_node(self, *axis_and_tensors)
   3379         if not tensors:
   3380             raise ValueError('Cannot join an empty list of tensors')
-> 3381         as_tensor_variable_args = [as_tensor_variable(x) for x in tensors]
   3382 
   3383         dtypes = [x.type.dtype for x in as_tensor_variable_args]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/basic.pyc in as_tensor_variable(x, name, ndim)
    157         if not isinstance(x.type, TensorType):
    158             raise AsTensorError(
--> 159                 "Variable type field must be a TensorType.", x, x.type)
    160 
    161         if ndim is None:

AsTensorError: ('Variable type field must be a TensorType.', <DisconnectedType>, <theano.gradient.DisconnectedType object at 0x11c4b0590>) 

@twiecki (Member) commented Nov 6, 2013

Looks indeed very similar to #336 and Theano/Theano#1534.

I suppose hierarchical.py runs fine for you then?

Maybe ask the Theano guys if @jsalvatier doesn't know?

@jangevaare (Author)

Yes, hierarchical.py runs without a hitch.

@jsalvatier (Member)

I think this may be related to taking derivatives of stacklists:

from theano import function
from theano.tensor import dscalar, stacklists, sum
from pymc.theanof import hessian_diag1

a = dscalar('a')
# Summing the stacklists result forces the gradient to flow back through it
b = sum(stacklists([[a, 1], [2, 1]]))
g = hessian_diag1(b, a)

f = function([a], [g])

Compiling this emits disconnected-input warnings:

/home/john/Documents/workspace/Theano/theano/gradient.py:512: UserWarning: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: a
  handle_disconnected(elem)
/home/john/Documents/workspace/Theano/theano/gradient.py:532: UserWarning: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: <DisconnectedType>
  handle_disconnected(rval[i])

I think asking the theano list is a good idea. I will try to dig into this a bit.
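
In the meantime, one possible way to sidestep stacklists (a sketch only, not tested against this Theano version) is to fill an explicit zero matrix with set_subtensor, which should keep every entry connected to the graph:

import theano.tensor as tt

def ex_tau_alt(a11_sqd, a12, a22_sqd):
    # Build the upper-triangular factor without stacklists by filling
    # an explicit 2x2 zero matrix entry by entry
    A = tt.zeros((2, 2))
    A = tt.set_subtensor(A[0, 0], tt.sqrt(a11_sqd))
    A = tt.set_subtensor(A[0, 1], a12)
    A = tt.set_subtensor(A[1, 1], tt.sqrt(a22_sqd))
    return A.T.dot(A)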

@jsalvatier (Member)

I finally got an actual reproduction:

from theano.tensor import dvector, stacklists, sum
from theano.gradient import hessian

def ex_tau(a, b):
    ex_A = stacklists([a, b])
    return ex_A.T.dot(ex_A)

a = dvector('a')
b = dvector('b')

# Taking second derivatives through stacklists triggers the error
hessian(sum(ex_tau(a, b)), [a, b])

@twiecki (Member) commented Dec 4, 2013

@jangevaare Can you check whether the model works for you now with the most recent Theano?

@jangevaare (Author)

@twiecki Works great! NUTS still seems to be having some problems, though (no errors are triggered). Take a look at these traceplots for the different step methods.

[traceplot images: Metropolis (mh_stacklists), HMC (hmc_stacklists), NUTS (nuts_stacklists)]

And here's the code...

import pymc as pm
import numpy as np
from theano.tensor import stacklists, sqrt, constant

# True covariance and mean of the data-generating distribution
cov = np.array([[2, 1], [1, 3]])
mean = [2, 7]
tau_choleskyroot = np.linalg.cholesky(np.linalg.inv(cov))

N = 1
z_data = np.ndarray.flatten(np.random.multivariate_normal(mean, cov, N))

def ex_tau(a11_sqd, a12, a22_sqd):
    # Precision matrix parameterized as A.T A, with A upper triangular
    ex_A = stacklists([[sqrt(a11_sqd), a12],
                       [constant(0, dtype='float64'), sqrt(a22_sqd)]])
    return ex_A.T.dot(ex_A)

with pm.Model() as model:
    a11_sqd = pm.Gamma('a11_sqd', alpha=2, beta=2)
    a22_sqd = pm.Gamma('a22_sqd', alpha=2, beta=2)
    a12 = pm.Normal('a12', mu=0, tau=1)
    z = pm.MvNormal('z', mu=mean, tau=ex_tau(a11_sqd, a12, a22_sqd),
                    shape=2, observed=z_data)

with model:
    start = pm.find_MAP()

# Sample with each step method from the same starting point
with model:
    stepNUTS = pm.NUTS()
    traceNUTS = pm.sample(3000, stepNUTS, start)

with model:
    stepHMC = pm.HamiltonianMC()
    traceHMC = pm.sample(3000, stepHMC, start)

with model:
    stepMH = pm.Metropolis()
    traceMH = pm.sample(3000, stepMH, start)

pm.traceplot(traceNUTS);
pm.traceplot(traceHMC);
pm.traceplot(traceMH);

@jangevaare (Author)

I will close this and open a separate issue for the NUTS behavior, since I feel the two problems are unrelated.
