DisconnectedType Error when computing gradient of stacklists #379


Closed
jangevaare opened this issue Nov 6, 2013 · 7 comments

@jangevaare

Hello. I have a model here that runs fine with MH sampling but encounters errors with the NUTS and standard HMC sampling methods. This example is probably unusual in its use of theano.tensor.stacklists, which may somehow relate to the error. My use of a constant in ex_tau may also be relevant.

Here's the model:

import pymc as pm
import numpy as np
from theano.tensor import stacklists, sqrt, constant

# True covariance and mean of the data-generating distribution
cov = np.array([[2, 1], [1, 3]])
mean = [2, 7]
tau_choleskyroot = np.linalg.cholesky(np.linalg.inv(cov))

# Simulate N observations and flatten them into a single vector
N = 1
z_data = np.ndarray.flatten(np.random.multivariate_normal(mean, cov, N))

def ex_tau(a11_sqd, a12, a22_sqd):
    # Precision matrix parameterized as A.T A, with A upper triangular
    ex_A = stacklists([[sqrt(a11_sqd), a12],
                       [constant(0, dtype='float64'), sqrt(a22_sqd)]])
    return ex_A.T.dot(ex_A)

with pm.Model() as model:
    a11_sqd = pm.Gamma('a11_sqd', alpha=2, beta=2)
    a22_sqd = pm.Gamma('a22_sqd', alpha=2, beta=2)
    a12 = pm.Normal('a12', mu=0, tau=1)
    z = pm.MvNormal('z', mu=mean, Tau=ex_tau(a11_sqd, a12, a22_sqd),
                    shape=2, observed=z_data)
    start = pm.find_MAP()

with model:
    step = pm.NUTS()
    trace = pm.sample(3000, step, start)
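
For reference, ex_tau parameterizes the precision matrix as A.T A with A upper triangular, so any positive a11_sqd and a22_sqd yield a symmetric positive-definite tau. A quick NumPy sketch (hypothetical parameter values, purely to illustrate the construction):

import numpy as np

# Hypothetical values, purely illustrative
a11_sqd, a12, a22_sqd = 1.5, 0.3, 2.0
A = np.array([[np.sqrt(a11_sqd), a12],
              [0.0, np.sqrt(a22_sqd)]])
tau = A.T.dot(A)
print(tau)                     # symmetric by construction
print(np.linalg.eigvals(tau))  # strictly positive, so tau is positive definite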

Thank you for taking a look at this! This error seems similar to the one dealt with in #336, but I am definitely using the latest versions of Theano (and pymc). Here are the full details of the error when using NUTS:

(edited since I accidentally included the error twice)

---------------------------------------------------------------------------
AsTensorError                             Traceback (most recent call last)
<ipython-input-1-c16baa5bbcef> in <module>()
     23 
     24 with model:
---> 25     step = pm.NUTS()
     26     trace = pm.sample(3000, step, start)

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/step_methods/nuts.pyc in __init__(self, vars, scaling, step_scale, is_cov, state, Emax, target_accept, gamma, k, t0, model)
     57 
     58         if isinstance(scaling, dict):
---> 59             scaling = guess_scaling(Point(scaling, model=model), model=model)
     60 
     61 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/tuning/scaling.pyc in guess_scaling(point, model)
     77 def guess_scaling(point, model=None):
     78     model = modelcontext(model)
---> 79     h = find_hessian_diag(point, model=model)
     80     return adjust_scaling(h)
     81 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/tuning/scaling.pyc in find_hessian_diag(point, vars, model)
     72     """
     73     model=modelcontext(model)
---> 74     H = compilef(hessian_diag(model.logp, vars))
     75     return H(Point(point, model=model))
     76 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/memoize.pyc in memoizer(*args, **kwargs)
     12 
     13         if key not in cache:
---> 14             cache[key] = obj(*args, **kwargs)
     15 
     16         return cache[key]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in hessian_diag(f, vars)
     92         vars = cont_inputs(f)
     93 
---> 94     return -t.concatenate([hessian_diag1(f, v) for v in vars], axis=0)
     95 
     96 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in hessian_diag1(f, v)
     84         return gradient1(g[i], v)[i]
     85 
---> 86     return theano.map(hess_ii, idx)[0]
     87 
     88 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/scan_module/scan_views.pyc in map(fn, sequences, non_sequences, truncate_gradient, go_backwards, mode, name)
     66                      go_backwards=go_backwards,
     67                      mode=mode,
---> 68                      name=name)
     69 
     70 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/scan_module/scan.pyc in scan(fn, sequences, outputs_info, non_sequences, n_steps, truncate_gradient, go_backwards, mode, name, profile)
    730     # and outputs that needs to be separated
    731 
--> 732     condition, outputs, updates = scan_utils.get_updates_and_outputs(fn(*args))
    733     if condition is not None:
    734         as_while = True

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in hess_ii(i)
     82 
     83     def hess_ii(i):
---> 84         return gradient1(g[i], v)[i]
     85 
     86     return theano.map(hess_ii, idx)[0]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pymc/theanof.pyc in gradient1(f, v)
     41 def gradient1(f, v):
     42     """flat gradient of f wrt v"""
---> 43     return t.flatten(t.grad(f, v, disconnected_inputs='warn'))
     44 
     45 

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected)
    526 
    527     rval = _populate_grad_dict(var_to_app_to_idx,
--> 528             grad_dict, wrt, cost_name)
    529 
    530     for i in xrange(len(rval)):

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
   1101         return grad_dict[var]
   1102 
-> 1103     rval = [access_grad_cache(elem) for elem in wrt]
   1104 
   1105     return rval

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_grad_cache(var)
   1061                     for idx in node_to_idx[node]:
   1062 
-> 1063                         term = access_term_cache(node)[idx]
   1064 
   1065                         if not isinstance(term, gof.Variable):

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_term_cache(node)
    783             inputs = node.inputs
    784 
--> 785             output_grads = [access_grad_cache(var) for var in node.outputs]
    786 
    787             # list of bools indicating if each output is connected to the cost

[... the access_grad_cache / access_term_cache pair of frames above repeats many times as Theano recurses back through the graph; repeated frames elided ...]
/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_grad_cache(var)
   1061                     for idx in node_to_idx[node]:
   1062 
-> 1063                         term = access_term_cache(node)[idx]
   1064 
   1065                         if not isinstance(term, gof.Variable):

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gradient.pyc in access_term_cache(node)
    922                                 str(g_shape))
    923 
--> 924                 input_grads = node.op.grad(inputs, new_output_grads)
    925 
    926                 if input_grads is None:

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/basic.pyc in grad(self, inputs, g_outputs)
   3208         """Join the gradients along the axis that was used to split x."""
   3209         _, axis, n = inputs
-> 3210         return [join(axis, *g_outputs),
   3211                 grad_undefined(self, 1, axis),
   3212                 grad_undefined(self, 2, n)]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/gof/op.pyc in __call__(self, *inputs, **kwargs)
    397         """
    398         return_list = kwargs.pop('return_list', False)
--> 399         node = self.make_node(*inputs, **kwargs)
    400         if self.add_stack_trace_on_call:
    401             self.add_tag_trace(node)

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/basic.pyc in make_node(self, *axis_and_tensors)
   3379         if not tensors:
   3380             raise ValueError('Cannot join an empty list of tensors')
-> 3381         as_tensor_variable_args = [as_tensor_variable(x) for x in tensors]
   3382 
   3383         dtypes = [x.type.dtype for x in as_tensor_variable_args]

/Users/justin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/theano/tensor/basic.pyc in as_tensor_variable(x, name, ndim)
    157         if not isinstance(x.type, TensorType):
    158             raise AsTensorError(
--> 159                 "Variable type field must be a TensorType.", x, x.type)
    160 
    161         if ndim is None:

AsTensorError: ('Variable type field must be a TensorType.', <DisconnectedType>, <theano.gradient.DisconnectedType object at 0x11c4b0590>) 

@twiecki (Member) commented Nov 6, 2013

Looks indeed very similar to #336 and Theano/Theano#1534.

I suppose hierarchical.py runs fine for you then?

Maybe ask the Theano guys if @jsalvatier doesn't know?

@jangevaare (Author)

Yes, hierarchical.py runs without a hitch.

@jsalvatier (Member)

I think this may be related to taking derivatives of stacklists:

from theano import function
from theano.tensor import dscalar, stacklists, sum
from pymc.theanof import hessian_diag1

a = dscalar('a')
# Summing the stacklists result forces the gradient to flow back through it
b = sum(stacklists([[a, 1], [2, 1]]))
g = hessian_diag1(b, a)

f = function([a], [g])

Compiling this emits disconnected-input warnings:

/home/john/Documents/workspace/Theano/theano/gradient.py:512: UserWarning: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: a
  handle_disconnected(elem)
/home/john/Documents/workspace/Theano/theano/gradient.py:532: UserWarning: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: <DisconnectedType>
  handle_disconnected(rval[i])

I think asking the theano list is a good idea. I will try to dig into this a bit.
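
In the meantime, one possible way to sidestep stacklists (a sketch only, not tested against this Theano version) is to fill an explicit zero matrix with set_subtensor, which should keep every entry connected to the graph:

import theano.tensor as tt

def ex_tau_alt(a11_sqd, a12, a22_sqd):
    # Build the upper-triangular factor without stacklists by filling
    # an explicit 2x2 zero matrix entry by entry
    A = tt.zeros((2, 2))
    A = tt.set_subtensor(A[0, 0], tt.sqrt(a11_sqd))
    A = tt.set_subtensor(A[0, 1], a12)
    A = tt.set_subtensor(A[1, 1], tt.sqrt(a22_sqd))
    return A.T.dot(A)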

@jsalvatier (Member)

I finally got an actual reproduction:

from theano.tensor import dvector, stacklists, sum
from theano.gradient import hessian

def ex_tau(a, b):
    ex_A = stacklists([a, b])
    return ex_A.T.dot(ex_A)

a = dvector('a')
b = dvector('b')

# Taking second derivatives through stacklists triggers the error
hessian(sum(ex_tau(a, b)), [a, b])

@twiecki (Member) commented Dec 4, 2013

@jangevaare Can you check whether the model works for you now with the most recent Theano?

@jangevaare (Author)

@twiecki Works great! NUTS still seems to be having some problems, though (no errors are triggered). Take a look at these traceplots for the different step methods.

[traceplot images: Metropolis (mh_stacklists), HMC (hmc_stacklists), NUTS (nuts_stacklists)]

And here's the code...

import pymc as pm
import numpy as np
from theano.tensor import stacklists, sqrt, constant

# True covariance and mean of the data-generating distribution
cov = np.array([[2, 1], [1, 3]])
mean = [2, 7]
tau_choleskyroot = np.linalg.cholesky(np.linalg.inv(cov))

N = 1
z_data = np.ndarray.flatten(np.random.multivariate_normal(mean, cov, N))

def ex_tau(a11_sqd, a12, a22_sqd):
    # Precision matrix parameterized as A.T A, with A upper triangular
    ex_A = stacklists([[sqrt(a11_sqd), a12],
                       [constant(0, dtype='float64'), sqrt(a22_sqd)]])
    return ex_A.T.dot(ex_A)

with pm.Model() as model:
    a11_sqd = pm.Gamma('a11_sqd', alpha=2, beta=2)
    a22_sqd = pm.Gamma('a22_sqd', alpha=2, beta=2)
    a12 = pm.Normal('a12', mu=0, tau=1)
    z = pm.MvNormal('z', mu=mean, tau=ex_tau(a11_sqd, a12, a22_sqd),
                    shape=2, observed=z_data)

with model:
    start = pm.find_MAP()

# Sample with each step method from the same starting point
with model:
    stepNUTS = pm.NUTS()
    traceNUTS = pm.sample(3000, stepNUTS, start)

with model:
    stepHMC = pm.HamiltonianMC()
    traceHMC = pm.sample(3000, stepHMC, start)

with model:
    stepMH = pm.Metropolis()
    traceMH = pm.sample(3000, stepMH, start)

pm.traceplot(traceNUTS);
pm.traceplot(traceHMC);
pm.traceplot(traceMH);

@jangevaare (Author)

I will close this and open a separate issue for the NUTS behavior, since I feel the two problems are unrelated.
