Skip to content

Working with sampled pymc3 parameter values #601

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
thouska opened this issue Sep 2, 2014 · 8 comments
Closed

Working with sampled pymc3 parameter values #601

thouska opened this issue Sep 2, 2014 · 8 comments

Comments

@thouska
Copy link

thouska commented Sep 2, 2014

I've started to work with pymc2.3 to calibrate some complex extern models (with the @pm.deterministic decorator) and I can say that pymc is awesome. But as i've really a lot of parameters to calibrate i wanted to test the new NUTS sampler of pymc3. Here I'm struggeling how to work with the sampled parameter values. I think the sampler of pymc2 creates floats, which is exactly what i need for my complex model. pymc3 seems to create some 'pymc3.model.FreeRV' theano.tensor things during the sampling and I do not know how to work with them. Is it somehow possible to transform the sampled values into floats?

Some example code, where i want to have float paramters x and y to feed my complex model with:

import pymc3 as pm
import numpy as np
#import complexmodel

true_data=[1,2,3]

def complexmodel(x,y):
    print type(x)
    float_x=float(x)
    float_y=float(y)
    #simulations=complexmodel(float_x,float_y)
    return simulations

with pm.Model() as model:
    x = pm.Normal('x', mu=20., sd=10)
    y = pm.Normal('y', mu=1., sd=1)   
    z = pm.Normal('z', mu=complexmodel(x,y), sd=.075, observed=true_data)

with model:
    start = pm.find_MAP()
    step = pm.NUTS()

with model:
    trace = pm.sample(3000, step, start)

pm.traceplot(trace)

This gives me the following error:

>>> runfile(.../PYMC3settings.py', wdir=r'...')
<class 'pymc3.model.FreeRV'>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\...\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File ".../PYMC3settings.py", line 25, in <module>
    z = pm.Normal('z', mu=complexmodel(x,y), sd=.075, observed=true_data)
  File ".../PYMC3settings.py", line 17, in complexmodel
    float_x=float(x)
TypeError: float() argument must be a string or a number

Thanks for some hints!

PS.: I would also be very happy about an implementation of the DREAM sampler in pymc3 (#585).
I could not bring the multichain_mcmc package to work with pymc2.3.

@fonnesbeck
Copy link
Member

It looks like you are trying to cast a PyMC object to a float, which will not work. Are you trying to come up with some sort of mixture of the two stochastics? If so, you should be looking at using a Deterministic object instead.

@thouska
Copy link
Author

thouska commented Sep 3, 2014

Thank you for your fast answer!
I want to run my model (a geoscience model) with the stochastics x and y. Those stochastics tell the model how to be set the parameters in the model. My model takes just float values, so I have to get get sampled floats for x and y.
I guess this is not a sort of mixture oft the stochastics but the Deterministic object seems to be a good idea.
In pymc2.3 i'm doing something like this to analyse my model:

import pymc2 as pm
import numpy as np

def complexmodel(x,y):
    # just an stupid example, but i actually have to check the sampled variables first, 
    # to prevent that the sampled stochastics run out of the physical boundaries of my complex model 
    # (this would result in an error, stopping the whole run)

    print float(x)    #Here it works to get float values
    if x<0:
        return [-np.inf]*3
    if y<0:
        return [-np.inf]*3
    else:
        return [x,y,3]

class model(object):
    true_data=[1,2,3]

    x = pm.Normal("x", mu=20, tau=10**-2)
    y = pm.Normal("y", mu=1, tau=1**-2)

    @pm.deterministic
    def testmodel(x=x,y=y):   
        return complexmodel(x,y)

    like = pm.Normal("like", mu=testmodel, tau=0.075**-2, value=true_data, observed=True)
    modelvars=dict(x=x,y=y,testmodel=testmodel,like=like)


M = pm.MCMC(model)
M.sample(iter=1000,burn=500)
plot(M.trace('x')[:])
plot(M.trace('y')[:])

This code works fine. And i can get float values from x and y. How do I translate this code into pymc3?
If I use your proposed deterministic object in pymc3, would it give me x and y in a way that they can be transformed into floats?
The the deterministic class in pymc 3 is not documented yet, so I'm not sure how to use it correct.
I tried something out:

import pymc3 as pm
import numpy as np


true_data=[1,2,3]

def complexmodel(x,y):
    print float(x)
    if x<0:
        return [-np.inf,y,3]
    if y<0:
        return [x,-np.inf,3]
    else:    
        return [x,y,3]

with pm.Model() as model:
    x = pm.Normal('x', mu=20., sd=10)
    y = pm.Normal('y', mu=1., sd=1)
    testmodel = pm.Deterministic('testmodel',[x,y], model=complexmodel)
    z = pm.Normal('z', mu=testmodel, sd=0.075, observed=true_data)

with model:
    start = pm.find_MAP()
    step = pm.NUTS()

with model:
    trace = pm.sample(3000, step, start)

trace[y].shape
pm.traceplot(trace)

But this gives me an error:

testmodel = pm.Deterministic('testmodel',[x,y], model=complexmodel)
  File "...\pymc3\model.py", line 363, in Deterministic
    var.name = name
AttributeError: 'list' object has no attribute 'name'

So my main point is, how to translate the pymc2.3 example into a working pymc3 style where I can analyse my complex model with floats of the stochastics x and y and many more stochastics?

@thouska
Copy link
Author

thouska commented Sep 4, 2014

testmodel = pm.Deterministic('testmodel',complexmodel(x,y), model=None) seems to work, but I still do not know how to access the values of x and y during the sample to make a float out of them.
Something like x.tag.test_value which returns in my case 20.0 would help.
Reading a bit about theano, I found that x.eval() should give me the sampled value of x, but instead I get another error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File ".../PYMC3settings.py", line 36, in <module>
    testmodel = pm.Deterministic('testmodel',complexmodel(x,y), model=None)
  File ".../PYMC3settings.py", line 25, in complexmodel
    print x.eval()
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\gof\graph.py", line 420, in eval
    self._fn = theano.function(self._fn_inputs, self)
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\compile\function.py", line 223, in function
    profile=profile)
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\compile\pfunc.py", line 512, in pfunc
    on_unused_input=on_unused_input)
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\compile\function_module.py", line 1311, in orig_function
    on_unused_input=on_unused_input).create(
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\compile\function_module.py", line 1007, in __init__
    fgraph, additional_outputs = std_fgraph(inputs, outputs, accept_inplace)
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\compile\function_module.py", line 132, in std_fgraph
    fgraph = gof.fg.FunctionGraph(orig_inputs, orig_outputs)
  File "...\lib\site-packages\theano-0.6.0-py2.7.egg\theano\gof\fg.py", line 128, in __init__
    self.__import_r__(outputs, reason="init")
  File "...lib\site-packages\theano-0.6.0-py2.7.egg\theano\gof\fg.py", line 256, in __import_r__
    raise MissingInputError("Undeclared input", r)
theano.gof.fg.MissingInputError: ('Undeclared input', x)

Isn´t there any possibility to access the values of the sampled stochastics?

@fonnesbeck
Copy link
Member

You aren't using Deterministic correctly. This is just a wrapper for a simple deterministic transformation of PyMC variables, such as:

x = Normal('x', 0, 1)
x2 = Deterministic('x2', x**2)

The model argument should not be used when inside of a model context. Your complexmodel function should return a PyMC (Theano) variable,

Then, it the return value should be passed to the deterministic, not the function itself:

testmodel = pm.Deterministic('testmodel', complexmodel(x,y))

Not surprised that you are confused, as we have not written up the documentation yet (but we are working towards it).

@thouska
Copy link
Author

thouska commented Sep 8, 2014

Thanks to @fonnesbeck! Now i´ve understand what this pm.deterministic does. Unfortunatelly it´s not what i searched for. My complexmodel does return a list of simlated values and not a theano variable.

What i search for is an algorithm which fits my external complexmodel with some parameters (here they are called stochastics) to observed data. PyMC2.3 does this job very good and I have the feeling that PyMC3 could do it even better. But I don´t know how to do it.

To show what i exactly want, I made my example from above more explicite:

values=[]
def complexmodel(x,y):
    #Call an external model here, e.g. (https://github.com/sahg/https://github.com/sahg/PyTOPKAPI)
    #The PyTOPKAPI model can not work with Theano variables and needs x and y floats, so   
    #something has to happen here, to transform the theano variables x and y to floats.
    Discharge = PyTOPKAPI.run('model-simulation.ini',x,y) 
    values.append(x,y) # Just to see what is going on
    return Discharge

with pm.Model() as model:
    x = pm.Normal('x', mu=20, sd=10)
    y = pm.Normal('y', mu=1, sd=1)
    z = pm.Normal('z', mu=complexmodel(x,y), sd=0.075, observed=true_data)

I would expect that my complexmodel function gets called in every step of the sampling (s.o.). Here i´m not sure if this is really happening. To check this, i store the stochastics x and y in a list values in the complexmodel function. But after the sampling, i found just one x and one y in the values list, instead of 3000.
To calculate the likelihood z the function needs to be called 3000 times, doesn´t it?
Why contains my list valuesonly one x and one y instance?

@fonnesbeck
Copy link
Member

Check out the approach used in #507, which allows you to make arbitrary deterministics compatible with PyMC 3's Theano internals. In particular, see the @theano.compile.ops.as_op decorator.

@thouska
Copy link
Author

thouska commented Sep 18, 2014

Thanks for this great hint! My code is working now:

import pymc3 as pm
import numpy as np
import theano
import theano.tensor as t

true_data=[1,2]

values=[]

@theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar],otypes=[t.dvector])
def complexmodel(x,y):
    print float(x)
    return np.array([x,y])

with pm.Model() as model:
    x = pm.Exponential('x', lam=1)
    y = pm.Exponential('y', lam=1)
    complexmodel=complexmodel(x,y)
    z = pm.Poisson('z', complexmodel, observed=true_data)

with model:
    #start = pm.find_MAP()
    start ={'x':1,'y':2.5}
    #step =pm.NUTS()    
    step = pm.Metropolis()

with model:
    trace = pm.sample(3000, step, start)

trace[y].shape
pm.traceplot(trace)

Do you plan, to bring gradient based samplers to work with this, or maybe a more straightforward, trick? When I use for example your NUTS sampler as a step method I get an error:

> 0.693147182465
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File ".../PYMC3settings.py", line 137, in <module>
    step =pm.NUTS()    
  File "...\site-packages\pymc3\step_methods\nuts.py", line 59, in __init__
    scaling = guess_scaling(Point(scaling, model=model), model=model, vars = vars)
  File "...\site-packages\pymc3\tuning\scaling.py", line 79, in guess_scaling
    h = find_hessian_diag(point, vars, model=model)
  File "...\site-packages\pymc3\tuning\scaling.py", line 74, in find_hessian_diag
    H = model.fastfn(hessian_diag(model.logpt, vars))
  File "...\site-packages\pymc3\memoize.py", line 14, in memoizer
    cache[key] = obj(*args, **kwargs)
  File "...\site-packages\pymc3\theanof.py", line 94, in hessian_diag
    return -t.concatenate([hessian_diag1(f, v) for v in vars], axis=0)
  File "...\site-packages\pymc3\theanof.py", line 80, in hessian_diag1
    g = gradient1(f, v)
  File "...\site-packages\pymc3\theanof.py", line 43, in gradient1
    return t.flatten(t.grad(f, v, disconnected_inputs='warn'))
  File "...\site-packages\theano-0.6.0-py2.7.egg\theano\gradient.py", line 529, in grad
    grad_dict, wrt, cost_name)
  File "...\site-packages\theano-0.6.0-py2.7.egg\theano\gradient.py", line 1213, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "...\site-packages\theano-0.6.0-py2.7.egg\theano\gradient.py", line 1173, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "...\site-packages\theano-0.6.0-py2.7.egg\theano\gradient.py", line 1034, in access_term_cache
    input_grads = node.op.grad(inputs, new_output_grads)
AttributeError: 'FromFunctionOp' object has no attribute 'grad'

@twiecki
Copy link
Member

twiecki commented Sep 18, 2014

The only way to get the automatic gradient computation is by expressing your density in terms of theano operators. as_op creates a blackbox function for which autodiff will not work so there is no way I know of (except numerical differentiation) to make this work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants