Can't set symbolic initval #4911

Closed
michaelosthege opened this issue Aug 7, 2021 · 15 comments · Fixed by #4912

@michaelosthege
Member

Description of your problem

Found by @ricardoV94, also see #4867 (comment)

Please provide a minimal, self-contained, and reproducible example.

import pymc3 as pm
with pm.Model() as m:
    x = pm.Normal('x')
    y = pm.Normal('y', x, 1, initval=x)

Please provide the full traceback.

Complete error traceback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-54-6c83ccbaecb6> in <module>
      2 with pm.Model() as m:
      3     x = pm.Normal('x')
----> 4     y = pm.Normal('y', x, 1, initval=x)

c:\...\pymc3\distributions\distribution.py in __new__(cls, name, rng, dims, initval, observed, total_size, transform, *args, **kwargs)
    222         if initval is not None:
    223             # Assigning the testval earlier causes trouble because the RV may not be created with the final shape already.
--> 224             rv_out.tag.test_value = initval
    225 
    226         rv_out = model.register_rv(

~\...\aesara\graph\utils.py in __setattr__(self, attr, obj)
    265 
    266         if getattr(self, "attr", None) == attr:
--> 267             obj = self.attr_filter(obj)
    268 
    269         return object.__setattr__(self, attr, obj)

~\...\aesara\tensor\type.py in filter(self, data, strict, allow_downcast)
    112         # input (typical mistake, especially with shared variables).
    113         if isinstance(data, Variable):
--> 114             raise TypeError(
    115                 "Expected an array-like object, but found a Variable: "
    116                 "maybe you are trying to call a function on a (possibly "

TypeError: Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?

Please provide any additional information below.

Versions and main components

  • PyMC3 Version: main
  • Aesara: 2.1.2
@brandonwillard
Contributor

Initial values are supposed to be non-symbolic by definition.

@michaelosthege
Member Author

Bad phrasing on my part then.
What fails is the automatic conversion of variables passed in via the initval kwarg into numeric initial values.

@brandonwillard
Contributor

brandonwillard commented Aug 7, 2021

Bad phrasing on my part then.
What fails is the automatic conversion of variables passed in via the initval kwarg into numeric initial values.

That's still not a failure of any sort. The interface was neither designed nor intended to do that. What you're describing is a very specific sort of additional functionality.

@fonnesbeck
Member

@michaelosthege wouldn't you just set the x and y to the same (non-symbolic) initval? What's the benefit of symbolic initvals?

@michaelosthege
Member Author

We continued the discussion on Slack, so let me elaborate:

I'm working towards an API that allows us to bring back moment-based initial values (#4752) where they give more robustness than random draws.
In v3 those were mostly based on moments such as mean/mode/... and therefore they'll be symbolic at first.

So at some point between their conception in a distribution class such as HalfFlat/StudentT/... and the Model.initial_values dict they'll have to be evaluated into a numeric value.
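To illustrate the step described above, here is a minimal, purely hypothetical sketch (plain Python, no PyMC3/Aesara) of how symbolic, moment-based initial values could be resolved into numbers: RVs are visited in creation order, so each moment can be computed from the already-resolved numeric values of its parents. The function and data layout are made up for illustration.

```python
# Hypothetical sketch (not PyMC3's actual implementation): resolve
# moment-based initial values into numbers by walking RVs in creation
# order, computing each symbolic moment from the numeric initial
# values of its already-resolved parents.

def resolve_initial_values(rvs):
    """rvs: list of (name, initval) in creation order, where initval is
    either a number or a callable taking the dict of resolved values."""
    resolved = {}
    for name, initval in rvs:
        if callable(initval):
            # a "symbolic" moment, e.g. the mean of a parent-dependent dist
            resolved[name] = initval(resolved)
        else:
            resolved[name] = initval
    return resolved


# x has a fixed initval; y's initval is its mean, which depends on x
rvs = [
    ("x", 0.5),
    ("y", lambda vals: vals["x"]),  # mean of Normal(y; mu=x, sigma=1)
]
print(resolve_initial_values(rvs))  # {'x': 0.5, 'y': 0.5}
```

In the real code base this resolution would happen somewhere between the distribution classes and the Model.initial_values dict, as described above.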

#4867 currently implements that by changing two lines in Model.set_initval to utilize existing code.
This implementation however changes the signature of Distribution.__new__/Model.register_rv/set_initval such that Variable is a legal type for the initval kwarg.
Some concerns were expressed about allowing initval: Variable at the user-level API.

Personally I don't see it as a problem though, because I can picture valid use cases, for example with correlated parameters.
Thinking about it, I now wonder if we can even stick with numeric initial values, when at the same time the corresponding RVs can change their shape 🤔

@brandonwillard
Contributor

brandonwillard commented Aug 8, 2021

I'm working towards an API that allows us to bring back moment-based initial values (#4752) where they give more robustness than random draws.
In v3 those were mostly based on moments such as mean/mode/... and therefore they'll be symbolic at first.

That functionality was based on the defaults attribute in Distribution and was specified at the class-level via a string-based constructor argument (e.g. "mean", "median", and "mode" for Continuous).

In other words, that functionality and interface has no clear connection with this feature request. If one wanted to re-implement that functionality, there's no reason to believe that this issue's feature is necessary, or even relevant.

Here's an example based on @fonnesbeck's implicit suggestion:

import pymc3 as pm


def get_initial_value(x):
    # Look up the numeric initial value already registered for x
    model = pm.modelcontext(None)
    return model.initial_values[model.rvs_to_values[x]]


with pm.Model() as m:
    x = pm.Normal('x')
    y = pm.Normal('y', x, 1, initval=get_initial_value(x))

Using this approach, the interface stays simple. Also, this hypothetical get_initial_value function can be extended to include more sophisticated things, and—the best part—this functionality doesn't need to exist within (or operate through) unrelated core classes like Distribution!

So at some point between their conception in a distribution class such as HalfFlat/StudentT/... and the Model.initial_values dict they'll have to be evaluated into a numeric value.

Let's not forget that initial values are only relevant to models and the objects that model them (i.e. Models). Distribution.__new__ only handles initial values because it doubles as an automatic means of constructing a Model.

We do not want Model-specific logic/handling to bleed into Distribution.__new__, and your approach goes in exactly that direction.

#4867 currently implements that by changing two lines in Model.set_initval to utilize existing code.
This implementation however changes the signature of Distribution.__new__/Model.register_rv/set_initval such that Variable is a legal type for the initval kwarg.
Some concerns were expressed about allowing initval: Variable at the user-level API.

First off, what is the relevance/point of this issue if you've already made the changes necessary for this functionality in that diff? The changes you made in that diff were to the Model.set_initval method and have nothing to do with the Distribution.__new__ interface that defines this issue. This modified Model.set_initval can be used internally to accomplish your goals without any modifications to the meaning/intent of initval in Distribution.__new__ (or anywhere else for that matter).

Also, it looks like your "two line" change will actually involve quite a bit more than just two lines, because now callers will have to deal with the exceptions raised by aesara.function when—for example—an initval graph doesn't have initial values for all its inputs or contains objects that aren't tracked by the Model. These are the kinds of things that need to be considered more carefully when exposing internal functionality/interfaces.

N.B: The code your changes are using in Model.set_initval is exactly the code one would use to implement the get_initial_value function proposed above. For that matter, Model.set_initval itself could be used to accomplish the same thing. Better yet, that evaluation logic could be factored out of Model.set_initval and made available as its own method.

Personally I don't see it as a problem though, because I can picture valid use cases, for example with correlated parameters.

Again, do those use cases necessitate—or even imply—the changes you're proposing? The example above demonstrates how one could easily compute and set a non-symbolic initial value at the user-level without internal changes, and I'm sure there are other approaches that are just as simple or simpler (e.g. using Model.set_initval directly).
Actually, allowing Distribution.__new__'s initval argument to take symbolic values would only be a minor shortcut for the same approach—at best.

Thinking about it, I now wonder if we can even stick with numeric initial values, when at the same time the corresponding RVs can change their shape

What? Initial values are just that; they're a single set of values that are valid for the graphs produced by a Model. Breaking that consistency would involve altering a model retroactively/in-place or changing other initial values, and that would require recomputation of all the affected initial values—by necessity.

michaelosthege added a commit to michaelosthege/pymc that referenced this issue Aug 10, 2021
This changes the initval default on Distribution.__new__ and Distribution.dist to UNSET.
It allows for implementing distribution-specific initial values similar to how it was done in pymc3 <4.
To simplify code that actually picks the initvals, a helper function was added.

Closes pymc-devs#4911 by enabling Model.set_initval to take symbolic initial values.
@fonnesbeck
Member

I think the best way to set initial values is by passing them to sample since initial values are attributes of a given sampling run, and not of the model specification per se. I've always been uncomfortable with the initval keyword argument.

@brandonwillard
Contributor

brandonwillard commented Aug 12, 2021

I think the best way to set initial values is by passing them to sample since initial values are attributes of a given sampling run, and not of the model specification per se. I've always been uncomfortable with the initval keyword argument.

Yeah, that would make sense, but we've needed initval as a replacement for things that were being done with test values in v3. Those things in v3 sometimes relate to the concept of initial values as a starting point for sampling and sometimes don't (e.g. Model.check_test_point).

We would need to forgo some v3 parity in order to make them independent from the Model objects and/or remove initval from the Distribution interface. The majority of our v4 challenges are due to v3 parity demands, and this is one of the smaller examples.

@michaelosthege
Member Author

I think the best way to set initial values is by passing them to sample since initial values are attributes of a given sampling run, and not of the model specification per se. I've always been uncomfortable with the initval keyword argument.

Yes and no. They are attributes of a sampling run, but the creation of RVs is often the most convenient and comprehensible place to set the initial value (strategy).
Not needing handles, names or even knowing about model variables when you "hit the inference button" is a strength of the PyMC3 API.

With #4918 in mind, maybe we should switch to a paradigm of "initval strategy" and stop the eager evaluation in favor of one that happens right at the start of the MCMC (or find_MAP)?
That way we could also re-draw initial values set to None | "random" on a per-chain basis, like discussed in #4752.
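A minimal sketch of that proposed "initval strategy" paradigm, in plain Python with made-up names (Model.initial_values would hold strategies rather than numbers, and resolution would be deferred to the start of sampling so "random" entries can be re-drawn per chain):

```python
# Hypothetical sketch of the "initval strategy" idea: strategies
# (a number, None/"random", or a callable) are stored unevaluated and
# only resolved when the start points for the chains are built.
import random


def resolve_strategy(strategy, rng):
    if strategy is None or strategy == "random":
        return rng.gauss(0.0, 1.0)  # stand-in for a draw from the prior
    if callable(strategy):
        return strategy()
    return strategy  # already numeric


def start_points(strategies, chains, seed=0):
    """Build one start-value dict per chain; "random" strategies are
    re-drawn independently for each chain."""
    rng = random.Random(seed)
    return [
        {name: resolve_strategy(s, rng) for name, s in strategies.items()}
        for _ in range(chains)
    ]


points = start_points({"x": 1.0, "y": "random"}, chains=2)
# "x" is identical across chains, while "y" differs per chain
```

This would make the per-chain re-drawing discussed in #4752 a natural consequence of deferring evaluation.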

@ricardoV94
Member

ricardoV94 commented Aug 12, 2021

While it's unfortunate that we have two ways of specifying start values for sampling, I think passing the dictionary to sample is way too cumbersome and error prone.

Users have to do it for all variables (I think) and they have to be aware of transformed names / variables, which are largely invisible to most users.

We would need to make the start point logic in sample more user friendly, and we would probably end up recreating most of the logic that is now inside the model.

That could be done, but at the cost of more complexity somewhere else (and not less of it). Also there are some nice debugging methods like check_test_value that would also need to be refactored.

@brandonwillard
Contributor

With #4918 in mind, maybe we should switch to a paradigm of "initval strategy" and stop the eager evaluation in favor of one that happens right at the start of the MCMC (or find_MAP)?
That way we could also re-draw initial values set to None | "random" on a per-chain basis, like discussed in #4752.

There is no "eager evaluation"; initial values are set (i.e. either given or sampled, currently) when a random variable is created. How is that "eager evaluation"? If anything, it could perhaps be called an "eager demand", because there's nothing about the new v4 framework that requires an initial value at that point—only a feature/parity-driven demand for one.

In other words, we're talking about when initial values need to be given/sampled/whatever, and that's currently determined by the things we want to support from v3.

@brandonwillard
Contributor

That could be done, but at the cost of of more complexity somewhere else (and not less of it). Also there are some nice debugging methods like check_test_value that would also need to be refactored.

None of the initial value things we've discussed in this issue or the others needs to be complex. We already have a means of setting initial values directly on the more "user-friendly" random variables: Model.set_initval.

With some straightforward changes, this method can be used to set initial values whenever and wherever—and even automatically compute/sample different types of initial values for specific RandomVariable types via something like #4912.
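Per-RandomVariable-type initial values could be sketched with dispatch, roughly in the spirit of what #4912 proposes. The class and function names below are made up for illustration and are not PyMC3's actual API:

```python
# Hypothetical sketch: dispatching on the RV type to pick a sensible
# default initial value (e.g. a positive value for positive-only RVs).
from functools import singledispatch


class NormalRV:
    pass


class HalfFlatRV:
    pass


@singledispatch
def initval_for(rv):
    return 0.0  # generic fallback


@initval_for.register
def _(rv: NormalRV):
    return 0.0  # e.g. the mean


@initval_for.register
def _(rv: HalfFlatRV):
    return 1.0  # any positive value works for a positive-only RV
```

New RV types would then only need to register their own default, without touching Model or Distribution internals.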

For instance, the problem manufactured in #4918 is easily solved by resetting the initial values that were made inconsistent by the data.set_value call—after also correcting the other, unrelated problem caused by reusing a step instance with a different model. That's the existing interface, and it already works.

Otherwise, the dict of initial values produced by the Model.set_initval method could easily serve as the values that would be passed to pm.sample, or used internally, as they currently are. Having all this be somewhat less stateful (i.e. pass around the initial values instead of exclusively storing them in Model) would be nice, but it would require a few more changes throughout the codebase.

N.B. We also have a means of updating subsets of variables in an initial/start value dict: Model.update_start_vals.
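The subset-update behavior described here can be sketched in plain Python (a simplified, hypothetical stand-in, not the actual Model.update_start_vals implementation): the model supplies a complete dict of defaults, and a user-supplied start dict overrides just some of them.

```python
# Hypothetical sketch: merge a partial user start dict over the
# model's complete defaults before handing the result to the sampler.

def merge_start_vals(defaults, start=None):
    """Return a complete start-value dict: `defaults` for every
    variable, with entries from `start` taking precedence."""
    merged = dict(defaults)
    if start is not None:
        unknown = set(start) - set(defaults)
        if unknown:
            raise KeyError(f"start values for unknown variables: {unknown}")
        merged.update(start)
    return merged


defaults = {"x": 0.0, "y_log__": 0.0}  # note the transformed name
print(merge_start_vals(defaults, {"x": 1.5}))  # {'x': 1.5, 'y_log__': 0.0}
```

The transformed name (`y_log__`) in the example hints at why hand-written start dicts are error-prone for users, as noted earlier in the thread.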

@ricardoV94
Member

I like the idea of using model.set_initval or something equivalent that is not necessarily attached to the model object.

@michaelosthege
Member Author

There is no "eager evaluation"; initial values are set (i.e. either given or sampled, currently) when a random variable is created. How is that "eager evaluation"? If anything, it could perhaps be called an "eager demand", because there's nothing about the new v4 framework that requires an initial value at that point—only a feature/parity-driven demand for one.

Please remember: English is not my first language. The word "eager" is the translation of (among others) the German words "ungeduldig" (impatient) and "gierig" (greedy).

In other words, we're talking about when initial values need to be given/sampled/whatever

Exactly. We are talking about the same thing.
If you want us to use a different word for it, that's fine, but there's no need to bash me like that.


Choice of words aside, @brandonwillard what do you think of the idea to store None | Variable | ndarray valued initial values in Model.initial_values and only draw/evaluate them right before they're needed (e.g. for check_test_point, find_MAP or sample)?

(I'll also prepare a PR that extracts that evaluation code into its own method.)

We also have a means of updating subsets of variables in a initial/start value dict: Model.update_start_vals.

A candidate for deprecation?

@brandonwillard
Contributor

brandonwillard commented Aug 12, 2021

Please remember: English is not my first language. The word "eager", is the translation of (among others) the German words "ungeduldig" (impatient) and "gierig" (greedy).

Your choice of the word "eager" isn't any sort of issue, nor is the translation.

The issue is that you're apparently assuming there's some problem due to the fact that initial values are specified and/or used when they currently are, but there isn't an issue with that—relative to the things you referenced.

For instance, your use of "eager" is immediately preceded by statements and references to matters involving unrelated usage errors and how initial values are computed, and that's why it isn't clear that you're actually talking about the same things.
