Kronecker GP #2731
Conversation
pymc3/distributions/multivariate.py
Outdated
__all__ = ['MvNormal', 'MvStudentT', 'Dirichlet',
           'Multinomial', 'Wishart', 'WishartBartlett',
-          'LKJCorr', 'LKJCholeskyCov', 'MatrixNormal']
+          'LKJCorr', 'LKJCholeskyCov', 'MatrixNormal', 'KroneckerNormal']
new-line
pymc3/distributions/dist_math.py
Outdated
@@ -346,3 +349,134 @@ def grad(self, inputs, grads):
    x_grad, = grads
    return [x_grad * self.grad_op(x)]


class Eigh(tt.nlinalg.Eig):
Is this class here to fix the `njobs` issue mentioned on the discourse thread?
Yes, it seems to be working.
Is there a theano PR for this? Would be good to link to it.
I just started one here.
pymc3/gp/gp.py
Outdated
Returns the marginal likelihood distribution, given the input
locations `X` and the data `y`.
"""
# if not isinstance(noise, Covariance):
Might be nice to include some additional basic argument checking here, like whether the Xs are in a list or tuple, and that the length is the same as the list of `cov_funcs`.
Does list vs tuple matter? And yes, checking lengths is a good idea.
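For concreteness, a minimal sketch of the kind of checks being discussed (a standalone function for illustration; in the PR these would live inside `marginal_likelihood`, and the message wording is made up):

def check_inputs(Xs, cov_funcs):
    # Xs: one input array per grid dimension; cov_funcs: one kernel each
    if not isinstance(Xs, (list, tuple)):
        raise TypeError('Xs must be a list or tuple of input arrays')
    if len(Xs) != len(cov_funcs):
        raise ValueError('Xs and cov_funcs must have the same length: '
                         'got %d vs %d' % (len(Xs), len(cov_funcs)))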
pymc3/gp/gp.py
Outdated
    return pm.KroneckerNormal(name, mu=mu, covs=covs, noise=noise,
                              observed=y, **kwargs)
else:
    # shape = infer_shape(Xs, kwargs.pop("shape", None))
may need to have shape be a list of shapes for each X?
I think the product of the shapes would do, what do you think?
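That is, the flattened variable's length would be the product of the per-dimension input sizes; a quick illustration (array sizes are made up):

import numpy as np

Xs = [np.zeros((20, 1)), np.zeros((30, 1))]     # per-dimension inputs
shape = int(np.prod([X.shape[0] for X in Xs]))  # 600 = 20 * 30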
pymc3/gp/gp.py
Outdated
mean_total = self.mean_func
if all(val in given for val in ['X', 'y', 'noise']):
    X, y, noise = given['X'], given['y'], given['noise']
    # if not isinstance(noise, Covariance):
Since `noise` for the Kronecker GP marginal model with the eigenvalue decomp is only the white noise term on the whole diagonal, it should probably follow the same syntax as the `gp.*Sparse` implementations.
Sure, I can change it to sigma to remain consistent.
Should I use `sigma` in `KroneckerNormal` as well?
return np.stack(np.meshgrid(*arrays, indexing='ij'), -1).reshape(-1, N)


def kron_matrix_op(krons, m, op):
I think this is really nice
Thanks! Getting it working in Theano was much trickier than in NumPy. If there are more efficient ways than `scan` and the nested functions, that would be great to see.
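For readers following the thread, this is the idea behind the helper in plain NumPy, assuming square factors (a sketch of the standard Kronecker matrix-vector algorithm, not the PR's Theano code; the function name is illustrative):

import numpy as np

def kron_mvprod(As, b):
    # Computes (A_1 kron A_2 kron ... kron A_D) @ b without ever
    # forming the full Kronecker product.
    x = np.asarray(b, dtype=float).ravel()
    for A in reversed(As):                 # right-most factor acts first
        G = A.shape[1]
        X = x.reshape(G, -1, order='F')    # fold out the axis A acts on
        x = (A @ X).T.ravel(order='F')     # apply A, then rotate axes
    return x

A, B = np.random.randn(3, 3), np.random.randn(4, 4)
v = np.random.randn(12)
assert np.allclose(kron_mvprod([A, B], v), np.kron(A, B) @ v)

The per-factor reshape is what keeps the cost at roughly the number of grid points times the sum of the factor sizes, rather than quadratic in the total grid size.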
I profiled `model.logp` and the gradient, and the scan here is the slowest part (~50% of computation time). I'm ok with saving this optimization for another PR though.
Oh thanks for looking into that! Would certainly like to check that out in some future PR.
Here's where I was looking at that, down towards the bottom. You can run the Theano profiler on PyMC3 code; you can see that scan at the top.
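For anyone who wants to reproduce this kind of measurement, a minimal sketch using PyMC3's built-in profiling hook (the toy model is a stand-in for the Kronecker GP):

import numpy as np
import pymc3 as pm

# any model works the same way; a small stand-in with one free RV
with pm.Model() as model:
    x = pm.Normal('x', mu=0, sd=1, shape=100)
    pm.Normal('y', mu=x, sd=1, observed=np.random.randn(100))

prof = model.profile(model.logpt, n=1000)  # compile logp with profiling on
prof.summary()                             # per-op timings, slowest first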
Hi, I am trying to use KroneckerNormal directly (not through the GP interface), but unfortunately it seems prohibitively slow for my application (though it could be something I'm doing wrong; I'm not a Theano expert by any means). My profiling also implicated scan for ~50% of the computation time, but overall it was much slower than @bwengals' example (I think ~267 seconds for 1000 evaluations of the likelihood). I tried it with data that is 1000 by 2, flattened; I define one covariance matrix (1000 by 1000) using a squared exponential covariance function that operates on a 5-dimensional input, and the second (2 by 2) is unconstrained and given an LKJChol prior. I'm trying to do MAP estimation and it takes a very long time.

I was doing it through MatrixNormal before and it was pretty fast, but I wanted to be able to put in additive noise (and was happy to find this is implemented here!). Unfortunately I can't share my code right now, but I could concoct a toy example later to share if that is helpful. Thanks for any input you all might have; I'm simply curious whether this is inherent to the current implementation due to Theano, whether I might be doing something wrong, or whether the nature of what I'm doing is sufficiently different from the single-input GP examples that it's slowing things down.
@natalieklein the easiest way to get some support on this is probably to post it on our discourse site along with a complete example that demonstrates the issue.
Thanks! If anyone is interested, here's the discourse link: https://discourse.pymc.io/t/kroneckernormal-speed/1497
This is awesome! And everything looks really good. Have you used this in your research yet? How is it?
pymc3/distributions/multivariate.py
Outdated
# elif isinstance(shape, tuple):
#     shape = *shape, self.N.eval()
# else:
#     shape = self.N.eval()
Can all the above be deleted?
Yeah, I'll get rid of it.
I haven't used this in research yet; I'm still working out tests and bugs. Right now
----------
Ks: 2D array-like
"""
return reduce(tt.slinalg.kron, Ks)
Isn't `reduce` only available in Python 3?
It looks like it's available in Python 2 as well.
Does it return a list or a generator in Python 3?
Neither. It returns a Theano tensor object that is the result of the nested evaluations `kron(kron(...kron(kron(Ks[0], Ks[1]), Ks[2]), ...), Ks[d])`, that is, the Kronecker product of all the arrays in order.
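The same folding pattern in plain NumPy, for anyone unfamiliar with `reduce` (`np_kronecker` is an illustrative name, not the PR's helper):

from functools import reduce  # works in both Python 2 (2.6+) and Python 3

import numpy as np

def np_kronecker(Ks):
    # Kronecker product of all matrices in Ks, folded left to right
    return reduce(np.kron, Ks)

A, B, C = np.eye(2), np.arange(4.).reshape(2, 2), np.ones((3, 3))
print(np_kronecker([A, B, C]).shape)  # (12, 12): factor sizes multiply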
pymc3/gp/gp.py
Outdated
@@ -793,7 +794,7 @@ def conditional(self, name, Xnew, pred_noise=False, given=None, **kwargs):
        return pm.MvNormal(name, mu=mu, chol=chol, shape=shape, **kwargs)


-@conditioned_vars(["Xs", "y", "noise"])
+@conditioned_vars(["Xs", "y", "sigma"])
 class MarginalKron:
Any reason you didn't subclass the base class (even though it doesn't do much)?
Only because at one point I had not implemented a `cov_func`, but instead only had a list of `cov_funcs`. Now that I've made one by multiplying the separate ones together, I guess there wouldn't be any harm in subclassing.
Sounds good to me
pymc3/gp/gp.py
Outdated
else:
    Asq = tt.dot(A.T, A)
    cov = Km - Asq
if pred_noise:
-   cov += noise*np.eye(cov.shape)
+   cov += sigma * np.eye(cov.shape)
return mu, cov

def conditional(self, name, Xnew, pred_noise=False, given=None, **kwargs):
What do you think of adding a `predict` method, similar to the `Marginal` class? The predict method is there as a convenience when using the MAP point to do predictions. It should be pretty trivial to add since `conditional` is done.
Yes, it's on its way!
nice!
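For reference, such a method might look roughly like this, modeled on `Marginal.predict`; it assumes a `_build_conditional(Xnew, diag, pred_noise)` helper as mentioned elsewhere in the thread, so it's a sketch rather than the final code:

from pymc3.distributions.distribution import draw_values

def predict(self, Xnew, point=None, diag=False, pred_noise=False):
    # evaluate the conditional mean and covariance at a fixed point
    # (e.g. the MAP estimate) instead of adding a random variable
    mu, cov = self._build_conditional(Xnew, diag, pred_noise)
    return draw_values([mu, cov], point=point)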
pymc3/gp/gp.py
Outdated
# New points
-Km = self.total_cov_func(Xnew, diag)
-Knm = self.total_cov_func(cartesian(*Xs), Xnew)
+Km = cov_total(Xnew, diag)
`cov_total` is here to support making additive models, so that a component GP can be used in isolation to make predictions (described here). It may not be straightforward to support this for Kronecker GPs, in which case it would be good to implement an `__add__` method that raises an error, like in the `TP` class.
You're right, the sum of Kronecker-structured covariance matrices does not in general have a Kronecker structure, so this class wouldn't be closed under addition. In fact, the `conditional` method doesn't yield a `KroneckerNormal` either, so technically that could be inherited from `Marginal` if we didn't care about the speedups from `_build_conditional`.
Good call. I've seen this structure once in a paper, but it's definitely out of the scope of this PR.
That's an interesting paper! It's too bad that the speedups only hold for the sum of two terms.
This is a huge contribution. Let us know when you're ready for review!
I can't think of anything else to add at the moment, so review away!
There are some test failures.
@@ -1292,3 +1295,221 @@ def logp(self, value):
    n = self.n
    norm = - 0.5 * m * n * pm.floatX(np.log(2 * np.pi))
    return norm - 0.5*trquaddist - m*half_collogdet - n*half_rowlogdet


class KroneckerNormal(Continuous):
This supersedes MatrixNormal, right? If so, it might be good to change MatrixNormal to just be an alias for KroneckerNormal, to reduce code duplication. If it happens it should probably be a separate PR.
It does in a sense. MatrixNormal RVs are matrices, whereas KroneckerNormal is flattened as in MvNormal. I'm open to reusing KroneckerNormal code but either reshaping the output or changing the way MatrixNormal works. Depends what you all think is best.
Ah yes that is true. I'm perfectly OK with leaving things as is.
    index2 = index.T
else:
    index2 = tt.cast(Xs, 'int32').T
return self.B[index, index2]
Requires that `X` has `input_dim=1`? If so, should specify in the docstring.
Yeah, or at least that `len(active_dims) == 1`. I added this to the notes in the docstring. Thanks!
pymc3/gp/gp.py
Outdated
super(MarginalKron, self).__init__(mean_func, cov_func)

def __add__(self, other):
    raise TypeError("Kronecker-structured processes aren't additive")
Maybe something like "Efficient implementation of additive, Kronecker-structured processes not implemented". It's a bit pedantic... you can do additive modeling with them in pymc3, but without exploiting Kronecker properties.
)


def pymc3_random(dist, paramdomains, ref_rand, valuedomain=Domain([0]),
-                size=10000, alpha=0.05, fails=10, extra_args=None):
+                size=10000, alpha=0.05, fails=10, extra_args=None,
+                model_args={}):
Having `model_args={}` is making Travis unhappy, because of this: https://stackoverflow.com/questions/1367883/python-methods-default-parameter-values-are-evaluated-once. Better to do `model_args=None` and give it an empty dict if it is None.
Fixing occurrences of this will get rid of the little (!) that shows up next to the Travis builds.
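To spell out the suggested idiom (a self-contained toy, not the test-suite code):

# With a None sentinel, each call gets a fresh dict instead of sharing
# one dict created once at function-definition time.
def f(x, model_args=None):
    if model_args is None:
        model_args = {}
    model_args.setdefault('seen', []).append(x)
    return model_args

print(f(1))  # {'seen': [1]}
print(f(2))  # {'seen': [2]}, no state leaks between calls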
pymc3/tests/test_gp.py
Outdated
X = cartesian(X1, X21, X22)
M = np.array([[1, 2, 3], [2, 1, 2], [3, 2, 1]])
with pm.Model() as model:
    # cov1 = 3 + pm.gp.cov.ExpQuad(1, 0.1) + M * pm.gp.cov.ExpQuad(1, 0.1) * M * pm.gp.cov.ExpQuad(1, 0.1)
Should remove this and `M` if unused.
pymc3/gp/gp.py
Outdated
    fcond = gp.conditional("fcond", Xnew=Xnew)
"""

def __init__(self, mean_func=Zero(), cov_funcs=[Constant(0.0)]):
I think you can change the default to `(Constant(0.0))` to get rid of potential issues with mutable defaults.
pymc3/tests/test_distributions.py
Outdated
class TestMatchesScipy(SeededTest):
    def pymc3_matches_scipy(self, pymc3_dist, domain, paramdomains, scipy_dist,
-                           decimal=None, extra_args=None):
+                           decimal=None, extra_args=None, scipy_args={}):
occurrence of mutable default
I don't know what the error in some of the Travis builds is about; it doesn't look related to your PR, and it isn't occurring for me locally or in other branches. I think it is OK to just up the tolerance; changing this line should do the trick. The other Travis errors look like they are the linter complaining about mutable default arguments. I left a comment where these are in your code.
Tests are passing and coverage looks great! Do you happen to have a runnable example of
Force-pushed from 673f10b to 8dc9b36.
Rebase needed...
If it's OK with the other devs, I'd recommend just opening a new PR from the current master with your changes and a long commit message, and we can merge that. But my git-fu is like 3/10.
Sorry everyone... this all stemmed from rebasing my own commits into a more coherent fashion, which did seem to work but affected other commits in the process. I could rebase again and remove all commits that are not mine, if that would fix the problem. I'm just wary of doing anything for fear of breaking more stuff.
Would you like help making a clean branch? I would literally clone a new version of pymc3 somewhere else on the file system, create a new branch, then
Force-pushed from 8dc9b36 to ece6ee8.
Looks like we're back in business 😎
@junpenglao Yes, I think merging this first is the best approach. Shouldn't be much trouble applying the changes to this.
It would be nice to see a short example of usage in one of the GP notebooks.
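Something along these lines, perhaps; a minimal sketch using the API added in this PR, where the grid sizes, lengthscales, and data are placeholders:

import numpy as np
import pymc3 as pm

# the full input set is the Cartesian product of two 1-D grids, so the
# covariance over all 20 * 30 = 600 points is a Kronecker product
X1 = np.linspace(0, 1, 20)[:, None]
X2 = np.linspace(0, 2, 30)[:, None]
y = np.random.randn(600)  # stand-in observations on the grid

with pm.Model() as model:
    cov1 = pm.gp.cov.ExpQuad(1, 0.1)  # kernel for the first dimension
    cov2 = pm.gp.cov.ExpQuad(1, 0.3)  # kernel for the second
    gp = pm.gp.MarginalKron(cov_funcs=[cov1, cov2])
    sigma = pm.HalfNormal('sigma', sd=1)
    y_ = gp.marginal_likelihood('y', Xs=[X1, X2], y=y, sigma=sigma)
    mp = pm.find_MAP()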
Thanks @jordan-melendez!
Yes, thanks @jordan-melendez, this is awesome!
Nice work @jordan-melendez!
Added the required classes for building a GP that takes advantage of a Kronecker-structured covariance matrix. See this discussion. This PR includes
Tests have not been implemented yet, but are coming.