Skip to content

Negative covariance matrix in Unscented Kalman filter #377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Hami1369 opened this issue May 2, 2021 · 4 comments
Open

Negative covariance matrix in Unscented Kalman filter #377

Hami1369 opened this issue May 2, 2021 · 4 comments

Comments

@Hami1369
Copy link

Hami1369 commented May 2, 2021

I want to implement the Unscented Kalman filter(UKF) method for the nonlinear problem; I set all initial values such as initial mean vector and initial covariance matrix. The dimension of the problem is 8. Below is the code of implementation; I do not know why the new covariance matrix at the end of the code includes some negative parameters. These negative parameters make problems for the next iteration of the approach.

#pip install filterpy
import numpy as np
from filterpy.kalman import unscented_transform, MerweScaledSigmaPoints

#initialization 
mean_zero = np.array([ 27.25,14.39, 4.459, 27.65 ,  6.27 , 23.653,  1.2  ,  0.21 ])
P_zero = np.diagflat((0.05*mean_zero)**2)
# initial parameters
T_2 =10
T_3 = 8
dt = 1
Q_1 = 15
Q_2 = 25
T_1 = 12

#sigma points parameters
alpha = 0.1
beta = 2
kappa = 0
n = 8

#create sigma points and weights
points = MerweScaledSigmaPoints(n=8, alpha=.1, beta=2., kappa=0)
sigmas = points.sigma_points(mean_zero, P_zero)

#sigma points weights
w_lambda = alpha**2 * (n + kappa) - n
Wc = np.full(2*n+1 , 0.5/(n + w_lambda))
Wm = np.full(2*n+1 , 0.5/(n + w_lambda))
Wc[0] = w_lambda / (n + w_lambda) + (1 - alpha**2 + beta)
Wm[0] = w_lambda / (n + w_lambda)

#process noise
Q = np.diagflat(((1*10**-5)*mean_zero)**2)
Q_n = np.random.multivariate_normal(np.zeros(8),Q,8)

#mesurement noise
R = (9*10**-2)*np.eye(2)
R_n = np.random.multivariate_normal(np.zeros(2),R,2)

#nonlinear_state_function
def f_nonlinear_state(T_2,T_3,R_21,R_32,C_2,C_3,w_1,w_2):
    T_2 = T_2 + dt*(-1*(T_2/C_2)*(1/R_21 + 1/R_32) + T_1/R_21*C_2 + T_3/R_32*C_2)
    T_3 = T_3+ dt*(-1*(T_3/C_3)*(1/R_32)+T_2/R_32*C_3 +w_1*Q_1/C_3 + w_2*Q_2/C_3)
    R_21 = R_21
    R_32 = R_32
    C_2 = C_2
    C_3 = C_3
    w_1 = w_1
    w_2 = w_2
    z = np.array([T_2 , T_3,R_21 , R_32,C_2 ,C_3 ,w_1,w_2])
    return z

#passing sigma points through nonlinear_state_function
sigmas_f = np.empty((17, 8))
for i in range(17):
    sigmas_f[i] = f_nonlinear_state(sigmas[i, 0], sigmas[i ,1],sigmas[i, 2],sigmas[i, 3],sigmas[i, 4],sigmas[i, 5],sigmas[i, 6],sigmas[i, 7])

ukf_f_mean, ukf_f_cov = unscented_transform(sigmas_f, points.Wm, points.Wc)

#nonlinear mesurement function
def h_mesurement(T_2,T_3):
    T_2 = T_2
    T_3 = T_3
    y = np.array([T_2,T_3])
    return y
#passing sigmas_f through mesurement function 
sigmas_h = np.empty((17, 2))
for i in range(17):
    sigmas_h[i] = h_mesurement(sigmas_f[i, 0], sigmas_f[i ,1])

ukf_h_mean, ukf_h_cov = unscented_transform(sigmas_h, points.Wm, points.Wc)

#cross covarinace
Pfh = np.zeros((8, 2))
for i in range(17):
    Pfh += Wc[i] * np.outer(sigmas_f[i] - ukf_f_mean,sigmas_h[i] - ukf_h_mean)
#Kalman gain
K = np.dot(Pfh, np.linalg.inv(ukf_h_cov)) 

#True value of the estimate 
h = np.array([47.39642954, 55.42371109]) 

#New mean and New covariance 

New_mean = ukf_f_mean + np.dot(K , h-ukf_h_mean ) 
New_covarince = ukf_f_cov - np.dot(K,ukf_h_cov).dot(K.T)
New_covarince
@rlabbe
Copy link
Owner

rlabbe commented May 2, 2021

This code is very unreadable; if you have 4 spaces before each line github will transform it into valid python.

I don't understand your problem construction. You have a state vector with 8 components, but only compute the first two. So the unscented transform is creating sigma points around R21, R32, etc, but your function doesn't compute new values for them. It's not surprising that the code finds negative correlations between the various values.

I can't be sure, I had to guess at the meaning of the code giving the bad formatting and numerous unused variables, but that is where I would look. These are also subject to numerical instability in general, especially when your initial covariance is vastly different from the first computed covariance. It never hurts to compute P = 0.5*([email protected]) to make it positive semi-definite again.

@Hami1369
Copy link
Author

Hami1369 commented May 3, 2021

Thank you very much rlabbe for your comment, I have edited the code for becoming into valid python.

Regarding the computing of 8 components, the nonlinear function just computes T_2, T_3 and other parameters do not change in the nonlinear function.
the nonlinear functions are nonlinear continuous-time state equation where an additive process noise is assumed
image
image

After Transforming to discrete time I have:

image
image
which
X = T_2 , T_3
p = R_21,R_32,C_2,C_3,w_1,w_2

the state function comes from
image

as you can see just T_2 and T_3 have the nonlinear equations .

and the measurement function equlas to
h = T_2 , T_3

@Hami1369
Copy link
Author

Hami1369 commented May 3, 2021

I have another approach for creating nonlinear functions, describing below :
#pip install filterpy
import numpy as np
from filterpy.kalman import unscented_transform, MerweScaledSigmaPoints

#initialization 
mean_zero = np.array([ 4.459, 27.65 ,  6.27 , 23.653,  1.2  ,  0.21 ])
P_zero = np.diagflat((0.05*mean_zero)**2)
# initial parameters
T_2 =10
T_3 = 8
dt = 1
Q_1 = 15
Q_2 = 25
T_1 = 12

#sigma points parameters
alpha = 0.1
beta = 2
kappa = 0
n = 8

#create sigma points and weights
points = MerweScaledSigmaPoints(n=6, alpha=.1, beta=2., kappa=0)
sigmas = points.sigma_points(mean_zero, P_zero)

#sigma points weights
w_lambda = alpha**2 * (n + kappa) - n
Wc = np.full(2*n+1 , 0.5/(n + w_lambda))
Wm = np.full(2*n+1 , 0.5/(n + w_lambda))
Wc[0] = w_lambda / (n + w_lambda) + (1 - alpha**2 + beta)
Wm[0] = w_lambda / (n + w_lambda)

#process noise
Q = np.diagflat(((1*10**-5)*mean_zero)**2)
Q_n = np.random.multivariate_normal(np.zeros(6),Q,6)

#mesurement noise
R = (9*10**-2)*np.eye(2)
R_n = np.random.multivariate_normal(np.zeros(2),R,2)

def f_nonlinear_state(R_21,R_32,C_2,C_3,w_1,w_2):
    R_21 = R_21
    R_32 = R_32
    C_2 = C_2
    C_3 = C_3
    w_1 = w_1
    w_2 = w_2
    z = np.array([R_21 , R_32,C_2 ,C_3 ,w_1,w_2])
    return z

#passing sigma points through nonlinear_state_function
sigmas_f = np.empty((13, 6))
for i in range(13):
    sigmas_f[i] = f_nonlinear_state(sigmas[i, 0], sigmas[i ,1],sigmas[i, 2],sigmas[i, 3],sigmas[i, 4],sigmas[i, 5])

ukf_f_mean, ukf_f_cov = unscented_transform(sigmas_f, points.Wm, points.Wc)

#nonlinear mesurement function
def h_mesurement(T_2,T_3,R_21,R_32,C_2,C_3,w_1,w_2):
    T_2 = T_2 + dt*(-1*(T_2/C_2)*(1/R_21 + 1/R_32) + T_1/R_21*C_2 + T_3/R_32*C_2)
    T_3 = T_3+ dt*(-1*(T_3/C_3)*(1/R_32)+T_2/R_32*C_3 +w_1*Q_1/C_3 + w_2*Q_2/C_3)
    y = np.array([T_2,T_3])
    return y
#passing sigmas_f through mesurement function 
sigmas_h = np.empty((13, 2))
for i in range(13):
    sigmas_h[i] = h_mesurement(T_2,T_3,sigmas_f[i, 0], sigmas_f[i ,1],sigmas_f[i, 2],sigmas_f[i, 3],sigmas_f[i, 4],sigmas_f[i, 5])

ukf_h_mean, ukf_h_cov = unscented_transform(sigmas_h, points.Wm, points.Wc)

#cross covarinace
Pfh = np.zeros((6, 2))
for i in range(13):
    Pfh += Wc[i] * np.outer(sigmas_f[i] - ukf_f_mean,sigmas_h[i] - ukf_h_mean)
#Kalman gain
K = np.dot(Pfh, np.linalg.inv(ukf_h_cov)) 

#True value of the estimate 
h = np.array([47.39642954, 55.42371109]) 

#New mean and New covariance 

New_mean = ukf_f_mean + np.dot(K , h-ukf_h_mean ) 
New_covarince = ukf_f_cov - np.dot(K,ukf_h_cov).dot(K.T)
New_covarince

below are the assumption of nonlinear functions :
image
image
U = T_1 , Q_1, Q_2 which are known input parameters

@rlabbe
Copy link
Owner

rlabbe commented May 4, 2021

Hi, I don't understand the motivation for the second set of code. The KF is not estimating any parameters, hence K is meaningless, and you certainly can't apply it to parameters you are not estimating.

I've only used the UKF for small dimensional problems. This post suggests that the UKF can have non-positive semidefinite covariances when the dimension is > 3. https://math.stackexchange.com/questions/796331/scaling-factor-and-weights-in-unscented-transform-ukf

I'll reiterate my concern of you having 6 coefficients in the state vector which do not vary. Your covariance is negative when computing the covariance between these values. If they don't vary in the real world then they probably shouldn't be in the state vector; if they do try adding a model for how they vary. I don't think that will change the issue with high dimensionality; I'm just addressing the KF design.

Wan and van der Merwe's paper suggest evaluating the covariance over to ensure positive semi-definite results. I tried it but it didn't seem to work. It is possible I misunderstood what they meant and/or implemented it incorrectly.

I think if you make your state [T2, T3] and treat the rest as constants the UKF will work as you expect.

kierandwn pushed a commit to kierandwn/Kalman-and-Bayesian-Filters-in-Python that referenced this issue May 9, 2021
Somehow this didn't get checked in for commit
91f8010. Remove
dt=dt from plot_track calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants