Adaptive Control - Least-Squares Algorithm in Parameter Estimation
2016-09-08
Parameter estimation is one of the keystones of adaptive control. The main idea is to construct a parametric model and then use optimization methods to minimize the error between the model's predicted output and the measured output, so that the estimated parameters approach the true ones. The least-squares algorithm is one of the most common optimization methods for this.
Background
I am taking an adaptive control course this semester, and in the homework after the second lecture we are asked to derive the formula for the estimated parameter vector $\theta(t)$ at time $t$.
Some terms we need to state the problem:
- Linear parametric model: $y(t)=\theta^{*T}\phi(t)$ (a small numerical sketch of this model follows the list below).
- $\theta^*$ is the true value of the system parameters, which is what we want to reach or approach. However, we only have knowledge of $y(t)$ and $\phi(t)$, so we estimate $\theta^*$ based on what is known ($y$, $\phi$, and so on).
- $y(t)$ is the output of the system, a $1\times1$ scalar, which can be measured and is therefore known.
- $\phi(t)$ is the input or reference of the system, which is also known.
- $\theta(t)$ is the estimated parameter vector at time $t$, which is what we want.
- Cost function $J(\theta(t))$ is what we would like to minimize, so that the error between $\theta^T\phi$ and $y$ is minimized.
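To make these terms concrete, below is a minimal sketch of the parametric model in Python. The two-parameter system and the sinusoidal regressor are my own arbitrary choices for illustration, not part of the course setup.

```python
import numpy as np

# Hypothetical true parameter vector theta* -- unknown to the estimator,
# used here only to generate data.
theta_star = np.array([2.0, -1.0])

def phi(t):
    """Known regressor/input signal phi(t); the sinusoid is an arbitrary choice."""
    return np.array([np.sin(t), np.cos(t)])

def y(t):
    """Measured scalar output y(t) = theta*^T phi(t)."""
    return theta_star @ phi(t)
```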
There are two algorithms that can solve this problem. The first is the gradient (descent) algorithm; the other, which we demonstrate here, is the least-squares algorithm. Essentially, the differences between the two algorithms are the cost functions they employ and the methods used to minimize them.
Problem
So here is our setting:
Given the cost function of the least-squares algorithm,
\[J(\theta(t)) = \frac{1}{2} \int_0^t |\theta^T(t)\phi(\tau)-y(\tau)|^2 d\tau\]
take the partial derivative of $J(\theta(t))$, set it to zero, solve the equation
\[\frac{\partial J(\theta)}{\partial \theta} = 0\]
and derive that
\[\theta(t) = [\int_0^t \phi(\tau)\phi^T(\tau)d\tau]^{-1}\int_0^t y(\tau)\phi(\tau)d\tau\]
Solution
\[\begin{eqnarray} \frac{\partial J(\theta)}{\partial \theta} &=& \frac{1}{2}\frac{\partial}{\partial \theta}\Big\{\int_0^t \big[[\theta^T(t)\phi(\tau)]^2-2\theta^T(t)\phi(\tau)y(\tau)+y^2(\tau)\big]d\tau\Big\}\\ &=& \frac{1}{2}\int_0^t\frac{\partial}{\partial \theta}[\theta^T(t)\phi(\tau)]^2d\tau-\frac{1}{2}\int_0^t\frac{\partial}{\partial \theta}[2\theta^T(t)\phi(\tau)y(\tau)]d\tau \\ &=& \int_0^t\theta^T(t)\phi(\tau)\frac{\partial}{\partial \theta}[\theta^T(t)\phi(\tau)]d\tau - \int_0^t y(\tau)\phi^T(\tau)d\tau \\ &=& \int_0^t\theta^T(t)\phi(\tau)\phi^T(\tau)d\tau - \int_0^t y(\tau)\phi^T(\tau)d\tau=0 \end{eqnarray}\]
(The $y^2(\tau)$ term drops out in the second line because it does not depend on $\theta$, and $\frac{\partial}{\partial\theta}[\theta^T(t)\phi(\tau)]=\phi^T(\tau)$.)
Therefore,
\[\theta^T(t)[\int_0^t \phi(\tau)\phi^T(\tau)d\tau] = \int_0^t y(\tau)\phi^T(\tau)d\tau\]
Transposing both sides,
\[[\int_0^t \phi(\tau)\phi^T(\tau)d\tau]\theta(t) = \int_0^t y(\tau)\phi(\tau)d\tau\]
Finally,
\[\theta(t) = [\int_0^t \phi(\tau)\phi^T(\tau)d\tau]^{-1}\int_0^t y(\tau)\phi(\tau)d\tau\]
(The right-hand side becomes a column vector after transposing: $\int_0^t\phi(\tau)\phi^T(\tau)d\tau$ is unchanged because $\phi(\tau)\phi^T(\tau)=[\phi(\tau)\phi^T(\tau)]^T$ is a symmetric matrix, and $[y(\tau)\phi^T(\tau)]^T=y(\tau)\phi(\tau)$ since $y(\tau)$ is a scalar.)
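As a sanity check of the closed-form result, the two integrals can be approximated by Riemann sums and the resulting estimate compared against the true parameters. This reuses the hypothetical model from the Background section and only illustrates the batch formula; it is not a practical recursive implementation.

```python
import numpy as np

theta_star = np.array([2.0, -1.0])                 # hypothetical true parameters
phi = lambda t: np.array([np.sin(t), np.cos(t)])   # known regressor
y = lambda t: theta_star @ phi(t)                  # measured output

t, dt = 5.0, 1e-3
taus = np.arange(0.0, t, dt)

# Riemann-sum approximations of the two integrals in the formula.
R = sum(np.outer(phi(s), phi(s)) for s in taus) * dt   # ~ integral of phi phi^T
b = sum(y(s) * phi(s) for s in taus) * dt              # ~ integral of y phi

# theta(t) = R^{-1} b; np.linalg.solve avoids forming the inverse explicitly.
theta_hat = np.linalg.solve(R, b)
print(theta_hat)   # approximately [ 2. -1.]
```

Solving the linear system instead of explicitly inverting $\int_0^t\phi\phi^T d\tau$ is the standard numerical practice.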
Revision and Updates
Since the solution above requires $\int_0^t \phi(\tau)\phi^T(\tau)d\tau$ to be invertible, and in scenarios where $t$ is very small this matrix is usually not full rank and hence not invertible, we revise the cost function as
\[J(\theta(t)) = \frac{1}{2} \int_0^t|\theta^T(t)\phi(\tau)-y(\tau)|^2d\tau+\frac{1}{2}[\theta(t)-\theta_0]^TQ_0[\theta(t)-\theta_0]\]
where $\theta_0=\theta(0)$ is the initial estimate of the parameters, and $Q_0=Q_0^T$ is a symmetric, positive definite matrix.
Following the same process as in the solution above, we obtain:
\[\theta(t) = [Q_0+\int_0^t\phi(\tau)\phi^T(\tau)d\tau]^{-1}[Q_0\theta_0+\int_0^t y(\tau)\phi(\tau)d\tau]\]
(Notice that $\frac{\partial x^TAx}{\partial x}=(Ax)^T+x^TA=x^TA^T+x^TA$, where $x$ is a vector and $A$ is a matrix that does not depend on $x$. If $A$ is a symmetric matrix, $A^T=A$, then the result is $2x^TA=2x^TA^T$.)
A positive definite matrix is always invertible, so this formula applies to more situations.
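Below is the earlier sketch adapted to the revised formula, with $Q_0=0.1I$ and $\theta_0=0$ as arbitrary illustrative choices. Note that the estimate is well defined even at $t=0$, where the unregularized matrix would be singular.

```python
import numpy as np

theta_star = np.array([2.0, -1.0])                 # hypothetical true parameters
phi = lambda t: np.array([np.sin(t), np.cos(t)])   # known regressor
y = lambda t: theta_star @ phi(t)                  # measured output

Q0 = 0.1 * np.eye(2)   # symmetric positive definite weight (assumed choice)
theta0 = np.zeros(2)   # initial parameter estimate theta(0)

def theta_hat(t, dt=1e-3):
    """Regularized least-squares estimate at time t via Riemann sums."""
    taus = np.arange(0.0, t, dt)
    A = Q0 + sum(np.outer(phi(s), phi(s)) for s in taus) * dt
    b = Q0 @ theta0 + sum(y(s) * phi(s) for s in taus) * dt
    return np.linalg.solve(A, b)

print(theta_hat(0.0))   # equals theta0: no data yet, but still invertible
print(theta_hat(5.0))   # close to [ 2. -1.], slightly biased toward theta0
```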
Matrix Formulas
- If $U$ and $V$ are vectors, $U^TV=V^TU$ (a scalar), and $\frac{\partial U^TV}{\partial U}=V^T$.
- $[AB]^T=B^TA^T$
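These identities, together with the quadratic-form derivative used above, can be spot-checked numerically; the random test data below is of course arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
U, V = rng.standard_normal(3), rng.standard_normal(3)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 4))
x = rng.standard_normal(3)

assert np.isclose(U @ V, V @ U)            # U^T V = V^T U (both scalars)
assert np.allclose((A @ B).T, B.T @ A.T)   # [AB]^T = B^T A^T

# d(x^T A x)/dx = x^T (A^T + A), checked with forward finite differences.
eps = 1e-6
num_grad = np.array([((x + eps * e) @ A @ (x + eps * e) - x @ A @ x) / eps
                     for e in np.eye(3)])
assert np.allclose(num_grad, x @ (A.T + A), atol=1e-4)
```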