Thursday, October 23, 2014
Derivatives of Matrix to Vector/Scalar
$$y=Hx+n$$
$$y=\begin{bmatrix}y_1\\y_2\\\vdots\\y_N\end{bmatrix}=\begin{bmatrix}h_{11}&h_{12}&\cdots&h_{1N}\\h_{21}&h_{22}&\cdots&h_{2N}\\\vdots&\vdots&\ddots&\vdots\\h_{N1}&h_{N2}&\cdots&h_{NN}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_N\end{bmatrix}+\begin{bmatrix}n_1\\n_2\\\vdots\\n_N\end{bmatrix}$$
$$\min_x\|y-Hx\|^2$$
$$E=\|y-Hx\|^2=(y-Hx)^T(y-Hx)=y^Ty-y^THx-x^TH^Ty+x^TH^THx$$
$$\frac{dE}{dx}=-y^TH-y^TH+2x^TH^TH=0$$
$$x^TH^TH=y^TH$$
$$(H^TH)x=H^Ty$$
$$x=(H^TH)^{-1}H^Ty$$
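The closed-form solution can be checked numerically. The sketch below is only an illustration: the 3×2 matrix H and the vector y are made-up values, the helper names (`matmul`, `transpose`, `solve_normal_equations`, `gradient`) are hypothetical, and the solver assumes H has exactly two columns and full column rank so that H^T H is an invertible 2×2 matrix. It uses plain Python so it runs without extra libraries.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def solve_normal_equations(H, y):
    """Solve (H^T H) x = H^T y for a matrix H with exactly two columns."""
    Ht = transpose(H)
    G = matmul(Ht, H)  # 2x2 matrix H^T H
    b = matmul(Ht, y)  # 2x1 vector H^T y
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if det == 0:
        raise ValueError("H^T H is singular; H needs full column rank")
    # Cramer's rule for the 2x2 system
    x1 = (b[0][0] * G[1][1] - G[0][1] * b[1][0]) / det
    x2 = (G[0][0] * b[1][0] - G[1][0] * b[0][0]) / det
    return [[x1], [x2]]


def gradient(H, y, x):
    """Row vector dE/dx = -2 y^T H + 2 x^T H^T H."""
    yTH = matmul(transpose(y), H)
    xTHTH = matmul(matmul(transpose(x), transpose(H)), H)
    return [2 * (q - p) for p, q in zip(yTH[0], xTHTH[0])]


# Illustrative data (assumed, not from the post): three observations, two unknowns
H = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [[1.0], [2.0], [2.0]]

x_hat = solve_normal_equations(H, y)
grad = gradient(H, y, x_hat)
print("x_hat =", x_hat)
print("gradient at x_hat =", grad)
```

For this data H^T H = [[3, 3], [3, 5]] and H^T y = [5, 6], so the solution is x = (7/6, 1/2), and the gradient evaluated there vanishes up to rounding error.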
This classical least-squares problem and its solution appear throughout mathematics-related fields.
To understand the solution in depth, we need the following definitions and identities:
1. Derivative of a matrix w.r.t. a scalar
Suppose that Y is an N×M matrix and x is a scalar variable.
$$\partial{Y}/\partial{x}=\begin{bmatrix}\partial{y_{11}}/\partial{x}&\partial{y_{12}}/\partial{x}&\cdots&\partial{y_{1M}}/\partial{x}\\\partial{y_{21}}/\partial{x}&\partial{y_{22}}/\partial{x}&\cdots&\partial{y_{2M}}/\partial{x}\\\vdots&\vdots&\ddots&\vdots\\\partial{y_{N1}}/\partial{x}&\partial{y_{N2}}/\partial{x}&\cdots&\partial{y_{NM}}/\partial{x}\end{bmatrix}$$
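As a small worked example (the entries below are chosen here purely for illustration), differentiating a 2×2 matrix entry by entry gives:
$$Y=\begin{bmatrix}x&x^2\\\sin x&1\end{bmatrix},\qquad\partial{Y}/\partial{x}=\begin{bmatrix}1&2x\\\cos x&0\end{bmatrix}$$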
2. Derivative of a vector w.r.t. a scalar
Suppose that Y is an N×1 vector and x is a scalar variable.
$$\partial{Y}/\partial{x}=\begin{bmatrix}\partial{y_{1}}/\partial{x}\\\partial{y_{2}}/\partial{x}\\\vdots\\\partial{y_{N}}/\partial{x}\end{bmatrix}$$
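3. Derivative of a scalar w.r.t. a matrix
Suppose that Y is a K×N matrix and x is a scalar function of its entries. In the layout used here the result is N×K, i.e. it has the shape of the transpose of Y.
$$\partial{x}/\partial{Y}=\begin{bmatrix}\partial{x}/\partial{y_{11}}&\partial{x}/\partial{y_{21}}&\cdots&\partial{x}/\partial{y_{K1}}\\\partial{x}/\partial{y_{12}}&\partial{x}/\partial{y_{22}}&\cdots&\partial{x}/\partial{y_{K2}}\\\vdots&\vdots&\ddots&\vdots\\\partial{x}/\partial{y_{1N}}&\partial{x}/\partial{y_{2N}}&\cdots&\partial{x}/\partial{y_{KN}}\end{bmatrix}$$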
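4. Derivative of a scalar w.r.t. a vector
Suppose that Y is a K×1 vector and x is a scalar function of its entries. The result is a 1×K row vector.
$$\partial{x}/\partial{Y}=\begin{bmatrix}\partial{x}/\partial{y_{1}}&\partial{x}/\partial{y_{2}}&\cdots&\partial{x}/\partial{y_{K}}\end{bmatrix}$$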
5. Derivative of a vector w.r.t. a vector
Suppose that Y and X are both N×1 vectors.
$$\partial{Y}/\partial{X}=\begin{bmatrix}\partial{y_1}/\partial{x_1}&\partial{y_1}/\partial{x_2}&\cdots&\partial{y_1}/\partial{x_N}\\\partial{y_2}/\partial{x_1}&\partial{y_2}/\partial{x_2}&\cdots&\partial{y_2}/\partial{x_N}\\\vdots&\vdots&\ddots&\vdots\\\partial{y_N}/\partial{x_1}&\partial{y_N}/\partial{x_2}&\cdots&\partial{y_N}/\partial{x_N}\end{bmatrix}$$
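As a small worked example (the two functions are chosen here for illustration only), take N = 2 with y_1 = x_1^2 + x_2 and y_2 = 3 x_1 x_2. Filling in the matrix row by row gives:
$$\partial{Y}/\partial{X}=\begin{bmatrix}\partial{y_1}/\partial{x_1}&\partial{y_1}/\partial{x_2}\\\partial{y_2}/\partial{x_1}&\partial{y_2}/\partial{x_2}\end{bmatrix}=\begin{bmatrix}2x_1&1\\3x_2&3x_1\end{bmatrix}$$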
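6. Derivative of a matrix w.r.t. a vector
Suppose that X is a matrix and Y is a K×1 vector. Each block is the derivative of the matrix X w.r.t. the scalar y_k, as defined in item 1.
$$\partial{X}/\partial{Y}=\begin{bmatrix}\partial{X}/\partial{y_{1}}&\partial{X}/\partial{y_{2}}&\cdots&\partial{X}/\partial{y_{K}}\end{bmatrix}$$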
7. Properties (for a real vector x and a constant matrix A):
$$\partial{(x^TA)}/\partial{x}=A^T$$
$$\partial{(Ax)}/\partial{x}=A$$
$$\partial{(x^TAx)}/\partial{x}=x^T(A+A^T)=2x^TA\quad\text{when }A=A^T$$
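The quadratic-form identity is easy to test with finite differences. For a general real matrix A the gradient of x^T A x is x^T(A + A^T), which reduces to 2x^T A only when A is symmetric (as H^T H is in the least-squares problem). The sketch below is an illustration under assumptions: the matrix A and vector x are made-up values, and the function names are hypothetical helpers, not part of any library.

```python
def quad_form(A, x):
    """Return the scalar x^T A x for a square matrix A and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def numerical_gradient(f, x, eps=1e-6):
    """Central-difference estimate of the row vector [df/dx_1, ..., df/dx_n]."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += eps
        x_minus[k] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad


def analytic_gradient(A, x):
    """Row vector x^T (A + A^T); entry j is sum_i x_i (A_ij + A_ji)."""
    n = len(x)
    return [sum(x[i] * (A[i][j] + A[j][i]) for i in range(n)) for j in range(n)]


# Illustrative values (assumed): A is deliberately non-symmetric
A = [[2.0, 1.0], [-3.0, 4.0]]
x = [1.0, 2.0]

num = numerical_gradient(lambda v: quad_form(A, v), x)
ana = analytic_gradient(A, x)
two_xTA = [2 * sum(x[i] * A[i][j] for i in range(2)) for j in range(2)]

print("finite differences:", num)
print("x^T (A + A^T):     ", ana)
print("2 x^T A:           ", two_xTA)
```

Here the finite-difference estimate agrees with x^T(A + A^T) = [0, 14], while 2x^T A = [-8, 18] differs because this A is not symmetric.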