After getting the mean of the projected data, we can calculate the variance of the projected data:

```math
\frac{1}{N} \sum_{n=1}^{N} (u_{1}^Tx_{n} - u_{1}^T\overline{x})^2 = u_{1}^TSu_{1}
```
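As a quick sanity check of this identity, the sketch below (my own illustration; the variable names are not from the text) computes the variance of the projections $`u_{1}^Tx_{n}`$ directly and compares it with $`u_{1}^TSu_{1}`$, where $`S`$ is the data covariance matrix:

```python
import numpy as np

# Illustrative check: the variance of the data projected onto a direction u
# equals u^T S u, where S is the (biased, 1/N) covariance matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))        # N samples, D features
u = np.array([0.6, 0.8])             # a unit direction (0.36 + 0.64 = 1)

x_bar = X.mean(axis=0)
S = (X - x_bar).T @ (X - x_bar) / len(X)   # S = 1/N sum (x_n - x̄)(x_n - x̄)^T

proj = X @ u                               # projections u^T x_n
var_proj = np.mean((proj - u @ x_bar) ** 2)

print(np.isclose(var_proj, u @ S @ u))     # True
```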
|
|
|
|
|
|
|
|
|
We now have a definition of the variance of the data projected on $`u_1`$, and we are ready to maximize it. But the unconstrained problem is ill-posed: if we simply try to maximize the above expression, $`||u_1||`$ would go to $`\infty`$. As a solution, we add a constraint by requiring $`u_{1}`$ to be a unit vector ($`u_{1}^Tu_{1}=1`$), enforced via a [Lagrange multiplier](https://en.wikipedia.org/wiki/Lagrange_multiplier):
|
|
|
|
|
|
```math
|
|
|
u_{1}^TSu_{1} - \lambda_1(u_{1}^Tu_{1}-1)
|
|
|
```
|
|
|
|
|
|
At this stage we can optimize: we take the derivative of this Lagrangian with respect to $`u_{1}`$ and set it equal to zero:
|
|
|
|
|
|
```math
|
|
|
Su_{1} - \lambda_1u_{1} = 0
```

```math
(S-\lambda_1I)u_{1} = 0
```

where $`I`$ is the identity matrix. This is an eigenvalue equation: $`u_{1}`$ must be an eigenvector of $`S`$, with $`\lambda_1`$ the corresponding eigenvalue.
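The condition $`Su_{1} = \lambda_1 u_{1}`$ can be checked numerically. The sketch below (my own illustration, not code from the text) takes the top eigenvector of a sample covariance matrix and confirms that its projected variance $`u_{1}^TSu_{1}`$ equals the top eigenvalue and beats a random unit direction:

```python
import numpy as np

# Sketch: S u1 = lambda1 u1 means u1 is an eigenvector of S, and the
# projected variance u1^T S u1 equals lambda1, the largest eigenvalue.
rng = np.random.default_rng(1)
A = rng.normal(size=(300, 3))
X = A @ np.diag([3.0, 1.0, 0.5])         # stretch features for distinct variances
S = np.cov(X, rowvar=False, bias=True)   # covariance matrix S (1/N normalization)

eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
u1 = eigvecs[:, -1]                      # eigenvector with the largest eigenvalue

# u1^T S u1 recovers the top eigenvalue ...
print(np.isclose(u1 @ S @ u1, eigvals[-1]))   # True
# ... and no random unit direction captures more variance
u_rand = rng.normal(size=3)
u_rand /= np.linalg.norm(u_rand)
print(u1 @ S @ u1 >= u_rand @ S @ u_rand)     # True
```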
|
|
|
|
|
|
|
|
|
|