After getting the mean of the projected data, we can calculate the variance of the projected data:


```math
\frac{1}{N}\sum_{n=1}^{N} (u_{1}^Tx_{n} - u_{1}^T\overline{x})^2 = u_{1}^TSu_{1}



```
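The identity above can be checked numerically. The sketch below (with made-up data; the variable names are illustrative, not from the text) computes the variance of the projection onto a unit vector $`u_1`$ both directly and as the quadratic form $`u_{1}^TSu_{1}`$:

```python
import numpy as np

# Hypothetical data: N samples of a 2-D variable.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))              # rows are the samples x_n
x_bar = X.mean(axis=0)                     # sample mean x-bar
S = (X - x_bar).T @ (X - x_bar) / len(X)   # covariance matrix S (1/N convention)

u1 = np.array([1.0, 0.0])                  # any unit vector u_1

# Left-hand side: variance of the data projected onto u_1.
proj_var = np.mean((X @ u1 - x_bar @ u1) ** 2)
# Right-hand side: the quadratic form u_1^T S u_1.
quad_form = u1 @ S @ u1

assert np.isclose(proj_var, quad_form)
```

Both sides agree for any choice of $`u_1`$, which is exactly what the derivation claims.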









We now have a definition of the variance of the data projected on $`u_1`$, and we are ready to maximize it. But as stated, the maximization is ill-posed: if we simply maximize the above equation, $`u_1`$ would go to $`\infty`$. As a solution, we add a constraint by requiring $`u_{1}`$ to be a unit vector ($`u_{1}^Tu_{1}=1`$) and enforce it via a [Lagrange multiplier](https://en.wikipedia.org/wiki/Lagrange_multiplier):






```math



u_{1}^TSu_{1} - \lambda_1(u_{1}^Tu_{1} - 1)



```






At this stage, we can go into optimization: we take the derivative with respect to $`u_{1}`$ and set it equal to zero:






```math



Su_{1} - \lambda_1u_{1} = 0






(S - \lambda_1I)u_{1} = 0
```

where $`I`$ is the identity matrix.









