After getting the mean of the projected data, we can calculate the variance of the projected data:

```math
\frac{1}{N} \sum_{n=1}^{N} \left(u_{1}^Tx_{n} - u_{1}^T\overline{x}\right)^2 = u_{1}^TSu_{1}
```
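As a quick numerical sanity check (an illustration, not part of the derivation), the following sketch verifies on toy data that the variance of the projections $`u_{1}^Tx_{n}`$ equals the quadratic form $`u_{1}^TSu_{1}`$, where $`S`$ is the (1/N) sample covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))      # toy data: N=200 samples, 3 features
x_bar = X.mean(axis=0)             # sample mean

# Biased (1/N) sample covariance matrix S, matching the 1/N in the text.
S = (X - x_bar).T @ (X - x_bar) / len(X)

u1 = np.array([1.0, 2.0, -1.0])
u1 /= np.linalg.norm(u1)           # make u1 a unit vector

# Left-hand side: variance of the scalar projections u1^T x_n.
lhs = np.mean((X @ u1 - x_bar @ u1) ** 2)
# Right-hand side: quadratic form u1^T S u1.
rhs = u1 @ S @ u1

print(np.isclose(lhs, rhs))       # the two expressions agree
```

The agreement holds for any direction $`u_1`$, which is why the maximization can be carried out directly on $`u_{1}^TSu_{1}`$.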






We now have a definition of the variance of the data projected on $`u_1`$. We are ready to maximize it. But this maximization is not an easy optimization problem: if we simply tried to maximize the above expression, $`u_1`$ would go to $`\infty`$. As a solution, we add a constraint by (i) requiring $`u_{1}`$ to be a unit vector ($`u_{1}^Tu_{1}=1`$) and (ii) enforcing this via a [Lagrange multiplier](https://en.wikipedia.org/wiki/Lagrange_multiplier):






```math
u_{1}^TSu_{1} - \lambda_1(u_{1}^Tu_{1} - 1)
```

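As an illustration of where this constrained maximization leads (a sketch on an assumed toy dataset, not the document's derivation), setting the gradient of the Lagrangian to zero yields the eigenvalue equation $`Su_{1} = \lambda_1 u_{1}`$, so the maximizing unit vector is the leading eigenvector of $`S`$. A quick NumPy check:

```python
import numpy as np

rng = np.random.default_rng(1)
# Anisotropic toy data so one direction clearly carries the most variance.
X = rng.normal(size=(500, 4)) @ np.diag([3.0, 1.0, 0.5, 0.1])
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / len(X)               # (1/N) sample covariance matrix

# The stationarity condition S u1 = lambda_1 u1 says u1 is an eigenvector of S;
# the variance u1^T S u1 is then lambda_1, so we take the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(S)  # eigh: S is symmetric
u1 = eigvecs[:, -1]                   # eigenvector of the largest eigenvalue

# Compare against many random unit vectors: none should beat u1.
U = rng.normal(size=(1000, 4))
U /= np.linalg.norm(U, axis=1, keepdims=True)
best_random = np.max(np.einsum('ij,jk,ik->i', U, S, U))
print(u1 @ S @ u1 >= best_random)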
...   ...  