After getting the mean of the projected data, we can calculate the variance of the projected data:

```math
\frac{1}{N} \sum_{n=1}^{N} (u_{1}^Tx_{n} - u_{1}^T\overline{x})^2 = u_{1}^TSu_{1}
```
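As a quick sanity check of this identity, the sketch below (my own illustration; the variable names are not from the text) computes the variance of the projections $`u_{1}^Tx_{n}`$ directly and compares it with $`u_{1}^TSu_{1}`$, where $`S`$ is the data covariance matrix:

```python
import numpy as np

# Illustrative check: the variance of the data projected onto a direction u
# equals u^T S u, where S is the (biased, 1/N) covariance matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))        # N samples, D features
u = np.array([0.6, 0.8])             # a unit direction (0.36 + 0.64 = 1)

x_bar = X.mean(axis=0)
S = (X - x_bar).T @ (X - x_bar) / len(X)   # S = 1/N sum (x_n - x̄)(x_n - x̄)^T

proj = X @ u                               # projections u^T x_n
var_proj = np.mean((proj - u @ x_bar) ** 2)

print(np.isclose(var_proj, u @ S @ u))     # True
```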
|
|
|
|
|
|
|
|
|
We now have a definition of the variance of the data projected on $`u_1`$, and we are ready to maximize it. But the unconstrained problem is ill-posed: if we simply try to maximize the above expression, $`||u_1||`$ would go to $`\infty`$. As a solution, we add a constraint by requiring $`u_{1}`$ to be a unit vector ($`u_{1}^Tu_{1}=1`$), enforced via a [Lagrange multiplier](https://en.wikipedia.org/wiki/Lagrange_multiplier):
|
|
|
|
|
|
```math
|
|
|
u_{1}^TSu_{1} - \lambda_1(u_{1}^Tu_{1}-1)
|
|
|
```
|
|
|
|
|
|
At this stage we can optimize: we take the derivative of this Lagrangian with respect to $`u_{1}`$ and set it equal to zero:
|
|
|
|
|
|
```math
|
|
|
Su_{1} - \lambda_1u_{1} = 0
```

```math
(S-\lambda_1I)u_{1} = 0
```

where $`I`$ is the identity matrix. This is an eigenvalue equation: $`u_{1}`$ must be an eigenvector of $`S`$, with $`\lambda_1`$ the corresponding eigenvalue.
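The condition $`Su_{1} = \lambda_1 u_{1}`$ can be checked numerically. The sketch below (my own illustration, not code from the text) takes the top eigenvector of a sample covariance matrix and confirms that its projected variance $`u_{1}^TSu_{1}`$ equals the top eigenvalue and beats a random unit direction:

```python
import numpy as np

# Sketch: S u1 = lambda1 u1 means u1 is an eigenvector of S, and the
# projected variance u1^T S u1 equals lambda1, the largest eigenvalue.
rng = np.random.default_rng(1)
A = rng.normal(size=(300, 3))
X = A @ np.diag([3.0, 1.0, 0.5])         # stretch features for distinct variances
S = np.cov(X, rowvar=False, bias=True)   # covariance matrix S (1/N normalization)

eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
u1 = eigvecs[:, -1]                      # eigenvector with the largest eigenvalue

# u1^T S u1 recovers the top eigenvalue ...
print(np.isclose(u1 @ S @ u1, eigvals[-1]))   # True
# ... and no random unit direction captures more variance
u_rand = rng.normal(size=3)
u_rand /= np.linalg.norm(u_rand)
print(u1 @ S @ u1 >= u_rand @ S @ u_rand)     # True
```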
|
|
|
|
|
|
|
|
|
|