@@ 17,7 +17,7 @@ The machine learning models we have covered so far can also be interpreted from 





Many real-life observations exhibit nonlinear behavior, so the density of the data in feature space usually cannot be described by a single distribution function such as a Gaussian. In engineering, it is common practice to express a finite, nonlinear system as an infinite combination of linear functions so that we can analyze and predict the system's behavior. One example is transient heat/mass/momentum transfer with source terms:






IMAGE



<img src="uploads/10baf30be90723e9cc23730b2309f920/dmd_1.png" width="500">






The same strategy can also be applied in data-driven learning. We can, for instance, approximate a nonlinear observation distribution as a linear combination of basic distribution functions, as in the case of Gaussian mixtures. With this linearization, we can go one step further and map the observed variables X to discrete latent variables. In a Gaussian mixture model (GMM), for instance, each latent variable records which component of the mixture an observation is assigned to, and the assignments are estimated with the EM algorithm.
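
The GMM case can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes 1-D data, two components, a fixed number of EM iterations, and hand-picked initial means. The helper names (`gaussian_pdf`, `e_step`, `fit_gmm_1d`) and the synthetic data are hypothetical and chosen only for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(data, weights, means, variances):
    """Return, for each observation, its responsibility under each component."""
    resp = []
    for x in data:
        p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(p)
        resp.append([pj / total for pj in p])
    return resp


def fit_gmm_1d(data, init_means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM (assumed: fixed iteration count)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp = e_step(data, weights, means, variances)
        # M-step: re-estimate mixing weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsing component
    return weights, means, variances


# Synthetic observations X: two well-separated clusters (assumed for illustration).
random.seed(0)
X = [random.gauss(-2.0, 0.5) for _ in range(200)] + \
    [random.gauss(3.0, 0.5) for _ in range(200)]

weights, means, variances = fit_gmm_1d(X, init_means=[-1.0, 1.0])

# Discrete latent variable z: index of the most responsible component per observation.
resp = e_step(X, weights, means, variances)
z = [max(range(len(means)), key=lambda j: r[j]) for r in resp]
```

Here the mixture density is the weighted sum of the component densities, and `z` is the discrete latent variable obtained by assigning each observation to the component with the largest responsibility. Keeping the soft responsibilities in `resp` instead of the hard labels in `z` preserves the uncertainty of each assignment.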



