@@ -103,7 +103,38 @@ In other words, what is to be maximized is the eigenvalue, where $`u_{1}`$ is t
To obtain M principal components, we just need to calculate the eigenvectors and eigenvalues of the covariance matrix S and keep the M eigenvectors with the largest eigenvalues.
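
The eigen-decomposition route can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper name `pca_eig`, the synthetic data and the choice `M=2` are all hypothetical and not part of the original notes.

```python
import numpy as np

def pca_eig(X, M):
    """Return the top-M principal directions and eigenvalues of X (n_samples x n_features)."""
    Xc = X - X.mean(axis=0)                  # center each feature
    S = np.cov(Xc, rowvar=False)             # covariance matrix S (features x features)
    eigvals, eigvecs = np.linalg.eigh(S)     # S is symmetric; eigenvalues come back ascending
    order = np.argsort(eigvals)[::-1][:M]    # indices of the M largest eigenvalues
    return eigvecs[:, order], eigvals[order]

# Hypothetical correlated data, used only to exercise the function
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ np.array([[3.0, 0.0, 0.0],
                                           [1.0, 1.0, 0.0],
                                           [0.0, 0.5, 0.2]])

U, lam = pca_eig(X, M=2)
Z = (X - X.mean(axis=0)) @ U                 # data projected onto the principal components
```

The variance of each column of `Z` equals the corresponding eigenvalue in `lam`, which is the "eigenvalue is what is maximized" statement in numerical form.
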
### Independent Component Analysis
ICA is a decomposition method and one of the blind source separation methods. The term “blind” is apt, as the method requires very little prior knowledge about the underlying patterns. The components that constitute the measured data are assumed to be independent of one another at any given instant. This perspective gives us a way to separate these components: we just need to figure out where to “cut” based on a measure of independence. So, we say, if these components (you can imagine them as signals) are statistically independent of each other, they should provide no “information” about the other components.

Let's go over a classical example together: the "cocktail party problem". Imagine that there are two groups of people in a room, talking and gossiping, and a microphone is hidden close to each group. Naturally, each microphone records both conversations, overlapping each other. The question is: can we separate these two signals (S) using both measurements (X), so that we would know who said what? Each measurement is a weighted sum of the two sources:
```math
x_1 = a_{11}s_1 + a_{12}s_2
```

```math
x_2 = a_{21}s_1 + a_{22}s_2
```
where $`a_{ij}`$ are the mixing coefficients. This looks difficult, as we only know the measurements (X); the rest is unknown. The coefficients $`a_{ij}`$ are also complicated in practice, as they depend on the recorders, where they are placed, the properties of the sources, and the environment. So, our aim is to estimate the sources from the data alone. Here, the “statistically independent” assumption becomes a very useful constraint (we will discuss decomposition further in the DMD discussions, DDE II). The method is extremely useful in signal analysis (EEG is a good example), radar applications, time series analysis, and sensory data modelling.

In order to extract N sources, we need to process N measurements, each assumed to be a linear combination of all the sources:
```math
x_i = a_{i1}s_1 + a_{i2}s_2 + \dots + a_{iN}s_N
```

```math
X = AS
```
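
To make the mixing model concrete, here is a small sketch that builds two synthetic sources, mixes them with a made-up matrix `A`, and shows that the sources would be trivially recoverable if `A` were known. The signal shapes and the values in `A` are assumptions chosen only for illustration.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000)
s1 = np.sin(2 * np.pi * 5 * t)             # source 1: sine wave
s2 = np.sign(np.sin(2 * np.pi * 3 * t))    # source 2: square wave
S = np.vstack([s1, s2])                    # shape (N sources, T samples)

A = np.array([[1.0, 0.5],                  # hypothetical mixing matrix a_ij
              [0.4, 1.0]])
X = A @ S                                  # measurements: X = A S

# If A were known, recovering S would just be solving a linear system.
S_rec = np.linalg.solve(A, X)
```

In a real recording `A` is not available, so this inversion cannot be done directly.
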
Note that we know neither A nor S. We also want to perform this decomposition for any problem, so the constraint we apply should be generalizable.
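
As an illustration of estimating the sources without knowing A, the sketch below uses a FastICA-style fixed-point iteration with a tanh nonlinearity. The notes above do not prescribe a specific algorithm, so this choice is an assumption. So are the function names (`whiten`, `sym_decorrelate`, `fast_ica`), the synthetic sources and the mixing matrix. The sources can only be recovered up to order, sign and scale.

```python
import numpy as np

def whiten(X):
    """Center X (n_signals x n_samples) and rescale it to unit covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(Xc))        # eigen-decomposition of the covariance
    return (E / np.sqrt(d)) @ E.T @ Xc       # E diag(d^-1/2) E^T Xc

def sym_decorrelate(W):
    """Make the rows of W orthonormal: W <- (W W^T)^(-1/2) W."""
    d, E = np.linalg.eigh(W @ W.T)
    return (E / np.sqrt(d)) @ E.T @ W

def fast_ica(X, n_iter=200, tol=1e-8, seed=0):
    """Estimate independent sources from the mixtures X (sketch, tanh nonlinearity)."""
    Z = whiten(X)
    n, T = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.normal(size=(n, n)))   # random orthonormal start
    for _ in range(n_iter):
        G = np.tanh(W @ Z)                         # g(w^T z) for every row w of W
        g_prime = 1.0 - G ** 2                     # g'(w^T z)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, row by row
        W_new = (G @ Z.T) / T - g_prime.mean(axis=1, keepdims=True) * W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is parallel to its previous value
        done = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if done:
            break
    return W @ Z                                   # estimated sources

# Hypothetical demo: a sine and an independent Laplace-distributed noise source
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 2000)
S = np.vstack([np.sin(2 * np.pi * 5 * t), rng.laplace(size=t.size)])
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])
X = A @ S

S_est = fast_ica(X)
# Absolute correlation between each true source (rows) and each estimate (columns)
corr = np.abs(np.corrcoef(np.vstack([S, S_est]))[:2, 2:])
```

Each row of `corr` should have one entry close to 1, showing which estimated component matches which true source regardless of sign or ordering.
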

We have already discussed a component analysis method, PCA. At this point, you may ask: what is the difference between PCA and ICA? First of all, note that the aim is different. In PCA, we aim for maximum variance, which is a weaker constraint than independence. In PCA, the principal components are uncorrelated with one another, but each may carry information from more than one source dimension, and this is typically the case, as we project multiple coordinates onto the principal components. It is better seen on a plot:
...