




In both classification and clustering models, we talked about "similarities" between instances, as well as "distances". It is worth pausing at this point to think about what these terms mean in the mathematical framework of our models.









Any example we pass to our models is a vector, where each element corresponds to a feature. Our objective is usually to figure out how these observations (vectors) are distributed in our data space, and/or whether they follow certain patterns (e.g., lie near a hyperplane). In regression, we evaluate guesses about the label by comparing our model predictions with the true values using distance-based measures such as the l1 and l2 norms. In classification, we try to find decision boundaries that divide the data space into meaningful regions. Here the labels provide us with an absolute frame of reference, a way to compare: we judge how similar or distant one class is from the other. The same is true for clustering, but this time we look at our samples in a relative frame of reference, since there are no labels to anchor the comparison.






Therefore, the predictive / descriptive capabilities of a model are strongly affected by how we define the distance, e.g., whether we use the l1 or l2 norm in regression. The same applies to clustering.
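To make this concrete, here is a small sketch (the points and values below are illustrative, not from the text) showing that the choice of norm can change which of two points counts as "closer" to a query:

```python
import numpy as np

# Two candidate neighbors for a query point: which one is "closer"
# depends on the norm we choose.
query = np.array([0.0, 0.0])
a = np.array([3.0, 3.0])   # moderate differences in both features
b = np.array([0.0, 5.0])   # a large difference in one feature only

l1_a = np.sum(np.abs(a - query))   # l1 distance to a: 6.0
l1_b = np.sum(np.abs(b - query))   # l1 distance to b: 5.0 -> b is closer under l1
l2_a = np.linalg.norm(a - query)   # l2 distance to a: sqrt(18) ~ 4.24 -> a is closer under l2
l2_b = np.linalg.norm(b - query)   # l2 distance to b: 5.0

print(l1_a, l1_b, l2_a, l2_b)
```

Under the l1 norm `b` is the nearer point, while under the l2 norm `a` is: a distance-based model (e.g., nearest neighbors or a clustering assignment) can therefore produce different results depending solely on this choice.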






One of the most common measures is the Euclidean distance. In general, the dissimilarity between two instances can be written as a sum of per-feature dissimilarities over their features (m = 1, 2, ..., M):






```math



\text{Distance}(x_{i}, x_{i'}) = \sum_{m=1}^{M} \Delta_m (x_{im}, x_{i'm})



```
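The formula above can be sketched in a few lines of Python (the function and variable names here are illustrative). Taking the per-feature dissimilarity Δ_m to be the squared difference recovers the squared Euclidean distance, whose square root is the familiar Euclidean distance:

```python
import math

def pairwise_distance(x_i, x_ip, delta):
    """Sum the per-feature dissimilarities delta(x_im, x_i'm) over m = 1..M."""
    return sum(delta(a, b) for a, b in zip(x_i, x_ip))

x_i = [1.0, 2.0, 3.0]
x_ip = [4.0, 6.0, 3.0]

# Squared difference as the per-feature dissimilarity gives the
# squared Euclidean distance; its square root is the Euclidean distance.
sq_euclidean = pairwise_distance(x_i, x_ip, lambda a, b: (a - b) ** 2)
euclidean = math.sqrt(sq_euclidean)
print(sq_euclidean, euclidean)  # 25.0 5.0
```

Other choices of Δ_m (e.g., the absolute difference, giving the l1 distance) plug into the same additive form.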






You can also check:



[17 types of similarity and dissimilarity measures](https://towardsdatascience.com/17-types-of-similarity-and-dissimilarity-measures-used-in-data-science-3eb914d2681)
