Categories: Clustering, Machine Learning, Unsupervised Learning
January 16, 2021 by Monis Khan
Following are the differences between K-Means and the Hierarchical Clustering Algorithm (HCA): K-Means needs us to pre-enter the number of clusters (K), but Hierarchical clustering has no such requirement. The algorithm on its own deduces the optimum number of clusters and displays it in the form of a dendrogram. Performance of K-Means on spherical data is better […]
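Here's a minimal sketch of that difference, assuming scikit-learn's API (the blob dataset, the distance threshold of 10.0, and K=4 are illustrative choices, not values from the post):

```python
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# K-Means: the number of clusters must be pre-entered as K.
km_labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)

# HCA: no K required; cutting the dendrogram at a distance threshold
# lets the algorithm deduce the number of clusters on its own.
hca = AgglomerativeClustering(n_clusters=None, distance_threshold=10.0).fit(X)
print("clusters found by HCA:", hca.n_clusters_)
```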
Following are the methods of linkage used by the Hierarchical clustering algorithm: Single Linkage: It is a distance-based criterion and measures the minimum pairwise distance. Complete Linkage: It is a distance-based criterion and measures the maximum pairwise distance. Centroid Linkage: It is a distance-based criterion and measures the distance between centroids. Ward's Linkage: It is a […]
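The same four criteria can be tried side by side. The sketch below assumes SciPy's hierarchy module and random toy data, and simply prints the distance of the final merge under each method:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))

for method in ("single", "complete", "centroid", "ward"):
    Z = linkage(X, method=method)  # Z records every pairwise merge
    print(method, "-> final merge distance:", round(Z[-1, 2], 3))
```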
Categories: Clustering, Machine Learning, Unsupervised Learning
January 16, 2021 by Monis Khan
The Hierarchical clustering algorithm adopts an agglomerative learning approach. Following are the steps involved in the training process of this algorithm: In the first iteration it starts with individual points, finds the closest pair, and combines it into a single cluster. In later iterations it does the same with clusters. The process goes on till all the points are […]
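As an illustration of that loop, here is a bare-bones sketch with made-up points, using single linkage as the closeness measure (an assumption, not any library's implementation):

```python
import math

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters = [[p] for p in points]  # each point starts as its own cluster

def single_link(a, b):
    # minimum pairwise distance between two clusters
    return min(math.dist(p, q) for p in a for q in b)

while len(clusters) > 1:
    # find the closest pair of clusters...
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
    )
    # ...and combine them into a single cluster
    clusters[i].extend(clusters.pop(j))
    print(len(clusters), "clusters remain")
```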
Dendrograms depict the distances at which clusters merge: the y-axis of a dendrogram shows the dissimilarity between two clusters. They are used to find the optimum number of clusters for a Hierarchical clustering algorithm.
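A dendrogram like the one described can be drawn in a few lines. This sketch assumes SciPy and Matplotlib, with random data standing in for a real dataset:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 2))

Z = linkage(X, method="ward")
dendrogram(Z)  # y-axis = dissimilarity at which clusters merge
plt.ylabel("dissimilarity")
plt.show()
```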
Following are the two approaches used by clustering algorithms: Agglomerative: The algorithm starts by assuming each individual point to be a cluster and proceeds by adding the nearest point to the existing cluster. Thus similarity is used as the measure for creating new clusters. If not stopped, as per the business requirement, the algorithm would go on […]
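That "if not stopped" behaviour is easy to verify: left alone, agglomeration performs exactly n - 1 merges and ends in one giant cluster. The sketch below assumes SciPy's API, with fcluster cutting the tree early at an illustrative 3 clusters:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 2))

Z = linkage(X, method="ward")
print("merges performed:", Z.shape[0])  # always n - 1, here 29

labels = fcluster(Z, t=3, criterion="maxclust")  # stop at 3 clusters instead
print("labels:", labels)
```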
Categories: Clustering, Unsupervised Learning
Following were the improvements made in K-Means: Since the initial centroids are arbitrarily chosen, the results of earlier versions were not exactly replicable. K-Means++ proposed a new method of assigning centroids in which the initial points are spread furthest from each other, giving effectively fixed and replicable locations. Earlier versions were computationally expensive, as distance was to be calculated […]
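Assuming scikit-learn's KMeans API, the improvement is a single parameter; the comparison below is a sketch (the blob data and seeds are illustrative) that contrasts the older random seeding with K-Means++ seeding:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=5, random_state=0)

for init in ("random", "k-means++"):
    km = KMeans(n_clusters=5, init=init, n_init=10, random_state=0).fit(X)
    print(init, "-> WCSS:", round(km.inertia_, 2))
```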
Following are the challenges faced by K-Means clustering: K-Means doesn't perform well if the clusters have varying sizes, different densities, or non-spherical shapes. It has to be run for a certain number of iterations or it will produce a suboptimal result. It is computationally expensive, as distance is to be calculated from each centroid to all data points. […]
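The non-spherical case is easy to demonstrate. Here's a sketch assuming scikit-learn's APIs, where K-Means fails to recover two moon-shaped clusters (an adjusted Rand index of 1.0 would mean perfect recovery):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)
y_pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Non-spherical shapes: K-Means cuts straight across the two moons.
print("adjusted Rand index:", round(adjusted_rand_score(y_true, y_pred), 2))
```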
Following are the steps taken by the K-Means algorithm in its learning process: The user specifies the value of K using the elbow curve. For the first iteration, K points are arbitrarily chosen. Let's call them centroids for the time being; in the steps below you'll get the reason for this nomenclature. In the case of K-Means++, […]
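Those steps map onto a short from-scratch loop. This is an illustrative sketch (random data, NumPy only), not a production implementation:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # arbitrarily choose K points as the initial "centroids"
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign every point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean (the centroid) of its points;
        # hence the nomenclature
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilised
        centroids = new_centroids
    return labels, centroids

X = np.random.default_rng(0).normal(size=(100, 2))
labels, centroids = kmeans(X, k=3)
```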
January 15, 2021 by Monis Khan
In the K-Means algorithm we need to specify the number of clusters. As is the inherent nature of the algorithm, the WCSS decreases as the number of clusters increases. But the rate of decrease is steep for the first few points and then plateaus. Thus the curve resembles the shape of a human elbow, hence the name. The […]
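Plotting that curve takes only a few lines. This sketch assumes scikit-learn (whose inertia_ attribute is the WCSS) and Matplotlib, with blob data standing in for a real dataset:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

ks = range(1, 11)
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

plt.plot(ks, wcss, marker="o")  # the bend (the "elbow") suggests the K to pick
plt.xlabel("number of clusters (K)")
plt.ylabel("WCSS")
plt.show()
```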
January 15, 2021 by Monis Khan