Data clustering is the process of placing data items into groups so that items within a group are similar and items in different groups are dissimilar. The most common technique for clustering numeric ...
A k-means-type algorithm is proposed for efficiently clustering data constrained to lie on the surface of a p-dimensional unit sphere, or data that are mean-zero-unit-variance standardized ...
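As an illustration of what clustering on the unit sphere can look like, here is a minimal sketch of one common variant, spherical k-means, which assigns points by cosine similarity and renormalizes each centroid to unit length. This is an assumption made for illustration and not necessarily the algorithm the abstract proposes; the function names `normalize`, `dot`, and `spherical_kmeans` and the `iterations` and `seed` parameters are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def assign(data, centroids):
    """Give each point the index of the centroid with the highest cosine similarity."""
    k = len(centroids)
    return [max(range(k), key=lambda j: dot(x, centroids[j])) for x in data]


def spherical_kmeans(points, k, iterations=20, seed=0):
    """Illustrative spherical k-means (assumed variant, not the paper's method)."""
    rng = random.Random(seed)
    data = [normalize(p) for p in points]          # project data onto the unit sphere
    centroids = [list(c) for c in rng.sample(data, k)]
    for _ in range(iterations):
        labels = assign(data, centroids)
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if not members:
                continue                           # keep the old centroid for an empty cluster
            total = [sum(col) for col in zip(*members)]
            if any(total):
                centroids[j] = normalize(total)    # mean direction, pushed back onto the sphere
    return assign(data, centroids), centroids
```

Because only directions matter here, points such as `(1, 0.1)` and `(2, 0.1)` land in the same cluster regardless of their lengths.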
In this paper, the authors present a partition-based algorithm for clustering high-dimensional objects in subspaces for the iris gene dataset. In high-dimensional data, clusters of objects often exist ...
Multivariate analysis in statistics is a set of useful methods for analyzing data when there is more than one variable under consideration. Multivariate analysis techniques may be used for several ...
Statistica Sinica, Vol. 12, No. 1, A Special Issue on Bioinformatics (January 2002), pp. 241-262 (22 pages). Many clustering algorithms have been used to analyze microarray gene expression data. Given ...
K-means is comparatively simple and works well with large datasets, but it assumes clusters are circular/spherical in shape, so it can only find simple cluster geometries. Data clustering is the ...
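The spherical bias mentioned above follows from the way standard k-means assigns each point to its nearest centroid by Euclidean distance. A minimal sketch of that procedure (Lloyd's algorithm) is given below; the function names, the `iterations` and `seed` parameters, and the random initialization from data points are illustrative assumptions, not taken from any of the sources listed here.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Illustrative Lloyd-style k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct data points as starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        # This rule is what favors compact, roughly spherical clusters.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids
```

Each iteration costs time proportional to the number of points times `k`, which is one reason the method scales to large datasets. Elongated or nested clusters are typically split incorrectly by the nearest-centroid rule.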