Empirical comparison of fast clustering algorithms for large data sets (2000)
- Contains pseudocode for CLARA, CLARANS, GAC-R$^3$ and GAC-RAR$_w$ (the latter two are genetic search heuristics)
- CLARANS: a serial randomized search strategy for finding the optimal set of medoids
- Evaluation:
  - CLARANS is the best in both clustering quality and execution time when:
    - “… the number of clusters increases, clusters are more closely related, more asymmetric clusters are present, or more random objects exist in the data set.”
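
To make the "serial randomized search" idea concrete, here is a minimal Python sketch of a CLARANS-style medoid search. It is not the pseudocode from the paper: the function names, the Euclidean default distance, and the default parameter values are assumptions made for illustration. `num_local` and `max_neighbor` loosely mirror the `numlocal` and `maxneighbor` parameters from the original CLARANS description.

```python
import random


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, num_local=2, max_neighbor=50, dist=None, seed=0):
    """CLARANS-style randomized medoid search (illustrative sketch).

    Assumes 0 < k < len(points). Returns (sorted medoid indices, cost).
    """
    rng = random.Random(seed)
    if dist is None:
        # Assumption: Euclidean distance on numeric tuples.
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    n = len(points)
    best_medoids, best_cost = None, float("inf")

    for _ in range(num_local):
        # Start each local search from a random set of k medoids.
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current, dist)
        failures = 0
        while failures < max_neighbor:
            # A neighbor differs from the current set in exactly one medoid.
            i = rng.randrange(k)
            candidate = rng.choice([j for j in range(n) if j not in current])
            neighbor = current[:i] + [candidate] + current[i + 1:]
            cost = total_cost(points, neighbor, dist)
            if cost < current_cost:
                # Move to the better neighbor and reset the failure counter.
                current, current_cost = neighbor, cost
                failures = 0
            else:
                failures += 1
        # Keep the best local minimum found across all restarts.
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost

    return sorted(best_medoids), best_cost
```

For example, `clarans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)` returns a pair of medoid indices and the total distance of all points to their nearest medoid. The sketch recomputes the full cost for every neighbor, which keeps it short but is slower than an incremental swap-cost update.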
Wei, Chih-Ping, Yen-Hsien Lee, and Che-Ming Hsu. “Empirical comparison of fast clustering algorithms for large data sets.” Proceedings of the 33rd Annual Hawaii International Conference on System Sciences. IEEE, 2000.