
A Comparison of Heuristic Procedures for Minimum Within-Cluster Sums of Squares Partitioning

Published online by Cambridge University Press: 01 January 2025

Michael J. Brusco*
Affiliation:
Florida State University
Douglas Steinley
Affiliation:
University of Missouri-Columbia
* Requests for reprints should be sent to Michael J. Brusco, Department of Marketing, College of Business, Florida State University, Tallahassee, FL 32306-1110, USA. E-mail: [email protected]

Abstract

Perhaps the most common criterion for partitioning a data set is the minimization of the within-cluster sum of squared deviations from cluster centroids. Although optimal solution procedures for within-cluster sums of squares (WCSS) partitioning are computationally feasible for small data sets, heuristic procedures are required for most practical applications in the behavioral sciences. We compared the performance of nine prominent heuristic procedures for WCSS partitioning across 324 simulated data sets representative of a broad spectrum of test conditions. Performance comparisons focused on both percentage deviation from the “best-found” WCSS values and recovery of true cluster structure. A real-coded genetic algorithm and a variable neighborhood search heuristic were the most effective methods; however, a straightforward two-stage heuristic algorithm, HK-means, also yielded exceptional performance. A follow-up experiment using 13 empirical data sets from the clustering literature generally supported the results of the experiment using simulated data. Our findings have important implications for behavioral science researchers, whose theoretical conclusions could be adversely affected by poor algorithmic performance.
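
As an illustration of the two quantities named above, the following minimal sketch computes the WCSS of a given partition and the percentage deviation of one WCSS value from a best-found value. It is not the authors' code: the function names `wcss` and `pct_deviation`, the toy data, and the assumption that percentage deviation is taken relative to the best-found value are all ours.

```python
from collections import defaultdict


def wcss(points, labels):
    """Within-cluster sum of squared deviations from cluster centroids."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster's members.
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        total += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dim))
    return total


def pct_deviation(value, best_found):
    """Percentage by which `value` exceeds `best_found` (assumed definition)."""
    return 100.0 * (value - best_found) / best_found


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = wcss(points, [0, 0, 1, 1])   # groups nearby points together
poor = wcss(points, [0, 1, 0, 1])   # groups distant points together
print(good, poor, pct_deviation(poor, good))
```

On this toy example the partition that groups nearby points has the smaller WCSS, and the percentage deviation expresses how far the other partition falls short of it.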

Type
Theory and Methods
Copyright
Copyright © 2007 The Psychometric Society

