
A comparative performance of clustering procedures for mixture of qualitative and quantitative data – an application to black gram

Published online by Cambridge University Press:  25 July 2011

Rupam Kumar Sarkar
Affiliation:
Indian Agricultural Statistics Research Institute, New Delhi 110 012, India
A. R. Rao*
Affiliation:
Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute, New Delhi 110 012, India
S. D. Wahi
Affiliation:
Biometrics Division, Indian Agricultural Statistics Research Institute, New Delhi 110 012, India
K. V. Bhat
Affiliation:
National Bureau of Plant Genetic Resources, New Delhi 110 012, India
*Corresponding author. E-mail: [email protected]

Abstract

Knowledge of the genetic diversity of germplasm of breeding material is invaluable in crop improvement programmes. Qualitative and quantitative data are frequently used separately to assess the genetic diversity of crop genotypes. When diversity is assessed on qualitative and quantitative traits separately, a problem arises if the clusters formed from the two kinds of traits do not correspond to each other. This study compares five clustering procedures in black gram genotypes, using qualitative traits, quantitative traits and mixture data, on the criterion of the weighted average of the observed proportion of misclassification. The INDOMIX- and PRINQUAL-based clustering procedures, i.e. the INDOMIX and PRINQUAL methods used in conjunction with the k-means clustering procedure, perform better than the other clustering procedures, followed by clustering based on either quantitative or qualitative data alone. The INDOMIX- and PRINQUAL-based procedures can help breeders capture the variation present in qualitative and quantitative trait data simultaneously, and resolve the ambiguity over the degree of correspondence between clusterings based on either qualitative or quantitative traits alone.
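The comparison criterion named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes one plausible reading: each cluster's observed proportion of misclassification is weighted by that cluster's size, and a lower value indicates a better clustering procedure. The function name `weighted_misclassification` and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
def weighted_misclassification(cluster_sizes, misclassified):
    """Weighted average of per-cluster misclassification proportions.

    Assumption (not stated in the abstract): the weight of each cluster
    is its size n_k, so the result is sum(n_k * p_k) / sum(n_k),
    where p_k = m_k / n_k is the observed proportion misclassified.
    """
    if len(cluster_sizes) != len(misclassified):
        raise ValueError("one misclassified count is needed per cluster")
    if any(n <= 0 for n in cluster_sizes):
        raise ValueError("every cluster must contain at least one genotype")
    if any(m < 0 or m > n for m, n in zip(misclassified, cluster_sizes)):
        raise ValueError("misclassified counts must lie between 0 and n_k")

    # Observed proportion of misclassification in each cluster.
    proportions = [m / n for m, n in zip(misclassified, cluster_sizes)]
    # Size-weighted average of those proportions.
    total = sum(cluster_sizes)
    return sum(n * p for n, p in zip(cluster_sizes, proportions)) / total


# Hypothetical example: three clusters of 10, 20 and 10 genotypes,
# with 1, 4 and 0 genotypes misclassified respectively.
score = weighted_misclassification([10, 20, 10], [1, 4, 0])
print(score)
```

Under this reading, each clustering procedure would receive one such score on the same set of genotypes, and the procedures would be ranked from the lowest score to the highest.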

Type
Research Article
Copyright
Copyright © NIAB 2011


Supplementary material

Rao Supplementary Material 1 (File, 91.6 KB)
Rao Supplementary Material 2 (File, 53.8 KB)
Rao Supplementary Data 1 (File, 925 Bytes)
Rao Supplementary Data 2 (File, 6 KB)
Rao Supplementary Data 3 (File, 6.9 KB)