Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers

Carlos L. Acuña-Matamoros; M. Humberto Reyes-Valdés

doi:10.1017/S1479262117000247

Published online by Cambridge University Press: 25 September 2017

Carlos L. Acuña-Matamoros: Departamento de Fitomejoramiento, Universidad Autónoma Agraria Antonio Narro, Buenavista, 25315, Saltillo, Coah., Mexico
M. Humberto Reyes-Valdés*: Departamento de Fitomejoramiento, Universidad Autónoma Agraria Antonio Narro, Buenavista, 25315, Saltillo, Coah., Mexico
*Corresponding author. E-mail: [email protected]

Abstract

Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the amount of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 methods for core subset selection, using optimization criteria based on average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to evaluate the methods through four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently the best across all assays at maximizing the objective functions, and they performed well even for criteria not included in those functions. Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and they outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. Because of their efficiency in optimizing the objective functions and their computing speed, the LRSemi and SFSSemi methods proved to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.
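
To make the idea of a greedy forward search concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes homozygous accessions coded as 0/1 vectors over biallelic loci, uses the proportion of mismatching loci as a stand-in genetic distance, and maximizes mean pairwise distance only; the abstract does not specify the four objective functions, so all function names, the distance measure, and the toy data are illustrative assumptions.

```python
import random
from itertools import combinations


def mismatch_distance(a, b):
    """Proportion of loci at which two accessions carry different alleles (assumed distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(genotypes, subset):
    """Average distance over all pairs of accessions in the subset."""
    pairs = list(combinations(subset, 2))
    total = sum(mismatch_distance(genotypes[i], genotypes[j]) for i, j in pairs)
    return total / len(pairs)


def allele_richness(genotypes, subset):
    """Number of distinct (locus, allele) combinations present in the subset."""
    return len({(locus, allele) for i in subset for locus, allele in enumerate(genotypes[i])})


def forward_selection(genotypes, core_size, random_first_pair=False, seed=None):
    """Greedy forward search: start from a pair, then add one accession at a time.

    With random_first_pair=False the search starts from the most distant pair;
    with random_first_pair=True it starts from a randomly drawn pair, which
    avoids scanning all pairs of a large collection.
    """
    n = len(genotypes)
    if random_first_pair:
        core = random.Random(seed).sample(range(n), 2)
    else:
        core = list(max(
            combinations(range(n), 2),
            key=lambda p: mismatch_distance(genotypes[p[0]], genotypes[p[1]]),
        ))
    while len(core) < core_size:
        remaining = [i for i in range(n) if i not in core]
        # Add the accession that yields the highest mean pairwise distance.
        best = max(remaining, key=lambda i: mean_pairwise_distance(genotypes, core + [i]))
        core.append(best)
    return core


# Hypothetical toy collection: 5 accessions scored at 6 biallelic loci.
genotypes = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
]

core = forward_selection(genotypes, core_size=3)
core_semi = forward_selection(genotypes, core_size=3, random_first_pair=True, seed=1)
print(core, mean_pairwise_distance(genotypes, core), allele_richness(genotypes, core))
```

Allele richness is computed here only after the search, mirroring its role in the abstract as an additional criterion for qualifying a core subset rather than as part of the objective. The sketch recomputes distances at every step, which is acceptable for a toy example; a collection of thousands of accessions would need a precomputed distance matrix or incremental updates.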

Keywords

allele richness; diversity; genetic distance; seed banks

Type: Research Article
Information: Plant Genetic Resources, Volume 16, Issue 3, June 2018, pp. 228–236

DOI: https://doi.org/10.1017/S1479262117000247
Copyright: © NIAB 2017

Acuña-Matamoros and Reyes-Valdés supplementary material

Tables S1-S2

PDF 92.1 KB
