
Population and quantitative genomic properties of the USDA soybean germplasm collection

Published online by Cambridge University Press:  23 April 2018

Alencar Xavier
Affiliation:
Department of Agronomy, Purdue University, West Lafayette, IN 47907, USA; Quantitative Genetics, Dow AgroSciences, Indianapolis, IN 46268, USA
Rima Thapa
Affiliation:
Department of Agronomy, Purdue University, West Lafayette, IN 47907, USA
William M. Muir
Affiliation:
Department of Animal Sciences, Purdue University, West Lafayette, IN 47907, USA
Katy Martin Rainey*
Affiliation:
Department of Agronomy, Purdue University, West Lafayette, IN 47907, USA
*Corresponding author. E-mail: [email protected]

Abstract

This study is the first assessment of the entire soybean [Glycine max (L.) Merr.] collection of the United States Department of Agriculture (USDA) National Plant Germplasm System to report quantitative and population genomic parameters. It also provides new insight into soybean germplasm structure. Germplasm studies enable plant breeders to incorporate novel genetic resources into breeding pipelines to improve valuable agronomic traits. We conducted comprehensive analyses of the 19,652 soybean accessions in the USDA-ARS germplasm collection, genotyped with the SoySNP50K iSelect BeadChip SNP array, to elucidate the quantitative properties of existing subpopulations inferred through hierarchical clustering performed with Ward's D agglomeration method and Nei's standard genetic distance. We found the effective population size to be approximately 106 individuals based on the linkage disequilibrium of unlinked loci. The cladogram indicated the existence of eight major clusters. Each cluster displays particular properties with regard to major quantitative traits. Among those, cluster 3 represents the tropical and semi-tropical genetic material, cluster 5 displays large seeds and may represent food-grade germplasm, and cluster 7 represents the undomesticated material in the germplasm collection. The average FST among clusters was 0.22, and a total of 914 SNPs were exclusive to specific clusters. Our classification and characterization of the germplasm collection into major clusters provide valuable information about the genetic resources available to soybean breeders and researchers.
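
As an illustration of one ingredient of the clustering step named in the abstract, the following minimal Python sketch computes Nei's (1972) standard genetic distance between two groups from biallelic SNP allele frequencies. The function name, the input layout (one reference-allele frequency per SNP) and the toy frequencies are assumptions made for this example. The abstract does not state how the authors implemented the calculation, and the Ward agglomeration step is not shown.

```python
import math


def nei_standard_distance(p, q):
    """Nei's (1972) standard genetic distance for biallelic loci.

    p and q hold the frequency of the same reference allele at each SNP
    in group X and group Y (hypothetical input layout).
    """
    if len(p) != len(q) or not p:
        raise ValueError("p and q must be non-empty and of equal length")
    n = len(p)
    # Per-locus sums of squared allele frequencies, averaged across loci.
    j_x = sum(a * a + (1 - a) * (1 - a) for a in p) / n
    j_y = sum(b * b + (1 - b) * (1 - b) for b in q) / n
    # Per-locus cross products of allele frequencies, averaged across loci.
    j_xy = sum(a * b + (1 - a) * (1 - b) for a, b in zip(p, q)) / n
    if j_xy == 0:
        # No shared alleles at any locus: identity is zero, distance unbounded.
        return math.inf
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Toy allele frequencies, invented for illustration only.
group_x = [0.10, 0.50, 0.90, 0.30]
group_y = [0.20, 0.40, 0.70, 0.35]
print(nei_standard_distance(group_x, group_y))
```

A symmetric matrix of such pairwise distances is the kind of input that a Ward-type hierarchical clustering routine would typically take.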

Type
Research Article
Copyright
Copyright © NIAB 2018 

Supplementary material

Xavier et al. supplementary material 1 (PDF, 1.7 MB)
Xavier et al. supplementary material 2 (File, 1.3 MB)