Subset selection of markers for the genome-enabled prediction of genetic values using radial basis function neural networks

  • Isabela de Castro Sant'Anna, Universidade Federal de Viçosa
  • Gabi Nunes Silva, Universidade Federal de Rondônia
  • Moysés Nascimento, Universidade Federal de Viçosa
  • Cosme Damião Cruz, Universidade Federal de Viçosa
Keywords: neural networks; genomic prediction; stepwise regression.

Abstract

This paper evaluated the effectiveness of marker subset selection for genome-enabled prediction of genetic values using radial basis function neural networks (RBFNN). To this end, an F1 population of 500 individuals, derived from the hybridization of divergent parents and genotyped with 1000 SNP markers, was simulated. Phenotypic traits were simulated under three gene action models (additive, additive-dominant, and epistatic) and two dominance situations (partial and complete), for quantitative traits with heritabilities (h²) of 30% and 60%. Each trait was controlled by 50 loci with two alleles per locus. In total, twelve scenarios were simulated. Stepwise regression was applied to select markers before the prediction methods were fitted. Reliability and root mean square error were estimated using a fivefold cross-validation scheme. Overall, dimensionality reduction improved reliability in all scenarios. Specifically, with h² = 30% in the additive scenario, reliability increased from 0.03 to 0.59 using RBFNN and from 0.10 to 0.57 using ridge regression best linear unbiased prediction (RR-BLUP). In the additive-dominant scenario, reliability increased from 0.12 to 0.59 using RBFNN and from 0.12 to 0.58 using RR-BLUP, and in the epistatic scenarios it increased from 0.07 to 0.50 using RBFNN and from 0.06 to 0.47 using RR-BLUP. The results showed that applying stepwise regression before these techniques improved the accuracy of genetic value prediction and, above all, greatly reduced the root mean square error. The reduction in dimensionality also shortened processing and analysis time.
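The workflow described in the abstract (select a marker subset, fit a predictor, then score it by fivefold cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' code: genotypes are independent random 0/1/2 codes rather than a simulated cross, only additive effects are used, `forward_select` is a simple greedy stand-in for stepwise regression, ridge regression with a fixed penalty `lam` stands in for RR-BLUP (no RBFNN is fitted), reliability is taken as the squared correlation between predicted and true genetic values, and selection is repeated inside each training fold.

```python
import numpy as np

def forward_select(X, y, max_markers=30):
    """Greedy forward selection (hypothetical stand-in for stepwise regression):
    repeatedly add the marker most correlated with the current residual."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = np.inf          # monomorphic markers get a score of zero
    selected, residual = [], yc.copy()
    for _ in range(max_markers):
        scores = np.abs(Xc.T @ residual) / norms
        if selected:
            scores[selected] = -np.inf  # never pick the same marker twice
        selected.append(int(np.argmax(scores)))
        A = Xc[:, selected]
        beta, *_ = np.linalg.lstsq(A, yc, rcond=None)
        residual = yc - A @ beta
    return selected

def ridge_fit_predict(X_tr, y_tr, X_te, lam=1.0):
    """Ridge regression on centered markers; lam is fixed here (an assumption),
    whereas RR-BLUP would derive it from variance components."""
    mu_x, mu_y = X_tr.mean(axis=0), y_tr.mean()
    A = X_tr - mu_x
    beta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ (y_tr - mu_y))
    return mu_y + (X_te - mu_x) @ beta

def cross_validate(X, y, g, k=5, max_markers=None, lam=1.0, seed=0):
    """k-fold CV; returns (reliability, RMSE) of predictions against true values g."""
    rng = np.random.default_rng(seed)
    idx = np.arange(len(y))
    pred = np.empty(len(y))
    for test in np.array_split(rng.permutation(len(y)), k):
        train = np.setdiff1d(idx, test)
        if max_markers:
            cols = forward_select(X[train], y[train], max_markers)
        else:
            cols = np.arange(X.shape[1])  # no selection: use every marker
        pred[test] = ridge_fit_predict(X[train][:, cols], y[train],
                                       X[test][:, cols], lam)
    reliability = np.corrcoef(pred, g)[0, 1] ** 2
    rmse = np.sqrt(np.mean((pred - g) ** 2))
    return reliability, rmse

# Toy data: 500 individuals, 1000 markers, 50 additive loci, h2 = 0.30
rng = np.random.default_rng(42)
n, p, n_qtl, h2 = 500, 1000, 50, 0.30
X = rng.binomial(2, 0.5, size=(n, p)).astype(float)
qtl = rng.choice(p, n_qtl, replace=False)
g = X[:, qtl] @ rng.normal(size=n_qtl)
g -= g.mean()
e = rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n)
y = g + e

rel_all, rmse_all = cross_validate(X, y, g)                  # all markers
rel_sub, rmse_sub = cross_validate(X, y, g, max_markers=30)  # selected subset
print(f"all markers: reliability={rel_all:.2f}, RMSE={rmse_all:.2f}")
print(f"subset:      reliability={rel_sub:.2f}, RMSE={rmse_sub:.2f}")
```

Repeating the selection inside each training fold keeps the held-out individuals from influencing which markers are chosen. That is a choice made for this sketch; the abstract states only that stepwise regression was used before the prediction methods.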

Published
2020-08-17
How to Cite
Sant’Anna, I. de C., Silva, G. N., Nascimento, M., & Cruz, C. D. (2020). Subset selection of markers for the genome-enabled prediction of genetic values using radial basis function neural networks. Acta Scientiarum. Agronomy, 43(1), e46307. https://doi.org/10.4025/actasciagron.v43i1.46307
Section
Genetics and Plant Breeding