Goolsby, Eric [1], Richardson, Andrew [2].

Dealing with correlated observations in regularized regression: a phylogenetic elastic net approach for predicting leaf traits using leaf reflectance spectra.

Machine learning and related approaches are becoming increasingly common in ecology and evolution, with applications including automated phenotyping, species identification, diversity assessment, and phylogenetic tree search. In plants, leaf trait prediction models are routinely developed from leaf spectral reflectance and phenotypic data using a variety of machine learning methods, such as partial least squares regression, ridge regression, random forests, and support vector machines. Biological datasets often violate the assumption that observations are independent and identically distributed (i.i.d.), and where these violations are severe, predictive models may be both overfit and overconfident. Using simulations, we examine how phylogenetic relatedness and intraspecific variation affect the reliability of leaf trait prediction models built from leaf spectral reflectance when covariance structures are ignored, and we compare these models to methods that appropriately account for the i.i.d. violations. We describe a method for explicitly accounting for phylogeny and intraspecific variation in ridge, LASSO, and elastic net regression, and we demonstrate it on an empirical dataset of 296 observations from 106 taxa sampled at the Arnold Arboretum of Harvard University, in which leaf trait prediction models are developed from leaf spectral reflectance. We provide recommendations for handling non-independence in machine-learning-based prediction models, and we discuss current limitations, strategies for dealing with computational bottlenecks, and potential areas of expansion.
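
The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of one standard way to fold a known covariance structure into penalized regression, not the authors' implementation. It assumes an observation-level covariance matrix V built from a species-level phylogenetic covariance plus an intraspecific variance term, whitens the data with the Cholesky factor of V (the usual generalized least squares transformation), and then fits an ordinary elastic net by coordinate descent. All names, the toy covariance values, the simulated data, and the omission of an intercept are assumptions made for illustration.

```python
import numpy as np

def whiten(V, X, y):
    """With V = L L^T, return L^{-1} X and L^{-1} y (decorrelated observations)."""
    L = np.linalg.cholesky(V)
    return np.linalg.solve(L, X), np.linalg.solve(L, y)

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam, alpha, n_iter=500):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    No intercept is fitted (an assumption made to keep the sketch short)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                      # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]       # remove feature j's contribution
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            r -= X[:, j] * b[j]       # add the updated contribution back
    return b

# Hypothetical toy setup: 3 species, 2 leaves each.
species = np.array([0, 0, 1, 1, 2, 2])
C = np.array([[1.0, 0.6, 0.2],       # assumed species-level phylogenetic covariance
              [0.6, 1.0, 0.2],
              [0.2, 0.2, 1.0]])
sigma2_within = 0.3                  # assumed intraspecific variance
V = C[np.ix_(species, species)] + sigma2_within * np.eye(len(species))

# Simulated predictors (stand-ins for two spectral bands) and a trait.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
y = X @ np.array([1.0, 0.0]) + np.linalg.cholesky(V) @ rng.normal(size=6)

Xw, yw = whiten(V, X, y)
beta = elastic_net(Xw, yw, lam=0.1, alpha=0.5)
print(beta)
```

Setting alpha to 0 gives a ridge penalty and alpha to 1 gives the LASSO, so the same whitening step covers all three penalties named in the abstract. In this sketch V is treated as known; in practice it would have to be estimated or supplied from a phylogeny, and the Cholesky factorization is one place where computational cost grows quickly with the number of observations.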

1 - University of Central Florida, 4000 Central Florida Blvd, BLDG20, BL301, Orlando, FL, 32816, United States
2 - Northern Arizona University, Center for Ecosystem Science and Society and School of Informatics, Computing, and Cyber Systems, 1295 Knoles Dr, Flagstaff, AZ, 86011, United States

Phylogenetic comparative methods
Hyperspectral imaging
Phenotype prediction
Machine learning

Presentation Type: Oral Paper
Number: MACRO II014
Abstract ID: 789
Candidate for Awards: None

Copyright © 2000-2022, Botanical Society of America. All rights reserved