Meeting Abstract
As major sections of the Tree of Life become increasingly resolved, a central challenge for comparative study is scaling up the collection of phenotypic data. Here we explore two techniques that enable high-throughput collection and analysis of data from 2D images, at the scale of thousands to tens of thousands of species. The first uses crowdsourcing through the Amazon Mechanical Turk interface and involves a browser-based landmarking application that is deployed to remote workers. These workers may be paid through the Amazon Mechanical Turk framework, or the application can be distributed to citizen scientists or research teams to parallelize data collection. We demonstrate how this approach can be flexibly adapted to collect geometric morphometric and spatial distribution data. The second uses supervised and unsupervised machine learning to classify fish color patterns. We illustrate how image data may be used to train classifiers that recognize fish color pattern traits at broad phylogenetic scales, and we explore unsupervised algorithms for color pattern discovery.