Yizhou Yin, Yanran Wang, Johnathan R. Azaria, Yuxiang Jiang, Biao Li, Rita Casadio, Silvio C. E. Tosatto, Vikas Pejaver, Roger A. Hoskins, Matthew Edwards, Giulia Babbi, Richard McCombie, Alexander A. Morgan, Billy Chang, Emanuela Leonardi, Kymberleigh A. Pagel, Mehdi Pirooznia, Eran Bachar, Pietro Di Lena, Andre Franke, John Moult, Britt-Sabina Petersen, Ron Unger, Steven E. Brenner, Maggie H. Wang, Susanna Repo, James B. Potash, Abhishek Niroula, Carlo Ferrari, Laksshman Sundaram, Teri E. Klein, Alessandra Gasparini, Xiaolin Li, Roxana Daneshjou, Kunal Kundu, Marco Carraro, Rajendra Rana Bhat, Sohela Shah, Predrag Radivojac, David T. Jones, Pier L. Martelli, Yana Bromberg, Mauno Vihinen, David Gifford, Samuele Bovo, Yanay Ofran, Sean D. Mooney, Lipika R. Pal, Manuel Giollo, Russ B. Altman, Peter Zandi
Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype–phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Earlier CAGI experiments also included versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. We also outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype–phenotype relationships.
The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype–phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. We discuss the range of techniques used for phenotype prediction, the methods used for assessing predictive models, and the lessons gleaned from the CAGI exome challenges.