Muhammad Shaban, PhD
Pronouns
He/Him/His
Job Title
Postdoc Researcher
Academic Rank
Research Fellow
Department
Pathology
Authors
Muhammad Shaban, Ming Y. Lu, Drew F. K. Williamson, Richard J. Chen, Jana Lipkova, Tiffany Y. Chen, and Faisal Mahmood
Principal Investigator
Faisal Mahmood
Research Category: Cancer
Accurate identification of the primary origin of metastatic tumors is essential for optimizing treatment and requires a pathologist to integrate multiple forms of data while examining tissue. However, even with highly sensitive and specific immunohistochemical stains for some cell lineages, pathologists cannot reliably determine the origin of every metastatic tumor: 1-2% are classified as cancers of unknown primary (CUP) even after other clinical data are integrated. Previous work has shown the feasibility of using artificial intelligence algorithms to predict primary origin from histology or from different forms of molecular data, including genomics, transcriptomics, or methylation profiles. We present a multimodal deep learning algorithm that leverages routinely acquired histology slides, associated clinically available genomics data, and patient sex to classify tumors into 18 different primary origins. Our approach shows substantial improvement over unimodal deep learning using histology or genomic data alone, achieving accuracies of 88.1% and 92.0% on a held-out test set (n=4,881) and an external test set (n=660), respectively. Furthermore, on CUP cases (n=283), we observed an agreement of 85.5% between the model’s three most likely predicted origins and the differential diagnoses assigned in the associated pathology reports. At test time, our flexible model design allows origin prediction from histology or genomics alone when the other modality is missing. Additionally, our model supports interpretability studies that show which regions of the histology and which genes contribute most to the prediction of a particular origin, a potentially useful tool for quality control and knowledge discovery.
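The top-three agreement reported for CUP cases can be illustrated with a minimal sketch. The abstract does not state how agreement was scored, so the definition here is an assumption: a case counts as agreeing when at least one of the model's three most probable origins appears among the differential diagnoses in its pathology report. The function names, the origin labels, and the toy probabilities are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set


def top_k_origins(probs: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k origin labels with the highest predicted probability."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


def top_k_agreement(
    predictions: List[Dict[str, float]],
    differentials: List[Set[str]],
    k: int = 3,
) -> float:
    """Fraction of cases whose top-k predicted origins overlap the differential.

    Assumed definition (not confirmed by the abstract): a case agrees when
    any of its top-k predicted origins is listed in the differential diagnosis.
    """
    if len(predictions) != len(differentials):
        raise ValueError("predictions and differentials must have equal length")
    if not predictions:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for probs, diff in zip(predictions, differentials)
        if set(top_k_origins(probs, k)) & diff
    )
    return hits / len(predictions)


# Toy data for two hypothetical cases over four hypothetical origin labels.
toy_predictions = [
    {"lung": 0.60, "breast": 0.30, "colorectal": 0.06, "renal": 0.04},
    {"lung": 0.10, "breast": 0.20, "colorectal": 0.30, "renal": 0.40},
]
toy_differentials = [{"breast"}, {"lung"}]

agreement = top_k_agreement(toy_predictions, toy_differentials, k=3)
```

In this toy example the first case agrees (breast is among its top three) and the second does not (lung ranks fourth), so the agreement is 0.5. A stricter definition, such as requiring the single most probable origin to match, would only change the value of `k`.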