Fast and Scalable Image Search For Histology

Principal Investigator: Faisal Mahmood

Authors: Chengkuan Chen, Ming Y. Lu, Drew F. K. Williamson, Tiffany Y. Chen, Andrew J. Schaumberg, Faisal Mahmood

Lay Abstract

The expanding adoption of digital pathology has enabled the curation of large repositories of histology whole slide images (WSIs), which contain a wealth of information used for cancer diagnosis. Similar-image search in pathology, akin to Google Reverse Image Search, offers the opportunity to comb through large historical repositories of gigapixel WSIs to identify cases with similar features. It can be particularly useful for diagnosing rare diseases and for identifying similar cases to predict prognosis, treatment outcomes, and potential clinical trial success. A critical challenge in developing a WSI search and retrieval system is keeping search fast even in a large image database. Previous systems are typically slow, and their retrieval often slows as the repository they search through grows, which makes clinical adoption tedious and makes them infeasible for repositories that are constantly growing. Here we present Fast Image Search for Histopathology (FISH), a histology image search pipeline whose search speed is independent of the size of the image database. We evaluated FISH on multiple tasks and datasets with over 22,000 patient cases spanning 56 cancers.

Scientific Abstract

The expanding adoption of digital pathology has enabled the curation of large repositories of histology whole slide images (WSIs), which contain a wealth of information. Similar-image search in pathology offers the opportunity to comb through large historical repositories of gigapixel WSIs to identify cases with similar morphological features. It can be particularly useful for diagnosing rare diseases and for identifying similar cases to predict prognosis, treatment outcomes, and potential clinical trial success. A critical challenge in developing a WSI search and retrieval system is scalability, which is uniquely difficult given the need to search a growing number of slides, each of which can consist of billions of pixels and be several gigabytes in size. Such systems are typically slow, and their retrieval often slows as the repository they search through grows, which makes clinical adoption tedious and makes them infeasible for repositories that are constantly growing. Here we present Fast Image Search for Histopathology (FISH), a histology image search pipeline that is infinitely scalable and achieves a constant search speed independent of the image database size, while being interpretable and requiring no detailed annotations. We evaluated FISH on multiple tasks and datasets with over 22,000 patient cases spanning 56 disease subtypes.

Clinical Implications
Our system addresses the scalability issue in whole slide image search, which is one of the key challenges that hinder such systems from being used in clinical settings.
