Ulysses G J Balis completed his MD training at the University of South Florida, USA; his residency training in Anatomic and Clinical Pathology at the University of Utah; and separate, sequential postdoctoral fellowships in Tissue Engineering and Bioinformatics at the Center for Engineering in Medicine at Harvard Medical School. He currently serves as the Director of the Division of Pathology Informatics at the University of Michigan and is a member of the College of Fellows of the American Institute for Medical and Biological Engineering. He has published more than 100 papers in reputable journals and has given over 250 invited presentations internationally in the fields of pathology informatics and computational imaging.
Abstract
The application of convolutional neural network (CNN)-based analysis to histopathological subject matter has already demonstrated significant utility both for general image classification tasks and for unsupervised partitioning of datasets into multiple appropriate diagnostic subclasses. This approach is generally successful in settings where sufficient case numbers are available. CNNs are attractive in that they can help avoid the laborious generation of ground truth maps by subject matter experts. However, CNNs are limited in that their convergence on a generalizable solution often requires hundreds, if not thousands, of training set images. This lowers the utility of the approach for certain classes of histopathological subject matter, where large cohorts of cases and images are unavailable. To address this limitation, we present an additional preprocessing stage that uses hand-crafted feature classifiers prior to the application of conventional CNN-based methodologies. This multi-stage approach has been applied to a number of histology classification use cases, with preliminary results demonstrating that, in many cases, the requirement for hundreds or thousands of images can be reduced significantly, to only a few representative images. A number of use cases will be presented, demonstrating consistently high ROC performance even when the training set contains only a small number of initial cohort images.
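The multi-stage idea above can be illustrated with a minimal sketch. The abstract does not name the hand-crafted features used, so the local mean and gradient magnitude below are illustrative stand-ins, and all function names (`local_mean`, `gradient_magnitude`, `handcrafted_channels`, `roc_auc`) are hypothetical. The sketch only shows how feature maps might be stacked as extra input channels ahead of a CNN, and how ROC area under the curve could be computed from classifier scores; the CNN itself is not shown.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def local_mean(img: Image, radius: int = 1) -> Image:
    """Mean intensity in a (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = sum(vals) / len(vals)
    return out


def gradient_magnitude(img: Image) -> Image:
    """Neighbor-difference gradient magnitude with clamped borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(w - 1, x + 1)] - img[y][max(0, x - 1)]
            gy = img[min(h - 1, y + 1)][x] - img[max(0, y - 1)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def handcrafted_channels(img: Image) -> List[Image]:
    """Stack the raw image with its feature maps (assumed CNN input layout)."""
    return [img, local_mean(img), gradient_magnitude(img)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this arrangement, each training image would be expanded into several channels before reaching the network, which is one plausible way a preprocessing stage could supply structure that the CNN would otherwise need many examples to learn.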