Automatic Stain-Based Classification in Biological Images Using Adaptive Archetype Color Estimation

Abstract

An adaptive unsupervised algorithm is presented for automatic stain-based classification of cytological objects (cells, cell nuclei, etc.), or individual pixels, based on color. The proposed method is primarily intended for, though not limited to, images of microscope slides stained with exactly two differently colored stains (e.g., a stain and a counter-stain) against a light background. A salient feature of this method is its ability to distinguish objects whose spectra overlap so heavily that the color histogram does not exhibit well-defined clusters. This often occurs with stained biological microscopic specimens, where visually distinct colors are in fact different mixtures of sub-microscopic granules of the stains.

Given an image, a three-dimensional (3-D) color histogram is constructed and smoothed. From this histogram, a set of three "archetype colors" is estimated, representing the two colors of the stained objects and the color of the background. The proposed algorithm stands apart from the published literature in that it makes use of colors at the "outer surface" of the 3-D color histogram, rather than the modes (the points of highest frequency). The rationale for this choice is that these surface points represent the colors of the pure stains, and are therefore far more likely to be concordant with visual assessments. The experimental results appear to bear this out.
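The steps above can be illustrated with a minimal Python sketch. It is not the published procedure: the bin count, the box-filter smoother, the occupancy threshold `min_count`, and the farthest-point rule for picking surface colors are all assumptions chosen for illustration, and the function names are hypothetical. It shows only the general idea of taking extreme occupied bins of a smoothed 3-D histogram instead of its modes.

```python
import numpy as np

def color_histogram(pixels, bins=16):
    """Count pixels in a bins x bins x bins grid over 8-bit RGB space."""
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    return hist

def box_smooth(hist):
    """3x3x3 mean filter with zero padding (an assumed, simple smoother)."""
    padded = np.pad(hist, 1)
    out = np.zeros_like(hist)
    n0, n1, n2 = hist.shape
    for i in range(3):
        for j in range(3):
            for k in range(3):
                out += padded[i:i + n0, j:j + n1, k:k + n2]
    return out / 27.0

def estimate_archetypes(hist, min_count=1.0):
    """Pick three extreme occupied bins: background, then two stain colors.

    Assumed heuristic: the background is the brightest occupied bin, the
    first stain is the occupied bin farthest from the background, and the
    second stain is the occupied bin farthest from both of those.
    """
    width = 256.0 / hist.shape[0]
    occupied = np.argwhere(hist >= min_count)
    if len(occupied) == 0:
        raise ValueError("no histogram bin reaches min_count")
    centers = (occupied + 0.5) * width  # bin centers in RGB units
    background = centers[np.argmax(centers.sum(axis=1))]
    d_bg = np.linalg.norm(centers - background, axis=1)
    stain_a = centers[np.argmax(d_bg)]
    d_a = np.linalg.norm(centers - stain_a, axis=1)
    stain_b = centers[np.argmax(np.minimum(d_bg, d_a))]
    return background, stain_a, stain_b
```

A typical call would be `estimate_archetypes(box_smooth(color_histogram(pixels)))`, where `pixels` is an N x 3 array of RGB values. In this sketch the smoothing step keeps isolated noise pixels from being mistaken for surface colors, since a lone pixel contributes only 1/27 of a count to each neighboring bin and so falls below the threshold.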

For a large quantitative batch study involving numerous slides, the archetype color estimation procedure need not be repeated for each image frame, but only when re-calibration is warranted. Once the archetype colors are estimated, pixel classification is conducted based on a set of linear decision boundaries on the color histogram. Pixel classification results can then be combined to conduct object classification; a weighted voting technique for color classification of objects is demonstrated.
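As a rough illustration of these two stages, the sketch below labels each pixel with its nearest archetype color and then classifies an object by a weighted vote over its pixels. Nearest-archetype labeling is used here only because it is one simple rule that yields linear decision boundaries (the planes that perpendicularly bisect each pair of archetypes); the boundaries and vote weights of the actual method may differ. The function names and the use of the distance margin as a weight are assumptions.

```python
import numpy as np

def classify_pixels(pixels, archetypes):
    """Label each pixel with the index of its nearest archetype color.

    Also returns a margin per pixel: the gap between the distances to the
    nearest and second-nearest archetypes (larger means less ambiguous).
    """
    diffs = pixels[:, None, :] - archetypes[None, :, :]
    dist = np.linalg.norm(diffs, axis=2)
    labels = np.argmin(dist, axis=1)
    ordered = np.sort(dist, axis=1)
    margin = ordered[:, 1] - ordered[:, 0]
    return labels, margin

def classify_object(labels, weights, stain_classes=(1, 2)):
    """Weighted vote over one object's pixels, ignoring background pixels."""
    totals = {c: float(weights[labels == c].sum()) for c in stain_classes}
    winner = max(totals, key=totals.get)
    return winner, totals

# Hypothetical archetypes: row 0 is background, rows 1 and 2 are the stains.
archetypes = np.array([[248, 248, 248], [40, 40, 168], [152, 56, 24]], dtype=float)
object_pixels = np.array(
    [[250, 250, 250], [50, 50, 150], [140, 60, 40], [60, 45, 150]], dtype=float
)
labels, margin = classify_pixels(object_pixels, archetypes)
winner, totals = classify_object(labels, margin)
```

Weighting each vote by its margin lets unambiguous pixels count for more than pixels lying close to a boundary, which is one plausible reading of "weighted voting"; any other per-pixel weight array of the same length can be passed in its place.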

This method does not require prior training or customization for a particular class of images. It applies generically to a rich variety of imaging problems, without any change or adjustment to the imaging instrumentation. This makes the method attractive for exploratory and pilot studies. It is also attractive for large-scale quantitation studies, such as assays, where robustness to unavoidable staining variations is desirable.

The broad applicability of the proposed method was evaluated over a collection of 76 images drawn from 19 slides representing specimens from medical biopsies, botanical objects, and protozoa. The fully automated procedure was found to successfully distinguish the important colors on all but three of the slides without any parameter adjustments. On two of the three slides on which the procedure failed, the intensity of one of the desired foreground archetype colors was extremely high compared to the other, which led to atypically shaped 3-D color histograms. On the third slide, which was considered a partial failure, one frame out of a half dozen that were captured exhibited an anomalous result that was traced to a failure of the histogram smoothing algorithm. Overall, it was concluded that the proposed algorithm has broad applicability.

