This document describes the mission of Aperio Technologies, which is “Automating Pathology”.
Medicine has two parts: diagnosis and treatment. Medical diagnosis has become highly automated. However, one area of diagnosis remains highly manual: the discipline of pathology.
Pathology is the study of disease: its causes, processes, development, and consequences. Over half of all medical diagnosis is performed by pathologists, and most pathological diagnosis is performed by manually inspecting fluid or tissue specimens with a microscope.
There are two kinds of specimens prepared for microscopic inspection. Cytology is the study of fluid specimens. Histology is the study of tissue specimens.
A cytology preparation is a “smear” of liquid, such as blood, amniotic fluid (amniocentesis), cervical fluid (pap smear), or lymphatic fluid, spread onto a microscope slide. The smear is stained to provide visual contrast for cells suspended in the liquid, and to highlight abnormal conditions, such as metastasizing cancer cells. A cytologist visually inspects slides looking for “things which are wrong”, a rare event detection process akin to searching for a needle in a haystack.
Histology preparations are thin slices of tissue which are “floated” onto microscope slides, then stained and fixed. The stain(s) provide contrast to show tissue structure and cell details, and to identify the distribution of proteins such as antigens. A histologist visually inspects slides to find abnormal “architecture” and cells which exhibit unusual morphology, and to gauge the proportion and density of staining.
Automating pathology would yield two major benefits:
1) Time savings. Improve speed of diagnosis and productivity of pathologists.
2) Accuracy. Reduce false negatives and improve precision of measurements.
Each of these benefits translates directly into cost justification for investment in new technology. There are three steps to automating pathology:
1) Digitize slide data. A device must convert the visual data on a slide into a digital image at high resolution and with no image artifacts.
2) Manage “virtual slide” images. Digitized slide images (“virtual slides”) are very large. A system must store, retrieve, and display such images, and provide remote access.
3) Automate slide analysis. Create software which performs repetitive tasks such as rare event detection, assay quantification, and tissue classification (pattern recognition).
The Technical Obstacles
Since pathology is such a large and important part of diagnostic medicine, and since it remains highly manual and therefore represents a substantial opportunity for automation, what has kept automation from being adopted? There are several severe technical obstacles.
First and foremost, to automate pathology it is essential to digitize slide data. There are several practical obstacles to doing this efficiently at high resolution:
- The amount of data captured is very large. A typical pathology specimen measures 15mm x 20mm. At 400X magnification (using a 40X objective lens) the required image resolution is about 0.25µ/pixel (µ = micron = one millionth of a meter). That means the digital image of a typical specimen measures about 80,000 x 60,000 pixels (4.8 billion pixels), which at 3 bytes per pixel is about 14.4GB of data. That’s big.
- Maintaining accurate focus is quite difficult. Although a microscope slide appears “flat”, at sub-micron resolution it is actually full of hills and valleys. The “depth of field” of a typical high resolution microscope is about 0.2µ, and the hills and valleys of a sample can be 10µ high. There may also be a persistent “tilt” to a sample which can be as much as 100µ. So digitization of a slide sample requires constant and accurate focus adjustments.
- Elimination of optical artifacts is crucial. The usual approach to digitizing slides has been to take a picture of a microscope’s field of view, move over, take another picture, move over, etc. This is called “tiling” and results in thousands of pictures which must be “stitched” together to form a complete image of the specimen. Tiling is not only quite slow (it can take hours) but it also yields significant seam artifacts.
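The size figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 3 bytes per pixel assumes uncompressed 24-bit color, and the 1024 x 1024 pixel field of view is an assumed camera size chosen only to show why tiling produces thousands of pictures.

```python
import math

def virtual_slide_size(width_mm, height_mm, microns_per_pixel, bytes_per_pixel=3):
    """Return (width_px, height_px, size_bytes) for an uncompressed scan.

    bytes_per_pixel=3 is an assumption (24-bit color), not a stated figure.
    """
    width_px = round(width_mm * 1000 / microns_per_pixel)
    height_px = round(height_mm * 1000 / microns_per_pixel)
    return width_px, height_px, width_px * height_px * bytes_per_pixel

def tile_count(width_px, height_px, tile_px=1024):
    """Number of square camera fields needed to cover the specimen.

    tile_px=1024 is a hypothetical field-of-view size in pixels.
    """
    return math.ceil(width_px / tile_px) * math.ceil(height_px / tile_px)

w, h, size_bytes = virtual_slide_size(20, 15, 0.25)
print(w, h, size_bytes / 1e9, tile_count(w, h))
# prints: 80000 60000 14.4 4661
```

Under these assumptions a single specimen needs several thousand separate fields, each of which a tiling system has to capture and stitch.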
Secondly, automating pathology requires a system to manage “virtual slide” images. As noted above, virtual slides are very large. An automated pathology system must efficiently store and retrieve very large image files, and must support the following capabilities:
- Rapid panning and zooming. This capability is essential for viewing; pathologists typically view slides at low resolution and zoom in and out on “interesting things”. Virtual slides are too large to be loaded into a computer’s memory, so careful data organization is required to support direct access to any portion of a virtual slide, at any level of zoom.
- Incremental remote access. Sharing slide data over computer networks is a key benefit of automating pathology (“telepathology”). Because virtual slides are too large to copy easily, the system must enable incremental transmission of slide data for remote access.
- Database storage for metadata. Each pathology specimen has critical metadata associated with it: patient information, case notes, preparation notes, etc. These data must be stored in a database associated with the virtual slide images to enable rapid searching and maintenance. Annotation capabilities are likewise essential; annotations become part of the medical record associated with an image and must be managed by the database.
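One common way to meet the panning, zooming, and incremental access requirements is a tiled, multi-resolution “pyramid”. The sketch below illustrates that general technique and is not a description of any particular product’s file layout. The function names, the 256-pixel tile size, and the halving between levels are all assumptions.

```python
import math

def pyramid_levels(width_px, height_px, tile_px=256):
    """Return (width, height) for each zoom level, halving the image
    until the whole slide fits inside a single tile."""
    levels = [(width_px, height_px)]
    while max(levels[-1]) > tile_px:
        w, h = levels[-1]
        levels.append((math.ceil(w / 2), math.ceil(h / 2)))
    return levels

def tiles_for_viewport(x, y, view_w, view_h, tile_px=256):
    """Return (col, row) indices of the tiles that overlap a viewport.

    x, y, view_w, and view_h are in pixels of the chosen zoom level.
    """
    first_col, first_row = x // tile_px, y // tile_px
    last_col = (x + view_w - 1) // tile_px
    last_row = (y + view_h - 1) // tile_px
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]

levels = pyramid_levels(80000, 60000)
print(len(levels), levels[-1])   # prints: 10 (157, 118)

needed = tiles_for_viewport(1000, 500, 1024, 768)
print(len(needed))               # prints: 20
```

With this organization a viewer showing a 1024 x 768 window needs only about 20 tiles at any zoom level, so only those tiles have to be read from disk or sent over a network, never the whole image.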
Third, automating pathology entails automating slide analysis. There are several important technical obstacles to doing so:
- Processing time. As noted, virtual slide images are very large. Computer analysis of very large image files must be performed efficiently and in a distributed fashion.
- Visualization of results. Slide analysis results must be displayed in a fashion which facilitates interpretation by pathologists and aids their diagnosis.
- Pattern recognition. Sophisticated pattern recognition techniques are needed to identify “things which are ‘different’” as well as “things similar to things seen before”. The usefulness of automated analysis depends directly on the accuracy of pattern matching.
Aperio Technologies has developed solutions for each of the technical obstacles described above, enabling – for the first time – the automation of pathology.
Digitize slide data
Aperio has developed the ScanScope®, a revolutionary device which can digitize entire slides at high resolution in minutes. It is based on patent-pending methods that combine a linear array detector with high-performance opto-mechanics. The ScanScope uses the same type of camera used in satellite photography; the scanner essentially “flies over” the slide, acquiring linear stripes of image data. The optical focus is automatically adjusted hundreds of times per second, yielding accurate focus over the “roughest” specimen terrain. The ScanScope is much faster than “tiling” systems, and crucially the resulting images have no optical artifacts.
Manage “virtual slide” images
Aperio’s platform software is a patent-pending “operating system” for virtual microscopy. It controls the ScanScope hardware, compresses virtual slides into a standard image file format, stores virtual slides in a standard database format, coordinates remote viewing of virtual slides, and supports virtual slide analysis. Aperio has devised a file format which supports JPEG 2000 wavelet encoding of image data, yielding a 20:1 reduction in file size. This format supports rapid panning and zooming to any portion of the virtual slide. Aperio has also created a network transport mechanism which transmits slide data incrementally and efficiently over wide-area networks, enabling remote viewing and analysis. A standard SQL database stores slide metadata, including annotations and automated analysis results.
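As an illustration of keeping slide metadata and annotations in a standard SQL database, here is a minimal sketch using Python’s built-in sqlite3 module. The schema, table names, column names, and file path are hypothetical and chosen only for the example; they are not the schema of the platform software.

```python
import sqlite3

# Hypothetical schema: one row per virtual slide, many annotations per slide.
SCHEMA = """
CREATE TABLE slides (
    slide_id   INTEGER PRIMARY KEY,
    image_path TEXT NOT NULL,
    case_notes TEXT
);
CREATE TABLE annotations (
    annotation_id INTEGER PRIMARY KEY,
    slide_id      INTEGER NOT NULL REFERENCES slides(slide_id),
    x_px INTEGER, y_px INTEGER, width_px INTEGER, height_px INTEGER,
    source TEXT NOT NULL,   -- 'pathologist' or 'algorithm'
    note   TEXT
);
"""

def open_database(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def add_slide(conn, image_path, case_notes):
    cur = conn.execute(
        "INSERT INTO slides (image_path, case_notes) VALUES (?, ?)",
        (image_path, case_notes))
    return cur.lastrowid

def add_annotation(conn, slide_id, rect, source, note):
    x, y, w, h = rect
    conn.execute(
        "INSERT INTO annotations "
        "(slide_id, x_px, y_px, width_px, height_px, source, note) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (slide_id, x, y, w, h, source, note))

def annotations_for(conn, slide_id):
    return conn.execute(
        "SELECT source, note FROM annotations "
        "WHERE slide_id = ? ORDER BY annotation_id",
        (slide_id,)).fetchall()

conn = open_database()
slide_id = add_slide(conn, "slides/example.img", "hypothetical case notes")
add_annotation(conn, slide_id, (1000, 500, 200, 200), "pathologist", "region of interest")
add_annotation(conn, slide_id, (4000, 2500, 50, 50), "algorithm", "candidate rare event")
print(annotations_for(conn, slide_id))
```

Storing manual annotations and automated results in the same table, distinguished by a source column, is one simple way to keep both with the record for a slide.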
Automate slide analysis
Aperio has developed a simple yet sophisticated architecture for automated analysis. The “algorithm framework” includes facilities for incremental access to slide data at any level of zoom, parameter processing, progress and result reporting, and distributed processing (“grid computing” for slide analysis). The results of algorithmic analysis are displayed as “overlays” on slide images, making interpretation straightforward and facilitating pathological diagnosis.
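The general shape of such a framework can be sketched in a few dozen lines. The example below is a simplified illustration, not Aperio’s implementation: it splits a slide into regions, analyzes them in parallel with a thread pool on one machine (standing in for distribution across a grid), reports progress, and returns per-region results that a viewer could draw as overlays. All names, the threshold “algorithm”, and the fake pixel source are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def split_into_regions(width_px, height_px, region_px):
    """Yield (x, y, w, h) rectangles that cover the slide without overlap."""
    for y in range(0, height_px, region_px):
        for x in range(0, width_px, region_px):
            yield (x, y, min(region_px, width_px - x), min(region_px, height_px - y))

def count_events(region, read_region, threshold):
    """Hypothetical analysis step: count pixels brighter than a threshold."""
    pixels = read_region(*region)
    return region, sum(1 for value in pixels if value > threshold)

def analyze_slide(width_px, height_px, read_region, threshold=200,
                  region_px=1000, workers=4, on_progress=None):
    """Analyze every region in parallel; return {region: event_count}."""
    regions = list(split_into_regions(width_px, height_px, region_px))
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(count_events, region, read_region, threshold)
                   for region in regions]
        for done, future in enumerate(as_completed(futures), start=1):
            region, count = future.result()
            results[region] = count
            if on_progress:
                on_progress(done, len(regions))
    return results

def fake_read_region(x, y, w, h):
    """Stand-in for incremental slide access: one bright pixel per region."""
    return [255] + [0] * (w * h - 1)

progress = []
results = analyze_slide(250, 150, fake_read_region, region_px=100,
                        on_progress=lambda done, total: progress.append((done, total)))
print(len(results), sum(results.values()))   # prints: 6 6
```

Because each region is read and analyzed independently, the same structure extends from threads on one computer to workers on many computers.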
Aperio has developed several end-user applications which build on this framework to provide computer-aided-diagnosis:
- TMALab - specialized software for the analysis of tissue microarrays (TMAs).
- Rare event detection algorithms for finding metastasizing cancer cells in blood, and locating metaphase-spreads in amniocentesis and bone-marrow specimens.
- Image analysis algorithms for quantitative analysis of immunohistochemistry (IHC) assays, including HER2 expression for breast cancer.
- Novel pattern recognition algorithms based on featureless heuristics. These algorithms support content-based image retrieval and can quickly detect “known” patterns as well as flag “new” patterns.
Each of these solutions overcomes serious technical obstacles in novel ways. Automating pathology is a clear opportunity, and Aperio has the technology to make it possible.