Sorbonne Universites, UPMC Univ Paris 06, CNRS, INSERM, Laboratoire Imagerie Biomedicale (LIB), Paris, France

Authors:

R. Venâncio, B. Ben Cheikh, A. Coron, and D. Racoceanu

Abstract:

Our method divides each 40x whole-slide image (WSI) into several non-overlapping 800x800 pixel sub-images, referred to as high-power fields (HPFs). Each HPF is classified as cancerous or non-cancerous using a supervised learning algorithm, namely a support vector machine (SVM). The use of an SVM divides the method into two parts: training and validation. The training step consists of automatically detecting the lymph node (LN) sections and creating the corresponding tissue mask for these sections. The masked tissue is then divided into HPFs; from each HPF, 9 derived images were obtained and statistical features were extracted, based on gray-level co-occurrence matrices (GLCM) and Laws' texture energy measures. Because of the large number of features generated, a method based on sequential forward selection (SFS) was used to reduce this number, keeping only a few non-redundant features. Because processing all of the roughly 55,000 HPFs would be computationally expensive, a subset of 2230 HPFs (1100 positive, 1130 negative) was randomly selected. Using 10-fold cross-validation on these 2230 HPFs with the features selected by SFS, we obtained an F-score of 0.838±0.024. Finally, a classifier was trained. In the validation step, we proceeded in the same way as in the training step for the detection of LN sections and the extraction of the HPFs. Each HPF was then classified using the previously trained classifier, which gave a classification confidence degree for each HPF. In the first evaluation, a threshold on the number of positive HPFs was specified for considering an LN cancerous. If the threshold is passed, the mean of the probabilities of the positive HPFs is calculated. In the second evaluation, the HPF with the greatest confidence degree is selected and the X and Y coordinates of its central point are provided.
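The tiling and the two slide-level evaluations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The names `HPF`, `tile_origins`, `slide_score`, and `top_location` are hypothetical, as are the 0.5 probability cut-off for calling an HPF positive, the reading of "passed" as "greater than or equal to", and the score of 0 returned when the threshold is not passed; the abstract does not specify any of these.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

HPF_SIZE = 800  # HPF side length in pixels, as stated in the abstract


@dataclass
class HPF:
    x: int       # left edge of the HPF in WSI pixel coordinates
    y: int       # top edge of the HPF in WSI pixel coordinates
    prob: float  # classifier confidence that the HPF is cancerous


def tile_origins(width: int, height: int,
                 size: int = HPF_SIZE) -> List[Tuple[int, int]]:
    """Top-left corners of non-overlapping tiles that fit entirely in the image."""
    return [(x, y)
            for y in range(0, height - size + 1, size)
            for x in range(0, width - size + 1, size)]


def slide_score(hpfs: List[HPF], min_positive: int,
                prob_cutoff: float = 0.5) -> float:
    """First evaluation: mean probability of the positive HPFs.

    Assumption: an HPF is positive when prob >= prob_cutoff, and the
    slide scores 0.0 when fewer than min_positive HPFs are positive.
    """
    positives = [h.prob for h in hpfs if h.prob >= prob_cutoff]
    if len(positives) < min_positive:
        return 0.0
    return sum(positives) / len(positives)


def top_location(hpfs: List[HPF]) -> Optional[Tuple[int, int, float]]:
    """Second evaluation: centre (x, y) and confidence of the most confident HPF."""
    if not hpfs:
        return None
    best = max(hpfs, key=lambda h: h.prob)
    return best.x + HPF_SIZE // 2, best.y + HPF_SIZE // 2, best.prob


if __name__ == "__main__":
    # Made-up confidences for three adjacent HPFs, for illustration only.
    demo = [HPF(0, 0, 0.9), HPF(800, 0, 0.7), HPF(1600, 0, 0.2)]
    print(slide_score(demo, min_positive=2))
    print(top_location(demo))
```

In this sketch, tiles that would extend past the image border are dropped; whether the original method pads or discards partial tiles is not stated.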

Results:

The following figure shows the receiver operating characteristic (ROC) curve of the method.

The following figure shows the free-response receiver operating characteristic (FROC) curve of the method.

The table below presents the average sensitivity of the developed system at 6 predefined false-positive rates: 1/4, 1/2, 1, 2, 4, and 8 false positives (FPs) per WSI.

FPs/WSI      1/4    1/2    1      2      4      8
Sensitivity  0.031  0.044  0.071  0.133  0.191  0.249
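
If a single summary figure is wanted, one common convention for FROC analysis is to average the sensitivities over the predefined false-positive rates. The short sketch below does this for the six values in the table. The variable names are illustrative, and treating this mean as the overall score for this submission is an assumption, not something stated above.

```python
# Values copied from the table above.
fps_per_wsi = [0.25, 0.5, 1, 2, 4, 8]
sensitivity = [0.031, 0.044, 0.071, 0.133, 0.191, 0.249]

# Unweighted mean of the six sensitivities.
mean_sensitivity = sum(sensitivity) / len(sensitivity)

print(f"{mean_sensitivity:.3f}")  # prints 0.120
```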