University of Toronto, Electrical and Computer Engineering, Canada


Oren Kraus


Computer-vision technologies have improved dramatically since the popularization of deep learning. For this challenge I developed a lesion segmentation method based on an ensemble of fully convolutional neural networks trained at different scales. The main difficulty was the high dimensionality of the images, especially at downsample levels below 32x. To make use of the higher-magnification images, I developed a method for sampling relevant regions at evaluation time. I built a dataset by cropping regions near and away from positive ground-truth areas and trained separate fully convolutional neural networks for downsample levels 32x, 16x, 8x and 4x. Each network is trained to predict the ground-truth mask in the cropped region. The networks consist of convolutional, pooling, and deconvolutional layers with rectified linear activations and batch normalization. Because evaluating entire slides at downsample levels below 32x was too expensive, the coarsest model (32x) is used to evaluate the whole slide and to propose regions for the other models; regions at the lower downsample levels are then evaluated dynamically based on the coarsest model's predictions. Final predictions were generated either by averaging the predictions from each scale or by an additional fully convolutional network trained to aggregate the predictions from the different scales. Finally, a single coordinate per detected lesion was chosen by smoothing the final prediction map with a Gaussian filter and extracting the coordinate with the highest probability. For the whole-slide classification task I trained logistic regression models on measurements extracted from the detected lesions. The model using the second (aggregating) fully convolutional network achieves an AUC of 0.93 on whole-slide classification and a FROC score of 0.70 on lesion detection and localization on the validation set.
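The coordinate-extraction step described above can be illustrated with a minimal NumPy sketch. It is not the submitted implementation: the function names, the Gaussian width `sigma`, the detection `threshold`, and the use of 4-connected components to define a "detected lesion" are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def gaussian_smooth(prob_map, sigma):
    """Separable Gaussian smoothing with zero padding (NumPy only).

    Assumes each map dimension is at least as long as the kernel.
    """
    radius = int(3.0 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, prob_map)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, rows)


def lesion_peaks(prob_map, sigma=1.0, threshold=0.5):
    """Return one (row, col, probability) tuple per detected lesion.

    A "lesion" here is a 4-connected region of the smoothed map above
    `threshold` (an assumed definition); the reported coordinate is the
    location of the highest smoothed probability inside that region.
    """
    smoothed = gaussian_smooth(np.asarray(prob_map, dtype=float), sigma)
    mask = smoothed > threshold
    visited = np.zeros(mask.shape, dtype=bool)
    height, width = mask.shape
    peaks = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if visited[r0, c0]:
            continue
        # Breadth-first flood fill over one connected region.
        queue = deque([(r0, c0)])
        visited[r0, c0] = True
        best = (smoothed[r0, c0], r0, c0)
        while queue:
            r, c = queue.popleft()
            if smoothed[r, c] > best[0]:
                best = (smoothed[r, c], r, c)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < height and 0 <= cc < width
                        and mask[rr, cc] and not visited[rr, cc]):
                    visited[rr, cc] = True
                    queue.append((rr, cc))
        peaks.append((int(best[1]), int(best[2]), float(best[0])))
    return peaks


# Toy prediction map with two synthetic blobs (illustrative values only).
toy_map = np.zeros((20, 20))
toy_map[4:7, 4:7] = 1.0
toy_map[13:16, 11:14] = 1.0
toy_peaks = lesion_peaks(toy_map, sigma=1.0, threshold=0.5)
```

On the toy map each blob yields a single coordinate at its centre. In practice the smoothing width and threshold would need to be tuned on validation data.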


The following figure shows the receiver operating characteristic (ROC) curve of the method.

The following figure shows the free-response receiver operating characteristic (FROC) curve of the method.

The table below presents the average sensitivity of the developed system at 6 predefined false positive rates: 1/4, 1/2, 1, 2, 4, and 8 FPs per whole slide image.

FPs/WSI 1/4 1/2 1 2 4 8
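
As a worked illustration of how a single FROC score relates to the six operating points listed above, the sketch below assumes the score is the mean of the sensitivities at 1/4, 1/2, 1, 2, 4, and 8 false positives per whole slide image, read off the FROC curve by linear interpolation. The function name and the example curve are hypothetical and are not results of the method.

```python
import numpy as np

# The six predefined false-positive rates (FPs per whole slide image).
FP_RATES = (0.25, 0.5, 1.0, 2.0, 4.0, 8.0)


def froc_score(fps_per_slide, sensitivity, fp_rates=FP_RATES):
    """Mean sensitivity at the given FP rates.

    `fps_per_slide` must be sorted in increasing order; sensitivities at
    the requested rates are obtained by linear interpolation.
    """
    fps = np.asarray(fps_per_slide, dtype=float)
    sens = np.asarray(sensitivity, dtype=float)
    return float(np.mean(np.interp(fp_rates, fps, sens)))


# Made-up FROC curve, for illustration only.
example_fps = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
example_sens = [0.0, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
example_score = froc_score(example_fps, example_sens)
```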