Context and problem statement
The biostatistics R&D department of a world-renowned pharmaceutical company has found existing cancer classification methods inefficient. The business also needs to be able to interpret the results obtained.
Goals
- Prove the added value of Deep Learning for classifying medical images, even on small samples (>2000 images)
- Provide an interpretable model: the business must be able to understand the model's decisions
- Classify the images into 4 cancer categories: Lung, Breast, Prostate and Colon
Our intervention
1 Data Scientist and 1 Data Engineer working in Scrum
- Creation and cleaning of the dataset
- Data preparation (normalization, on-the-fly data augmentation, etc.)
- Modeling using a Convolutional Neural Network (CNN) and more specifically a ResNet model with transfer learning and progressive resizing
- Model interpretability (confusion matrix, accuracy, precision, recall, heatmap, etc.)
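
The metrics listed in the last step can be illustrated with a minimal sketch. It is not the project's code: the function names (confusion_matrix, per_class_metrics, accuracy) and the toy labels are assumptions made up for this example, and only the four class names come from the goals above. It shows how accuracy and per-class precision and recall all derive from a single confusion matrix.

```python
# Illustrative sketch only: helper names and toy labels are hypothetical,
# not taken from the project described above.
CLASSES = ["Lung", "Breast", "Prostate", "Colon"]


def confusion_matrix(y_true, y_pred, classes):
    """Build a matrix whose rows are true classes and columns predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, classes):
    """Precision and recall for each class, read off the confusion matrix."""
    metrics = {}
    for i, name in enumerate(classes):
        true_positives = matrix[i][i]
        predicted_as_class = sum(row[i] for row in matrix)  # column sum
        actually_class = sum(matrix[i])                     # row sum
        precision = true_positives / predicted_as_class if predicted_as_class else 0.0
        recall = true_positives / actually_class if actually_class else 0.0
        metrics[name] = {"precision": precision, "recall": recall}
    return metrics


# Invented toy labels, for illustration only.
y_true = ["Lung", "Lung", "Breast", "Breast", "Prostate", "Colon", "Colon", "Colon"]
y_pred = ["Lung", "Breast", "Breast", "Breast", "Prostate", "Colon", "Colon", "Lung"]

matrix = confusion_matrix(y_true, y_pred, CLASSES)
overall_accuracy = accuracy(matrix)
metrics = per_class_metrics(matrix, CLASSES)
```

Reading the matrix row by row shows which cancer types are confused with each other, which a single accuracy figure hides. In this toy data, one Lung image is predicted as Breast and one Colon image as Lung.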
Results
Technical environment
Python, JupyterLab, PyTorch, fastai, Plotly
Azure
REST API
NetworkX
Dash
Scrum