ICYMI Recap: Detecting COVID-19 with Deep Learning and DarwinAI

Barrett Williams and Michael McCourt

This past week, Dr. Alex Wong and Dr. Mike McCourt discussed COVID-Net, a neural network designed to diagnose COVID-19 cases. Here are the main takeaways from their presentation:

  • The University of Waterloo collaborated with DarwinAI to design COVID-Net to address this important problem
  • SigOpt optimized the data augmentation and neural network parameters of COVID-Net, and this model outperformed industry-standard VGG-19 and ResNet-50 models
  • The COVIDx dataset is published to enable more chest X-ray submissions and to encourage region-specific training, tuning, and modification
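To make the data augmentation takeaway concrete, here is a minimal NumPy sketch of two parameterized manipulations, translation and brightness scaling. It is an illustration only, not the COVID-Net or SigOpt code: the function names, the `max_shift` and `brightness_delta` parameters, and the assumption that images are 2-D float arrays scaled to [0, 1] are all hypothetical. Parameters like these are the kind of values a tuning service could search over.

```python
import numpy as np

def translate(image, dx, dy):
    """Shift a 2-D image by (dx, dy) pixels; exposed pixels are filled with 0."""
    h, w = image.shape
    # Clamp shifts so the slices below never wrap around via negative indices.
    dx = int(np.clip(dx, -w, w))
    dy = int(np.clip(dy, -h, h))
    out = np.zeros_like(image)
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out

def adjust_brightness(image, factor):
    """Scale pixel intensities, assuming values live in [0, 1]."""
    return np.clip(image * factor, 0.0, 1.0)

def augment(image, rng, max_shift=2, brightness_delta=0.2):
    """Apply one random translation and one random brightness change.

    max_shift and brightness_delta are hypothetical tunable parameters.
    """
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    factor = rng.uniform(1.0 - brightness_delta, 1.0 + brightness_delta)
    return adjust_brightness(translate(image, dx, dy), factor)

# Synthetic stand-in for a chest X-ray; no real data is used here.
rng = np.random.default_rng(0)
image = np.arange(16, dtype=float).reshape(4, 4) / 15.0
augmented = augment(image, rng)
```

Each call to `augment` yields a slightly different copy of the same image, which is one way to stretch a small number of high-quality samples during training.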

And here is a more specific summary of the presentation. Click through to view any segment you missed:

  • Michael McCourt asked Dr. Alexander Wong about the relationship between DarwinAI and the University of Waterloo: Dr. Wong has been working on AI-assisted medical diagnostic workflows for the past two decades (2:55)
  • DarwinAI allows the VIP lab to address ML and modeling challenges in industry (4:42)
  • Chest radiography is faster than RT-PCR testing for assessing both diagnosis and severity during triage; COVID-Net aims to help physicians make better decisions faster (6:12)
  • The visual difference between standard pneumonia scans and COVID-19 scans is quite subtle, such as “ground glass” opacity (8:25)
  • Discussion of model design and feature design: regular feedback from physicians (10:26)
  • Introduction to DarwinAI’s GenSynth solution for model generation and explainability (15:20)
  • The COVIDx dataset consists of chest X-ray images and will be made available to the public (17:41)
  • Results of performance validation and confusion matrix: positive predictive value and sensitivity (avoid false positives, which are a burden for clinicians) (20:44)
  • Comparison of COVID-Net classification to that of unoptimized VGG-19 and ResNet-50 (21:56)
  • Q: What about human interpretation? Useful for confirmation and a second opinion. Comparison to radiologist assessments in the future (22:44)
  • Human-in-the-loop assessment is essential; radiologists are in extremely high demand given hospital COVID-19 caseloads in many countries (24:02)
  • Q: How large is the dataset? 13,000 chest X-ray images, but only 100 COVID-19 samples; there is geographic diversity (26:02)
  • Q: How do you choose images from public datasets? Quality and diversity, including from Mila and other reputable medical institutions (26:51)
  • Q: Is it possible to tune and choose a model architecture at the same time? Machine-driven model exploration and machine-driven optimization are both essential (28:12)
  • A discussion of explainability-driven analysis in GenSynth, to avoid “background cues” and other data issues (31:29)
  • McCourt: There was a need to focus on data augmentation, because the dataset has only a small number of high-quality COVID-19 samples (34:15)
  • Introduction to SigOpt and model optimization for data augmentation (35:09)
  • Manipulations through augmentation: zoom, translation, rotation, brightness, and trimming (36:19)
  • Collaboration with former SigOpt intern, Linda Wang, and current SigOpt software engineer Olivia Kim (36:31)
  • Sensitivity and positive predictive value (PPV) are competing metrics, and can be analyzed along a Pareto frontier (39:18)
  • Q: Will there be a domain shift problem, with data from different countries? India, for example? Or lower quality scans? (42:39)
  • There’s always the potential for domain shift, for example from different views or from a mix of portable and fixed devices; improvements in models and data diversity should mitigate these problems (43:25)
  • Q: Is it helpful to customize the model for a specific demographic, or train only on regional data? (45:21)
  • Dr. Wong: the goal is to make the model, dataset, and tooling as generalizable as possible, then enable researchers or physicians to customize them based on their needs (45:30)
  • GenSynth: generator iteratively generates new neural networks, and the inquisitor analyzes network behavior (50:29)
  • Goal for COVID-Net is to have a real-world ongoing impact (52:32)
  • Goal for SigOpt, going forward, is to determine what aspects of data augmentation are most helpful, and how they can be transferred to other fields, like additive manufacturing (54:33)
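The summary mentions sensitivity, positive predictive value (PPV), and a Pareto frontier between them. The sketch below shows how those quantities might be computed, using the standard definitions sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The counts and candidate scores are invented for illustration and are not results from the presentation, and the function names are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that the model catches."""
    return tp / (tp + fn)

def ppv(tp, fp):
    """Fraction of positive predictions that are correct."""
    return tp / (tp + fp)

def pareto_frontier(points):
    """Keep the (sensitivity, ppv) pairs that no other pair dominates.

    A pair is dominated if another pair is at least as good on both
    metrics and strictly better on at least one.
    """
    frontier = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            frontier.append(p)
    return frontier

# Hypothetical confusion-matrix counts for one model configuration.
sens = sensitivity(tp=90, fn=10)
prec = ppv(tp=90, fp=30)

# Hypothetical (sensitivity, PPV) scores for four tuned configurations.
candidates = [(0.80, 0.95), (0.90, 0.90), (0.85, 0.85), (0.95, 0.70)]
best = pareto_frontier(candidates)
```

In this made-up example, the configuration scoring (0.85, 0.85) drops out because (0.90, 0.90) beats it on both metrics; the remaining three represent different trade-offs between catching more cases and raising fewer false alarms.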

For more information, please check out the original COVID-Net preprint on arXiv. To contribute to the growing dataset, please navigate here.

If you joined us live, thank you for taking the time to watch. If you’d like to watch the recording you can find it here, and the slides here. If you’re interested in learning more, follow our blog or try our product. If you’re an academic and would like to use our product for free, please fill out this form. Full access to SigOpt is free for researchers investigating COVID-19 or its impact.

Barrett Williams Product Marketing Lead
Michael McCourt Research Engineer