EvenSampling

I was seconded to the Wellcome Sanger Institute at a time when almost every positive SARS-CoV-2 sample was being sent there for potential genomic surveillance. Sequencing capacity was capped at a certain number of samples per day, and I was tasked with developing an algorithm which would select plates for sequencing in order to:

  • provide geographical coverage across local authorities such that the number of samples sequenced was proportional to the number of cases (despite sometimes biased arrival of samples to the lab)
  • prioritise samples in key government priority areas such as those from care homes, and those from people arriving from abroad

I used OR-tools to find the optimal plate selection according to these priorities and constraints. The algorithm, called evensampling was implemented in Python and was run daily with sample manifests from the latest samples to select the plates for cherry-picking each day. This algorithm was responsible for selecting millions of samples during the pandemic, and evolved in response to changing priorities. We showed that the balance of samples selected improved substantially after its implementation.

Theo Sanderson
Theo Sanderson
Assistant Professor

Biologist developing tools to scale pathogen genetics.

Related