Modelling the susceptibility of Pseudomonas aeruginosa to phages at the strain level
Cédric Lood 1*, Emma Verkinderen 1, Stefaan Verwimp 1, Alicja Ochman 1, Alexandra Petrovic Fabijan 3, Pieter-Jan Haas 4, Jonathan Iredell 3, Pieter-Jan Ceyssens 5, Vera van Noort 1,6, Rob Lavigne 1
1. Department of Biosystems, KU Leuven, Leuven, Belgium
2. Department of Microbial and Molecular Systems, KU Leuven, Leuven, Belgium
3. Centre for Infectious Diseases and Microbiology, Westmead Institute for Medical Research, Sydney, New South Wales, Australia
4. Department of Medical Microbiology, University Medical Centre Utrecht, Utrecht, The Netherlands
5. Division of Human Bacterial Diseases, Sciensano, Brussels, Belgium
6. Institute of Biology, University of Leiden, Leiden, The Netherlands
Cédric Lood
Pseudomonas aeruginosa is an opportunistic pathogen that we use as a model system to study interactions with bacteriophages. From a genomics perspective, the P. aeruginosa population is highly diverse, with strains differing in the accessory genetic components they harbor. These accessory regions include determinants of host-virus interactions, such as defense systems, membrane receptors, and prophages, that together shape the susceptibility of individual strains to various phages. Our aim is to build models of phage susceptibility that can i) guide the rapid selection of specific virus isolates from phage banks to target given strains; and ii) reveal determinants of phage-bacteria interactions through data mining. We performed large activity screens of our phages against hundreds of P. aeruginosa isolates. The resulting interaction dataset provides thousands of positive and negative interactions, while the genomes of the isolates supply the features used to train predictors with machine learning techniques. Our models can be broadly split into white-box models, which allow introspection and biological interpretation, and black-box models, which make phage selection more efficient at the cost of interpretability. We show that the population structure of P. aeruginosa strongly informs the distribution of the phage-bacteria interaction determinants linked to susceptibility. The models we propose yield insights on two fronts: first, we answer an operational question by showing how phages from banks can be ranked by their likelihood of being effective against a given strain; second, we highlight how biological insights can be gained from data-driven analyses.
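To make the ranking idea concrete, the following is a minimal sketch of a white-box approach. It is not the models used in this work. It assumes one logistic regression per phage, trained on binary presence/absence features of accessory genes. The feature names (`receptor_A`, `defense_system_B`, `prophage_C`), the phage names, and the toy interaction data are all hypothetical and serve only to illustrate how candidate phages could be ranked for a new strain and how model weights could be inspected.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression by batch gradient descent; returns (weights, bias)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, x):
    """Predicted probability that the phage is active against a strain."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def rank_phages(models, strain_features):
    """Sort phages by decreasing predicted probability of activity."""
    scores = {name: predict_proba(m, strain_features) for name, m in models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical accessory-genome features (1 = present, 0 = absent).
FEATURES = ["receptor_A", "defense_system_B", "prophage_C"]

# Toy strains, one row per isolate (illustrative data, not real screens).
STRAINS = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 1],
    [0, 0, 0],
]

# Toy interaction outcomes per phage (1 = active, 0 = inactive).
INTERACTIONS = {
    "phage_1": [1, 0, 0, 0, 1, 0],  # needs receptor_A, blocked by defense_system_B
    "phage_2": [1, 1, 0, 0, 0, 1],  # blocked by prophage_C
}

# One interpretable model per phage.
models = {name: fit_logistic(STRAINS, y) for name, y in INTERACTIONS.items()}

# Rank the phages for a new strain carrying receptor_A and prophage_C.
new_strain = [1, 0, 1]
ranking = rank_phages(models, new_strain)
for name, prob in ranking:
    print(f"{name}: {prob:.2f}")

# White-box introspection: the sign of each weight hints at the role of a feature.
for feature, weight in zip(FEATURES, models["phage_1"][0]):
    print(f"phage_1 weight for {feature}: {weight:+.2f}")
```

In this toy setting, a positive weight suggests a feature that favors infection (for example a receptor), while a negative weight suggests a feature that hinders it (for example a defense system). A black-box model could replace `fit_logistic` behind the same `rank_phages` interface if ranking accuracy matters more than interpretation.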