Evergreen 2023
Keywords: annotation, genome, bioinformatics

Enhancing phage therapy safety: Reliable and sensitive phage genome annotation with rTOOLS2 high-throughput pipeline

Abstract ID: 87-QZ

Antoine Culot 1*, Guillaume Abriat 1

  1. Rime Bioinformatics SAS

Phage therapy is an exciting and promising approach to fighting bacterial infections. However, ensuring the safety and efficacy of phages for therapeutic use requires a thorough understanding of their genomic properties. Genome annotation enables the detection of genes that make phages potentially harmful for the subject of the therapy or the environment, such as antibiotic resistance, lysogeny, and virulence genes.

Traditional bioinformatics tools designed for bacterial genomes are not well suited to phage genomes, whose unique structure leads to poor gene calling and function annotation. Recently, phage-focused tools such as Pharokka and rTOOLS2 have been released. rTOOLS2 is a multi-hypothesis, phage-focused annotation pipeline: its advanced algorithm uses the output of widely used annotation tools to find more gene functions, and applies high evidence thresholds to avoid false positives.

In this study, 135 phage genomes published in GenBank were annotated using Pharokka and the high-throughput version of rTOOLS2, and the results were compared.

Pharokka improved on the published annotations: the average proportion of functionally annotated genes rose from 29.5% to 35.9%. The high-throughput version of rTOOLS2 increased this proportion markedly further, reaching 54.6%.
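The comparison above rests on one metric: the share of genes per genome that carry a functional annotation, averaged over the genome set. The Python sketch below shows one way such a rate could be computed. It is an illustration under stated assumptions, not the method used in the study: the list of labels counted as uninformative, the function names, and the choice to average per-genome fractions (rather than pooling all genes) are hypothetical, and the toy input is made up. The last lines apply simple arithmetic to the rates reported above.

```python
from statistics import mean

# Assumption: product labels that count as "no functional annotation".
# The abstract does not state the criterion actually used.
UNINFORMATIVE = {"hypothetical protein", "unknown function", "putative protein", ""}


def annotated_fraction(products):
    """Return the share of genes whose product label is informative."""
    if not products:
        raise ValueError("genome has no gene calls")
    informative = [p for p in products if p.strip().lower() not in UNINFORMATIVE]
    return len(informative) / len(products)


def mean_annotated_percent(genomes):
    """Average the per-genome fractions and express the result in percent."""
    return 100 * mean(annotated_fraction(p) for p in genomes.values())


# Toy input with made-up labels (not data from the study).
toy = {
    "phage_A": ["terminase large subunit", "hypothetical protein",
                "hypothetical protein", "endolysin"],
    "phage_B": ["hypothetical protein", "major capsid protein",
                "hypothetical protein", "hypothetical protein"],
}
toy_percent = mean_annotated_percent(toy)  # (50% + 25%) / 2

# Fold changes computed from the rates reported in the abstract.
published, pharokka, rtools2 = 29.5, 35.9, 54.6
fold_pharokka = pharokka / published
fold_rtools2 = rtools2 / published
```

With the reported rates, the ratio of the rTOOLS2 rate to the published rate is about 1.85, and the ratio for Pharokka is about 1.22.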

To promote the safest possible use of phages for patients and for the environment, it is key to use thoroughly characterized phages. The high-throughput version of rTOOLS2 can rapidly provide a strong basis for genome characterization. The use of curated databases ensures that meaningful annotations are provided, so results can be published with a low risk of poisoning public databases. Moreover, rTOOLS2 yields more information: it nearly doubled the proportion of annotated genes relative to the initially published genomes (from 29.5% to 54.6%).