gilbertella/semar_wp1

About the project

This R project supports the analysis of sewage-based antibiotic resistance surveillance data generated using two approaches: an isolate-based approach and a gene-based metagenomic approach. The overall aim of the project is to directly compare, combine, and evaluate these approaches to determine how well they reflect clinical antibiotic resistance rates in Escherichia coli.

The study uses municipal sewage samples collected across ten European countries. From these samples, antibiotic resistance was assessed through two strategies. First, an isolate-based approach was applied using susceptibility testing of collected E. coli isolates. Second, a gene-based approach was applied using metagenomic sequencing to quantify antibiotic resistance genes in sewage.

Data Sources

This project uses data from three main sources:

Isolate-based sewage surveillance data

Antibiotic susceptibility testing results from E. coli isolates recovered from municipal sewage samples.

Gene-based sewage surveillance data

Metagenomic sequencing outputs describing the abundance and distribution of antibiotic resistance genes in sewage samples.

Clinical antimicrobial resistance data

Country-level clinical resistance prevalence estimates for E. coli, covering aminopenicillins, fluoroquinolones, third-generation cephalosporins, and aminoglycosides.

Analysis Approach

The analysis is implemented in R and covers data cleaning, integration, statistical modelling, and visualization. The core modelling framework is beta regression, which suits outcomes that are proportions bounded between 0 and 1, such as resistance prevalence.
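As a minimal sketch of what a beta regression looks like with the betareg package listed under Usage, the example below fits clinical resistance prevalence against a single sewage-based indicator. The data frame, its column names (clinical_prev, sewage_prev), and all values are made up for illustration and are not taken from the project data. Beta regression requires the outcome to lie strictly between 0 and 1, so proportions of exactly 0 or 1 would need to be adjusted or handled separately.

```r
library(betareg)

# Hypothetical toy data: one row per country for a single antibiotic class.
toy <- data.frame(
  country       = paste0("C", 1:10),
  sewage_prev   = c(0.12, 0.18, 0.25, 0.31, 0.36, 0.42, 0.47, 0.55, 0.61, 0.68),
  clinical_prev = c(0.10, 0.21, 0.22, 0.35, 0.33, 0.45, 0.52, 0.50, 0.66, 0.64)
)

# The outcome must lie strictly inside (0, 1) for a beta likelihood.
stopifnot(all(toy$clinical_prev > 0 & toy$clinical_prev < 1))

# Mean model with the default logit link and a constant precision parameter.
fit <- betareg(clinical_prev ~ sewage_prev, data = toy)

summary(fit)

# Predicted clinical prevalence on the response (proportion) scale.
predict(fit, newdata = data.frame(sewage_prev = c(0.2, 0.5)), type = "response")
```

Because of the logit link, the sewage_prev coefficient is on the log-odds scale; predictions with type = "response" convert back to proportions.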

The workflow includes:

Importing and cleaning sewage isolate, metagenomic, and clinical datasets.
Aggregating resistance indicators by country, sample, and antibiotic class.
Matching sewage-derived indicators to clinical resistance outcomes for the same country and antibiotic class.
Fitting beta regression models to quantify associations between sewage-based indicators and clinical resistance prevalence.
Comparing model performance across isolate-based, gene-based, and combined approaches.
Generating figures and summary tables for interpretation and reporting.
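The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in analysis_file.Rmd: all data frames, column names, and values are hypothetical, and base R's aggregate() and merge() stand in for whatever tidyverse verbs the project actually uses. The candidate models are compared here by AIC and pseudo R-squared, which is one common choice; the project may rely on other criteria.

```r
library(betareg)

countries <- paste0("C", 1:10)

# Hypothetical isolate-based data: two sewage samples per country.
isolates <- data.frame(
  country     = rep(countries, each = 2),
  sample      = rep(c("s1", "s2"), times = 10),
  abx_class   = "fluoroquinolones",
  n_tested    = 20,
  n_resistant = c(2, 3, 3, 4, 4, 5, 6, 5, 7, 8, 8, 9, 10, 9, 11, 12, 12, 13, 14, 13)
)

# Hypothetical gene-based data: resistance gene abundance (arbitrary units).
genes <- data.frame(
  country       = rep(countries, each = 2),
  sample        = rep(c("s1", "s2"), times = 10),
  abx_class     = "fluoroquinolones",
  arg_abundance = c(0.8, 1.0, 1.3, 1.1, 1.2, 1.6, 1.9, 1.5, 2.0, 2.4,
                    2.1, 2.3, 2.9, 2.5, 2.8, 3.2, 3.6, 3.0, 3.3, 3.7)
)

# Hypothetical country-level clinical resistance prevalence.
clinical <- data.frame(
  country       = countries,
  abx_class     = "fluoroquinolones",
  clinical_prev = c(0.14, 0.12, 0.27, 0.24, 0.33, 0.47, 0.43, 0.61, 0.57, 0.70)
)

# Aggregate sample-level indicators to country and antibiotic class.
iso_by_country <- aggregate(cbind(n_tested, n_resistant) ~ country + abx_class,
                            data = isolates, FUN = sum)
iso_by_country$isolate_prev <- iso_by_country$n_resistant / iso_by_country$n_tested
gene_by_country <- aggregate(arg_abundance ~ country + abx_class,
                             data = genes, FUN = mean)

# Match sewage indicators to clinical outcomes on country and class.
keys <- c("country", "abx_class")
matched <- merge(merge(iso_by_country, gene_by_country, by = keys), clinical, by = keys)

# Fit isolate-based, gene-based, and combined beta regression models.
fits <- list(
  isolate  = betareg(clinical_prev ~ isolate_prev, data = matched),
  gene     = betareg(clinical_prev ~ arg_abundance, data = matched),
  combined = betareg(clinical_prev ~ isolate_prev + arg_abundance, data = matched)
)

# Compare the three models (lower AIC indicates a better trade-off).
comparison <- data.frame(
  model     = names(fits),
  AIC       = sapply(fits, AIC),
  pseudo_r2 = sapply(fits, function(f) f$pseudo.r.squared)
)
comparison[order(comparison$AIC), ]
```

With real data, the same pattern would be repeated for each antibiotic class, either by fitting one model per class or by including the class as a covariate.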

Software

R (version 4.3 or later)

Usage

To rerun this analysis, follow the steps below.

Clone the repository

git clone https://github.com/gilbertella/semar_wp1.git

Install the following R packages if they are not already installed, then load them

library(ggplot2)
library(pals)
library(knitr)
library(cowplot)
library(betareg)
library(tidyverse)

Open analysis_file.Rmd and run all chunks
View the outputs in the tables and figures folders
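The package list in the steps above uses library() calls, which load packages but do not install them, so a small check like the following may help on a fresh R installation. It is a convenience sketch rather than part of the repository: the helper name missing_packages is made up here, and the package names are copied from the list above.

```r
# Packages loaded by the analysis, as listed in this README.
required <- c("ggplot2", "pals", "knitr", "cowplot", "betareg", "tidyverse")

# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)

# Report what is missing; install.packages(to_install) would fetch it from CRAN.
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Running install.packages(to_install) once after this check should be enough before opening the analysis file.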

Repository contents

data
README.md
analysis_file.Rmd
analysis_file.nb.html
load_the_data.R
load_the_data.Rmd
semar_wp1.Rproj
