Speaker
Mikhail Pogorelyy
Description
High-throughput sequencing of antigen receptor repertoires (RepSeq)
can capture millions of TCR/BCR sequences per
sample. However, our ability to extract clinically relevant information
from repertoire sequencing data is still limited. Here we present three
computational approaches to identify clonotypes associated with
vaccination, infection, cancer, or autoimmune disease from longitudinal
RepSeq data (several time points after treatment for one donor),
population RepSeq data (repertoires from a patient cohort), and single
repertoire samples. First, we present a statistical model that detects clonal
expansion by analyzing TCR cDNA count data from quantitative
repertoire sequencing. We applied this model to TCRbeta repertoires of
three twin pairs after yellow fever immunization (YFV17D). We
identified 500-1500 expanded clonotypes in each donor and validated
them for YFV17D-specificity by three independent functional assays.
Second, we describe an algorithm to identify disease-specific
clonotypes using repertoires from cohorts of patients. A stochastic
model of TCR recombination is used to identify clonotypes shared
among more patients than one would expect by chance. Such extensive
sharing can only be explained by clonal selection driven by the same
antigens in the periphery. Using this model, we
identified known public cytomegalovirus-associated and ankylosing
spondylitis-associated clonotypes in the respective patient cohort
repertoires. Third, we extended this approach from the population level
to single samples. We clustered TCRbeta clonotypes by significant
sequence convergence, which is estimated and weighted from a
stochastic TCR generation model. We show that the identified clusters
are almost absent before YF immunization, abundant after it, and
consist predominantly of YF-specific clones. These three
approaches allow the identification of disease-specific TCR variants
using sequencing data only.
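
The cohort-sharing idea in the second approach can be illustrated with a minimal sketch. It is not the model presented in the talk, only a simplified stand-in under stated assumptions: each clonotype's generation probability (p_gen) is taken as already computed by some recombination model, each donor's repertoire is treated as independent draws, and a clonotype is flagged when the number of donors carrying it is improbably high under chance alone. The function names and parameters are hypothetical.

```python
import math


def presence_probability(p_gen, n_clonotypes):
    """Chance that a clonotype with generation probability p_gen appears
    at least once among n_clonotypes independent draws (assumed model)."""
    # 1 - (1 - p_gen) ** n, computed stably for very small p_gen
    return -math.expm1(n_clonotypes * math.log1p(-p_gen))


def sharing_pvalue(p_gen, repertoire_sizes, k_observed):
    """Probability that k_observed or more donors carry the clonotype
    by chance, given each donor's repertoire size."""
    probs = [presence_probability(p_gen, n) for n in repertoire_sizes]
    # dist[j] = probability that exactly j of the donors seen so far
    # carry the clonotype (Poisson-binomial, by dynamic programming)
    dist = [1.0]
    for p in probs:
        updated = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            updated[j] += mass * (1.0 - p)   # this donor lacks it
            updated[j + 1] += mass * p       # this donor carries it
        dist = updated
    return sum(dist[k_observed:])


if __name__ == "__main__":
    # Hypothetical numbers: 30 donors, 200,000 unique clonotypes each,
    # a clonotype with p_gen = 1e-7 observed in 8 donors.
    print(sharing_pvalue(1e-7, [200_000] * 30, 8))
```

A small p-value here means only that chance recombination alone is an unlikely explanation for the observed sharing; in practice the p-values would need correction for the number of clonotypes tested.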
Primary author
Mikhail Pogorelyy