Abstract:
T-cell diversity strongly influences the ability of the immune system
to recognise and fight the wide variety of potential pathogens in our environment. The current state-of-the-art approach to profiling T-cell diversity
involves high-throughput sequencing and analysis of T-cell receptors (TCR).
Although this approach produces huge amounts of data, the data is noisy,
which might obscure the underlying biological picture. To correct these
errors, two computational methods have been developed: a method of moments and a method based on Bayesian inference. Using simulated data, it
is shown that Bayesian inference is superior to the method of moments in
terms of accuracy, but the latter is preferable when time is a limiting factor,
as it is faster and adequately accurate. Furthermore, using high-throughput
sequencing data, it is shown that significant differences exist between the
raw and the denoised relative abundances of TCR V segments. For TCR J
segments, however, the difference between raw and denoised data is minimal. This observation is consistent with the fact that the primers used to
enrich T-cell receptors before they are sequenced, which are the main
source of errors, are specific to TCR V segments.