Computational methods for denoising high-throughput data

Buri, Gershom

URI: http://hdl.handle.net/123456789/3682

Date: 2015-07

Abstract:

T-cell diversity strongly influences the ability of the immune system to recognise and fight the wide variety of potential pathogens in our environment. The current state-of-the-art approach to profiling T-cell diversity involves high-throughput sequencing and analysis of T-cell receptors (TCRs). Although this approach produces huge amounts of data, the data contain noise that may obscure the underlying biological picture. To correct these errors, two computational methods have been developed: a method of moments and a method based on Bayesian inference. Using simulated data, it is shown that Bayesian inference is more accurate than the method of moments, but the latter is preferable when time is a limiting factor, as it is faster and adequately accurate. Furthermore, using high-throughput sequencing data, it is shown that significant differences exist between the raw and denoised relative abundances of TCR V segments. For TCR J segments, however, the difference between raw and denoised data is minimal. This observation agrees with the fact that the primers used to enrich T-cell receptors before sequencing, which are the main source of errors, are specific for TCR V segments.