First Name: Lee Wah Heng
Last Name: Charlie
Supervisor Name: Ken Sung Wing-Kin
University: National University of Singapore
Keywords: Virology Research, Bioinformatics Tools, LOMA, DNA Microarrays, PCR Methods, Kullback-Leibler Divergence, EvolSTAR, RB-Finder
Publication Date: April 30, 2015

Bioinformatic Applications For Virology Research 2010


The primary objective of this dissertation is to address a number of key challenges and issues in the detection, resequencing and evolutionary analysis of viruses. Using novel ideas to improve upon existing approaches, it aims to develop better technologies and bioinformatics tools that would have a greater impact on clinical decision-making.
Amplification of viral genomes is a necessary first step of diagnosis and sequence analysis. The thesis explores the pitfalls of using specific primers for amplification and proposes the use of random-tagged primers, particularly for amplification of unknown viruses. Although it is theoretically possible for random-tagged primers to bind to any sequence, the blind use of such primers without careful design does not guarantee genome-wide amplification of the virus. In the second chapter, the thesis introduces a model to predict the amplification efficiency of random-tagged primers and develops an algorithm, LOMA, to design random-tagged primers with optimal amplification efficiency. Experiments show that the random-tagged primers generated by LOMA can amplify up to 90% of the genomes of the target viruses.
In the third chapter, the thesis argues for the advantages of DNA microarrays over traditional PCR methods for diagnostics. To increase the sensitivity and specificity of microarray diagnostics, the thesis uses random-tagged primers for amplification and proposes an algorithm (PDA) that analyzes, for each virus, the distribution of probe signal intensities across its in-silico recognition signature probe set. The analysis is based on a novel weighted Kullback-Leibler divergence that is sensitive to the tail of the distribution. Validation experiments show that PDA is able to accurately detect and identify co-infections of multiple viruses, as well as unknown viruses initially missed by PCR tests.
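To make the idea of a tail-sensitive divergence concrete, the following is a minimal sketch, not the PDA algorithm itself. The abstract does not state the actual weighting scheme, so the `tail_weights` function, its `power` parameter, and the example histograms are all hypothetical. The sketch assumes signal intensities have been binned into histograms ordered from low to high intensity, and that higher bins receive larger weights.

```python
import math


def weighted_kl(p, q, weights, eps=1e-12):
    """Weighted divergence sum_i w_i * p_i * log(p_i / q_i) over histogram bins.

    With all weights equal to 1 this is the ordinary Kullback-Leibler
    divergence. With unequal weights the value is no longer guaranteed
    to be non-negative.
    """
    if not (len(p) == len(q) == len(weights)):
        raise ValueError("p, q and weights must have the same length")
    total = 0.0
    for p_i, q_i, w_i in zip(p, q, weights):
        if p_i > 0:  # bins with zero mass in p contribute nothing
            total += w_i * p_i * math.log(p_i / max(q_i, eps))
    return total


def tail_weights(n_bins, power=2.0):
    """Hypothetical weighting (an assumption, not taken from the thesis):
    weights grow with the bin index so high-intensity bins count more."""
    return [((i + 1) / n_bins) ** power for i in range(n_bins)]


# Illustrative (made-up) intensity histograms, ordered low -> high intensity.
background = [0.40, 0.30, 0.20, 0.08, 0.02]  # all probes on the array
tail_shift = [0.40, 0.30, 0.15, 0.08, 0.07]  # extra mass in the highest bin
low_shift = [0.45, 0.25, 0.20, 0.08, 0.02]   # change only in the low bins

w = tail_weights(len(background))
score_tail = weighted_kl(tail_shift, background, w)
score_low = weighted_kl(low_shift, background, w)
print(score_tail, score_low)
```

Under these assumed weights, a probe set whose distribution gains mass in the high-intensity tail scores higher than one that differs from the background only in the low-intensity bins, which is the behaviour a tail-sensitive measure is meant to capture.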
In the fourth chapter, the thesis demonstrates the feasibility of using resequencing microarrays as a large-scale bio-surveillance tool. In the wake of the 2009 H1N1 influenza pandemic, a novel resequencing kit was developed that is capable of interrogating all eight genome segments of the 2009 H1N1 influenza virus, with accommodation for mutation hotspots. The accompanying base-calling software, EvolSTAR, is a new method that utilizes neighbourhood hybridization intensity profiles and the substitution bias of probes on the microarray for mutation confirmation and recovery of ambiguous base queries. Validation experiments show that EvolSTAR can achieve a much higher accuracy and call rate than existing competing methods.
The fifth chapter discusses the role that recombination plays in the emergence of novel or more virulent strains of viral pathogens. Understanding the mechanisms of viral evolution will aid in the development of better anti-viral drugs and vaccines, as well as diagnostics and surveillance tools. The thesis presents an algorithm (RB-Finder) that uses a more informative distance metric that overcomes the inaccuracies of methods that use base-by-base comparisons. Experiments show that RB-Finder is able to achieve accuracies comparable to the most accurate phylogeny-based methods but within a much shorter time. In addition, RB-Finder is able to distinguish regions of high mutation rates from recombination breakpoints.
In summary, the thesis has contributed several technologies and novel methods that have significantly improved existing bioinformatics approaches in virology research.
