Abstract
Jian Peng
University of Illinois at Urbana-Champaign
Computer Science
Most protein sequence alignment algorithms rely on similarity measures between amino acids. Here we consider the problem of how to measure the similarity between sequence alignments themselves. With good similarity metrics, we can use machine learning methods to learn alignment models that are more accurate than traditional ones for structure prediction and/or homology search. We will first discuss the Hamming distance and then introduce a new distance metric for pairwise sequence alignment. We will also present a general framework that uses these similarity metrics to learn alignment models. Finally, we will discuss how to generalize these similarity metrics to multiple sequence alignments.
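
To make the idea of a Hamming distance between two pairwise alignments concrete, here is a minimal sketch in Python. It rests on an assumption that the abstract does not spell out: each alignment is encoded as a list whose i-th entry is the index of the residue in the second sequence aligned to residue i of the first sequence, or None when residue i is aligned to a gap. The function name and the encoding are illustrative only and may differ from the definition used in the talk.

```python
from typing import List, Optional

# Hypothetical encoding: alignment[i] is the position in sequence 2 that
# residue i of sequence 1 is aligned to, or None if it faces a gap.
Alignment = List[Optional[int]]


def hamming_alignment_distance(reference: Alignment, predicted: Alignment) -> int:
    """Count residues of sequence 1 whose aligned partner differs."""
    if len(reference) != len(predicted):
        raise ValueError("both alignments must cover the same first sequence")
    return sum(1 for r, p in zip(reference, predicted) if r != p)


# Two alignments of the same pair of sequences (toy data, not from the talk).
reference = [0, 1, None, 2, 3]
predicted = [0, None, 1, 2, 3]
distance = hamming_alignment_distance(reference, predicted)
print(distance)
```

Under this encoding the distance treats every misplaced residue equally, regardless of how far the predicted partner is from the reference one.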