REQUIREMENTS AND PRIOR KNOWLEDGE
There are no formal prerequisites for this course.
GENERAL DESCRIPTION OF THE SUBJECT
This is an introductory course that covers some of the most fundamental topics of exact string pattern recognition.
There will be general descriptions of those topics, but there will not
be an in-depth discussion of each. Instead, the course is intended to
give the student an overview of the field.
OBJECTIVES: KNOWLEDGE AND SKILLS
The goals of this course include:
To know the theoretical and algorithmic foundations of exact string pattern recognition.
To give students hands-on experience with the practical issues
involved in programming pattern recognition algorithms.
To know the main applications of exact string pattern recognition to other problems in computer science.
To know some applications of exact string pattern recognition to problems found in other fields, in particular, in Computational Biology and Computational Music Theory.
Course notes written by Paco Gomez.
EVALUATION ACTIVITIES OR PRACTICAL TASKS
For the February examination session:
Attendance at 75% of the sessions is required.
The course grade will be assigned based on scores on homework
assignments. There will be both theoretical and practical (programming)
assignments. There will be at least 4 assignments and at most 6,
depending on time and pace.
Each assignment will have the same weight in the final grade.
One of the assignments will consist of a programming project.
A pass requires 50 points out of 100.
For the other examination sessions, students will have to hand in a project (60%) and take an exam (40%).
Review of some basic concepts on complexity, data structures and
Exact pattern recognition. The brute-force algorithm. Algorithms
based on preprocessing. Preprocessing in linear time. Linear-time exact
The Boyer-Moore algorithm. Analysis of their complexity. The
Knuth-Morris-Pratt algorithm. Pattern recognition with finite automata.
Real-time string matching.
Preprocessing in the Knuth-Morris-Pratt algorithm. Exact matching
with a set of patterns.
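As an illustration of what the preprocessing step computes, here is a minimal sketch in Python of the classic failure-function formulation of Knuth-Morris-Pratt. The function names `failure_function` and `kmp_search` are chosen for this example only, and the course notes may present the preprocessing differently.

```python
def failure_function(pattern: str) -> list[int]:
    """fail[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0  # length of the current matched prefix
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter border.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches
```

The text index `i` never moves backwards, so after preprocessing the search takes time linear in the length of the text.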
The edit distance between two strings. Dynamic programming
calculation of edit distance. String similarity.
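For orientation, the dynamic programming calculation can be sketched in a few lines of Python. This sketch assumes unit costs for insertion, deletion and substitution (the Levenshtein distance). The function name `edit_distance` is illustrative, and the course may use other cost schemes.

```python
def edit_distance(a: str, b: str) -> int:
    """Edit distance between a and b with unit costs (an assumption
    of this sketch), computed row by row."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

Only two rows of the table are kept at any time, so the space used is proportional to the length of `b`, while the running time is proportional to the product of the two lengths.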
Introduction to suffix trees. The naive algorithm to build suffix
trees. Ukkonen’s linear-time suffix tree algorithm. Practical
Applications of exact string pattern recognition algorithms. Suffix
trees and the exact set matching problem. The common substring of more than
two strings. Longest common substring of two strings. DNA
contamination. Circular string linearization. The edit distance and the
problem of melodic similarity.