A De Bruijn graph
on a set of strings
is a graph whose vertices
correspond to the unique
mers (a string of length
) from all the
and whose edges correspond to a some
mer in some
. Also to consider
the double stranded-ness we also include a reverse compliment of every
mer into
the graph
. In the following simple example I show how a loop (i.e. both
the forward and reverse compliments overlapping each other by
) can originate
into
. This interesting situation seems to be referred to as hairpin-loop.
Algorithms, Theory, Spirituality, Life, Technology, Food and Workout : trying to sort these deterministically in $\Theta(1)$ time (constant time).
Friday, December 18, 2009
[TECH] Loops in Bi-directed De Bruijn Graphs
De Bruijn graphs recently found applications in Sequence Assembly(
).
problem is close to Shortest Common super String(
) problem, however
there are some fundamental differences. Given a set of input strings
strings the
problem outputs a string
such that every
is a sub-string in
and
. On the other hand the input to the
problem
is a set
of strings, every
is a randomly chosen
sub-string of a bigger string
- called the genome. Unfortunately we do not
know what
is, so given
the
problem asks to find the
such that we can
explain every
. Adding to the complexity of
problem
may contain several
repeating regions and the direct appliction of
to solve
problem would
result in collapsing the repeating regions in
- so we cannot directly reduce
the
problem to
. Also the string
is special string coming from the
so its double stranded and each
may be originating from the forward or
reverse compliments of
.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment