De Bruijn graphs recently found applications in

*Sequence Assembly*(

).

problem is close to

*Shortest Common super String*(

) problem, however
there are some fundamental differences. Given a set of input strings

strings the

problem outputs a string

such that every

is a sub-string in

and

. On the other hand the input to the

problem
is a set

of strings, every

is a randomly chosen
sub-string of a bigger string

- called the

*genome*. Unfortunately we do not
know what

is, so given

the

problem asks to find the

such that we can
explain every

. Adding to the complexity of

problem

may contain several
repeating regions and the direct appliction of

to solve

problem would
result in collapsing the repeating regions in

- so we cannot directly reduce
the

problem to

. Also the string

is special string coming from the

so its double stranded and each

may be originating from the forward or
reverse compliments of

.

A De Bruijn graph on a set of strings is a graph whose vertices
correspond to the unique mers (a string of length ) from all the
and whose edges correspond to a some mer in some . Also to consider
the double stranded-ness we also include a reverse compliment of every mer into
the graph . In the following simple example I show how a loop (i.e. both
the forward and reverse compliments overlapping each other by ) can originate
into . This interesting situation seems to be referred to as *hairpin-loop*.