## Thursday, December 24, 2009

### [TECH] An interesting problem on the mailing list [gdb@gnu.org]

Some one asked the following question of GDB mailing list.
Hi,

I was wondering how come the call to "call cos(.2)" returns the wrong

#include "math.h"
double    (*fcn)(double);
main()
{
double t;
fcn = cos;
t=cos(.2);
t=cos(-.2);
}

(gdb) call cos(.2)
$16 = 104 (gdb) call fcn(.2)$17 = 0.98006657784124163

Thanks,
--
View this message in context:  http://old.nabble.com/Why-%22call-cos%28.2%29%22-returns-garbage--tp26909516p26909516.html
Sent from the Gnu - gdb - General mailing list archive at Nabble.com.

Here is my analysis.
Hello,

Looks like 'gdb' is missing debug information about the function 'cos' ,
and the debugger implicitly assumes all the functions return an 'int' ,
if the function is not compiled in debug mode.

I derive my explanation from the following test case.

1. Create a function 'double test_cos(double)' and compile it in optimized mode with -O3 using gcc
"$gcc -c -O3 test1.c" 2. Complie the file containing the 'main' function in debug mode "$gcc -g test.c test1.o -lm"

============[gdb a.out]====================

(gdb) list 0
1       #include "math.h"
2       #include "test1.h"
3       double    (*fcn)(double);
4       main()
5       {
6                double t;
7                 fcn = cos;
8                  t=cos(.2);
9                   t=cos(-.2);
10                      test_cos(.2);
(gdb) call fcn(0.2)
$4 = 0.98006657784124163 (gdb) call cos(0.2)$5 = -1073792992
(gdb) call test_cos(0.2)
$6 = -1073792992 ===================================== ============[test1.h]================= #ifndef _test_1_h #define _test_1_h double test_cos(double); #endif ==================================== ============[test1.c]================== #include "math.h" #include "test1.h" double test_cos(double a){ return cos(a); } ==================================== ============[test.c]================== #include "math.h" #include "test1.h" double (*fcn)(double); main() { double t; fcn = cos; t=cos(.2); t=cos(-.2); test_cos(.2); } =================================== May be the developers of gdb can you more details. also ======================================= (gdb) call cos$6 = {text variable, no debug info} 0xc325d0 {cos}
=======================================



## Wednesday, December 23, 2009

### [TECH] Manipulating DNA sequences with bits

The alphabet of DNA sequences consists of 4 alphabets {A,T,G,C} and they exists in compliments -- A-T, G-C. Often in sequence assembly algorithms we need to squeeze these into bits for saving space since we have too many of them. For example a typical input to a sequence assembler consists of at least 10 million reads (string of fixed length, say 21). In my last post I briefly described the Sequence Assembly (SA) problem. As I pointed one of the basic operations on these reads would be to obtain a reverse compliment of a given read. For example our read is ATTACCAG , then its reverse compliment would be CTGGTAAT -- reverse the original sequence replace every symbol by its compliment (A-T, G-C). The following routines may be useful when we squeeze these symbols into bits using only 2bits per symbol. While using the bit-wise operators I realized that == operator has more precedence than & (bit-wise or). So if you a&b==b then that will always be true, so we need to be more careful use (a&b)==b.
#define MASK_A 3
/*Lets fix the maximum size of the read as 32 base pairs*/
typedef unsigned long long kmer_t;

/*Takes a read of length K and returns its bit represenation*/
kmer_t CreateBKmer(char *kmer, unsigned long K){
unsigned long i;
unsigned char bitc;
/*A binary k-mer*/
kmer_t bkmer = (kmer_t) 0;
for(i=0; i<K; i++){
bkmer <<= 2;
bkmer |= Char2Bit(kmer[i]);
}
return bkmer;
}

/*Return the reverse compliment of the kmer_a*/
kmer_t ReverseCompliment(kmer_t a, unsigned int K){
kmer_t rc=0;
unsigned int i=0;
for(i=0; i<K; i++){
rc <<=2;
rc |= (3-(a&3));
a >>=2;
}
return rc;
}
/*Print the ASCII equivalent of the underlying bits*/
void PrintKmer(kmer_t a, unsigned int K, FILE *ptr){
char ntide;

do{
ntide = 'A';
ntide = 'C';
ntide = 'G';
}else{ /*This should be the last because T=0*/
ntide = 'T';
}
fprintf(ptr, "%c", ntide);
fprintf(ptr, "\n");
}

inline char Bit2Char(unsigned char a){
switch(a){
return 'A';
return 'G';
return 'C';
return 'T';
}
}

inline unsigned char Char2Bit(char c){
switch(c){
case 'A':
case 'T':
case 'G':
case 'C':
}
}

## Friday, December 18, 2009

### [TECH] Loops in Bi-directed De Bruijn Graphs

De Bruijn graphs recently found applications in Sequence Assembly(). problem is close to Shortest Common super String() problem, however there are some fundamental differences. Given a set of input strings strings the problem outputs a string such that every is a sub-string in and . On the other hand the input to the problem is a set of strings, every is a randomly chosen sub-string of a bigger string - called the genome. Unfortunately we do not know what is, so given the problem asks to find the such that we can explain every . Adding to the complexity of problem may contain several repeating regions and the direct appliction of to solve problem would result in collapsing the repeating regions in - so we cannot directly reduce the problem to . Also the string is special string coming from the so its double stranded and each may be originating from the forward or reverse compliments of .

A De Bruijn graph on a set of strings is a graph whose vertices correspond to the unique mers (a string of length ) from all the and whose edges correspond to a some mer in some . Also to consider the double stranded-ness we also include a reverse compliment of every mer into the graph . In the following simple example I show how a loop (i.e. both the forward and reverse compliments overlapping each other by ) can originate into . This interesting situation seems to be referred to as hairpin-loop.

## Saturday, December 12, 2009

### [TECH] Simplified Proof For The Application Of Freivalds' Technique to Verify Matrix Multiplication

We first give a simple and alternative proof for Theorem- in [Randomized Algorithms, Motwani R. and Raghavan P., pp 162--163]. Later in Theorem 2 we show that the assumption on the uniformness is not necessary.

Proof. Let be a matrix and be the column vectors of . Then . This means that multiplying a vector with a matrix is linear combination of the columns, the coefficient is the component of . Since is a boolean and acts as an indicator variable on the selection of column . So if is chosen from a uniform distribution .

Now let and be the column vectors of , similarly let be the column vectors of . Let , clearly since . Then since . Intuitively this means we select our random vector such that for all , such a selection will always ensure even though .

Proof. Continuing with the proof of Theorem- 1 , .