Recently I was working on an area efficient and synthesizable design to compute the edit script (minimum cost sequence of INSERT, DELETE and CHANGE) operations to transform string S1 to S2. After a little bit of thought I have the following idea to build a non-recursive version of the Hirschberg's algorithm, since the algorithm is non-recursive we can build an efficient digital circuit with this idea. We use a simple circular queue and apply DFS (Depth First Search) and we can prove that the capacity of this queue at any stage of the algorithm is θ(log(min(n1,n2)). The proof is simple we choose the geometrically decreasing string which has smaller length from the given strings(min(n1,n2)). Since we do a depth first search and the depth of subproblem tree is θ(log(min(n1,n2)), so we will have atmost θ(log(min(n1,n2)) subproblems in the circular queue at any stage of the algorithm.
The core non-recursive algorithm starts off with the initial problem and finds the minimum alignment split ('q') between (S1[1.....n1/2] and S2 creates two sub problems and appends in the circular queue (currently it uses BFS). Download the linear space implementation from here Following is the outline of the core non recursive implementation.
91 CQueue bfs_list; 92 ESSubProblem es_sub; 93 94 InitializeCQ(&bfs_list,sizeof(ESSubProblem)); 95 /*Put the toplevel subproblem into the CQueue*/ 96 es_sub.sub_s1 = s1; es_sub.sub_n1 = n1; 97 es_sub.sub_s2 = s2; es_sub.sub_n2 = n2; 98 99 AppendCQ(&bfs_list,&es_sub); 100 while(DequeueCQ(&bfs_list,&es_sub)){ 101 sub_s1 = es_sub.sub_s1; sub_s2 = es_sub.sub_s2; 102 sub_n1 = es_sub.sub_n1; sub_n2 = es_sub.sub_n2; 103 104 if(sub_n1 <=1){ 105 /*If sub_n1 is 1 or 0 we cannot split it any more*/ 106 EditDistanceLS(sub_s1,sub_n1,sub_s2,sub_n2); 107 i = sub_n1; j = sub_n2; 108 /*Figure out the edit operations*/ 109 while(i || j){ 110 op = GetOp(i,j,(i?sub_s1[i-1]:'\0'), 111 (j?sub_s2[j-1]:'\0')); 112 switch(op){ 113 case 'I': 114 EditScriptS2[(&sub_s2[j-1]) - s2] = 'I'; 115 j--; 116 break; 117 case 'D': 118 EditScriptS1[(&sub_s1[i-1]) - s1] = 'D'; 119 i--; 120 break; 121 case 'C': 122 EditScriptS1[(&sub_s1[i-1]) - s1] = 123 (sub_s1[i-1] == sub_s2[j-1])?':':'C'; 124 i--; j--; 125 } 126 } 127 }else { 128 /*Add the two subproblems*/ 129 q = FindMinSplit(sub_s1,sub_n1,sub_s2,sub_n2); 130 /*SUB PROBLEM 1*/ 131 es_sub.sub_n1 = CELING(sub_n1,2); 132 es_sub.sub_n2 = q; 133 AppendCQ(&bfs_list,&es_sub); 134 135 /*SUB PROBLEM 2*/ 136 es_sub.sub_s1 = &(sub_s1[CELING(sub_n1,2)]); 137 es_sub.sub_s2 = &(sub_s2[q]); 138 es_sub.sub_n1 = sub_n1/2; 139 es_sub.sub_n2 = sub_n2-q; 140 AppendCQ(&bfs_list,&es_sub); 141 } 142 }
No comments:
Post a Comment