Saturday, October 27, 2007

[TECH] A simple O(n^2) time algorithm to construct the Ultra-metric trees.

Given a matrix 'D' which is symmetric an Ultra-metric tree 'T' is a binary tree with internal nodes corresponding to D(i,j) and leaves corresponding to rows in the matrix 'D', and a path from the root to the leaf must be strictly decreasing (Ultra-metric tree). Also D(i,j) is the lowest common ancestor for 'i' and 'j'.
The problem is given 'D' we need to figure out corresponding 'T'.

The well know fact about the Ultra-metric trees is that if we take any 3 leaves i,j,k the maximum is not unique (e.x D(i,j)=5, D(j,k)=6,D(i,k)=5) we have D(i,j) and D(i,k) having same value.
How do we test if D is Ultra-metric?

  • A trivial solution is O(n^3).
  • The Errata shows construction of Ultra-metric trees in O(n^2) and asks a question if we can check for the Ultra-metric property in O(n^2) ? which my next item answers affirmatively.
  • My observation is we can partition the leaves of the ultra-metric tree with out any need of building a complete weighted graph as in Gusfields solution, one fact is that every row in D is going to have a maximum (global) we don't need to spend O(n^2) (comparing all the elements in the matrix D) to find one. The algorithm below we perform such a check if an ultra-metric tree exists and also builds one.
    L = {1,2,....n};
     i = select_a_random_element_in(L) ;
     max = find_max(D(i,*)); /*Takes linear time*/
     /*max = D(i,q) && q should be in L*/
     Part_q = {};
     Part_i = {};
     foreach(j in L){
            if(D(j,i)<=max && D(j,q)==max){
                 Part_i += {j};
            }else if(D(j,q)<=max && D(j,i)==max){
                 Part_q += {j};
               printf("D is not ultra-metric");
               return 0;
Running time would be T(n) = T(n-k) + T(k) + O(n), clearly its quadratic O(n^2).

No comments: