Sunday, November 16, 2014

Pseudo Polynomial Dynamic Programming

Sub Set Sum and Integer Knapsack are few examples of combinatorial optimization problems with pseudo polynomial running time algorithms. Often we the implementation of these pseudo polynomial dynamic programming algorithms is done using a huge array to hold intermediate sub problems (e.g. $OPT[K][n]$ would indicate a presence of a subset in the first $n$ integer elements ($x_1,x_2\ldots x_n$) which sum to $K$). If $N$ is the bound on the subset value you were looking for then the space complexity would $\Theta(Nn)$ -- which is clearly exponential on the input size which is $O(n\log(N))$ bits. To build robust algorithms -- reduce the dependence on the value of $N$ -- for these pseudo polynomial algorithms, it would be efficient if we could exploit the underlying DAG (Directed Acyclic Graph) nature of any dynamic programming formulation. But that might be a little bit more code to traverse the sub-problems in a topological order. However we can make our life a little easy by using a hash table to keep track of the sub-problems, which might add to the worst case asymptotic runtime by a factor of $O(log(nN))$. On the other hand the average case of sub-problem lookup would be a constant and would run fairly efficiently in practice.

The following code is a quick illustration of using a hash tables to keep track (and also lookup) of sub-problems in pseudo polynomial dynamic programming algorithms. You can read the problem statement here . Notice the obvious generalization of the problem (scheduling with constant number of resources) from the way I have formulated the dynamic program.


// 11/15/2014: vamsik@engr.uconn.edu//
#include 
#include 
#include 
#include 
#include 
#include 
using namespace std;
typedef std::pair KeyType;
struct KeyTypeHasher{
    inline size_t operator()(const KeyType& a) const {
        return (a.first)*1729 + a.second;
    }
};
typedef std::unordered_map SubProbType;
typedef typename SubProbType::iterator SubProbItr;

bool IsFeasible(const std::vector& A, unsigned int G){
    SubProbType sub_problems_even, sub_problems_odd;
    SubProbItr itr;
    
    //init//
    if(A[0] <= G){
        sub_problems_even[KeyType(A[0], 0)] = true;
        sub_problems_even[KeyType(0, A[0])] = true;
    }
    
    for(size_t i=1; i < A.size(); i++){
        SubProbType& curr_problems = 
            (i%2) ? sub_problems_even : sub_problems_odd;
        SubProbType& next_problems =
            (i%2) ? sub_problems_odd : sub_problems_even;
        
        if(curr_problems.empty()){
            return false;
        }
        next_problems.clear();
        //create new sub-problems in the next level//
        for(itr = curr_problems.begin(); 
            itr != curr_problems.end(); ++itr){
            const KeyType &a = itr->first;
            
            if( (a.first + A[i]) <= G){
                next_problems[ KeyType((a.first+A[i]), a.second)] = true;
            }
            
            if( (a.second + A[i]) <= G){
                next_problems[ KeyType(a.first, (a.second + A[i]))] = true;
            }
        }
        curr_problems.clear();
    }
    
    return ((A.size())%2) ? !(sub_problems_even.empty()) : !(sub_problems_odd.empty());
}

int main() {
    /* Enter your code here. Read input from STDIN. Print output to STDOUT */
    size_t T,N;
    unsigned int G, a;
    scanf("%lu", &T);
    for(size_t i=0; i < T; i++){
       std::vector A;
       scanf("%lu %u",&N, &G);
       for(size_t i=0; i < N; i++){
           scanf("%u", &a);
           A.push_back(a);
       }
       printf("%s\n", IsFeasible(A, G) ? "YES" : "NO"); 
    }
    return 0;
}

Monday, August 25, 2014

Planarity Puzzle

I'm obsessed with the untangling of the planar graphs. I'm now able to untangle a 30-vertex planar graph in ~240sec see below (yellow one is the final solution):


Sunday, August 10, 2014

Sunday, January 05, 2014

Useful properties of the function $$f(x) = \frac{x}{x-k}$$

The function f(x) = x/(x-k) has some very useful properties -- especially when we restrict x to Integers. Let me start with a simple question about Maxima and Minima of f(x). Applying elementary Calculus may not be useful since f(x) ↦ +∞ as x ↦ k+ (goes to -∞ from the other side and is discontinuous). A quick plot on WolframAlpha reveals the vertical asymptotes [http://wolfr.am/JSXDdV].



Now lets move our attention to the case where x is restricted to Integers. First lets consider the function f(x) when x takes on integers strictly greater than k (i.e. x > k). Notice that f(x) is decreasing the maximum in this case occurs when x = k+1. This gives us the following useful fact:
$$
f(n) = \frac{n}{n-k} \leq k+1 \,\,\,\,\,\,\, n \in \{ x\, | \, x > k \vee x \in \cal{I} \}
$$

The above inequality is quite handy in your combinatorial analysis tool book. Let me give you a simple application of this inequality by proving that nlog(n) = O(log(n!)) (also recall that log(n!) = O(nlog(n)) which means nlog(n) = Θ(log(n!))).

$$
\begin{array}{lcl}
&& \frac{n}{n-k} \leq k+1 \,\,\,\,\,\, k \geq 1\\
&& \\
\rightarrow && n \leq (n-k)\times (k+1) \\
\rightarrow && n < (n-k) \times (k+k) = 2 (n-k)\times k
&& \\
&&\\
\rightarrow && n < 2(n-1)\times 1\\
\rightarrow && n < 2(n-2)\times 2\\
\rightarrow && n < 2(n-3)\times 3\\
&& \ldots \\
\rightarrow && n < 2(n-(n-1))\times (n-1) = 2(1\times (n-1)\\
&&\\
&&\mbox{multiplying all the above n-1 inequalities you get the following} \\
&&\\
\rightarrow&& n^{n-1} < \displaystyle\Pi_{k=1}^{n-1} 2(n-k)\times k = 2^n ((n-1)!)^2 \\
\rightarrow&& n^n < 2^n (n!)^2 \\
\rightarrow&& \log(n^n) < \log(2^n(n!)^2) = 2\log(n!) + n \\
\rightarrow&& n\log(n) < 2\log(n!) + n < 2\log(n!) + \log(n!) = 3\log(n!) \\
\rightarrow&& n\log(n) = O(\log(n!)) \,\, \Box\\
\end{array}
$$

Saturday, November 30, 2013

Getting around with integer overflows during Combinatorial counting

Many closed form expressions characterizing the number of combinatorial objects involve manipulation of raising and falling factorials (e.g. $$\displaystyle\Pi_{i=1}^{k} (n-i+1)$$). These quantities accumulate very fast we need to use extreme caution during computation. Lets take the example in which we want to compute $${n\choose k} = \frac{n\times (n-1)\ldots (n-k+1)}{k\times k-1\times \ldots 1}$$ which is the ratio of two falling factorials which is always an integer. A naive algorithm which computes numerator and denominator will easily overflow even though the actual value is well within the limits of an integer. If $$n_{max}$$ is the largest value represented by the underlying data-type, the naive algorithm would overflow when: $$n\geq \sqrt[k]{n_{max}}$$. One way to fix this issue is to maintain the ratio implicitly as two integers and knocking off the GCD when ever possible. This reduces the overflow during computation see below for a simple example. Thanks to my friend Chittu for discussing this over lunch.


  public static long computeNchooseK(int n, int k){
    assert n >=1 && k<=n : "Invalid n and k to compute";
    long num=1, denom=1, factor;
    do{
      factor = gcd(n,k);
      num *= (n--/factor); denom *= (k--/factor);
      factor = gcd(num, denom);
      num /= factor; denom /= factor;
    }while(k>=1);
    assert denom == 1 : "Something went wrong in computing NchooseK";
    return num;
  }

Saturday, September 21, 2013

$\sum_{k=1}^{n} H_k = (n+1)(H_{n+1} - 1)$

$$
\frac{1}{1-z} = 1 + z + z^2 + z^3 \dots \\
\int \frac{1}{1-z}dz = \frac{z}{1} + \frac{z^2}{2} + \frac{z^3}{3} \dots = \sum_{k=1}^{\infty} \frac{z}{k} z^k = -\ln(1-z)\\
\\
H_n = \sum_{k=1}^{n} \frac{1}{k} \hspace{0.5in} \mbox{By definition, also notice they are partial sums}\\
\rightarrow \mbox{O.G.F of $H_n$ is}\hspace{0.02in} \frac{1}{1-z}\times (-\ln(1-z)) = \frac{1}{1-z}\ln(\frac{1}{1-z}) \hspace{0.5in}\mbox{from our previous result on partial sums}\\
\rightarrow \frac{1}{1-z}\ln(\frac{1}{1-z}) = H_1\times z + H_2\times z^2 + H_3\times z^3 \ldots \hspace{0.5in} \mbox{(1)}\\
\rightarrow \frac{d}{dz}(\frac{1}{1-z}\ln(\frac{1}{1-z})) = \frac{1}{(1-z)^2}\ln(\frac{1}{1-z}) + \frac{1}{(1-z)^2} = H_1 + 2H_2\times z + 3H_3\times z^2 \ldots \hspace{0.5in}\mbox{(2)}\\
\mbox{Notice that}\hspace{0.02in} \frac{1}{(1-z)^2} = 1 + 2z + 3z^2 \dots \hspace{0.5in} \mbox{(3)}\\
\rightarrow\mbox{(2) - (3)} = \frac{1}{(1-z)^2}\ln(\frac{1}{(1-z)}) = (H_1-1) + (2H_2-2)\times z + (3H_3 -3) \times z^2 \ldots \\
\rightarrow \frac{1}{(1-z)^2}\ln(\frac{1}{(1-z)}) \hspace{0.02in} \mbox{is the O.G.F of sequence} \hspace{0.02in} (n+1)H_{n+1} - (n+1) = (n+1)(H_{n+1}-1) \hspace{0.5in} \mbox{(4)}\\
\\
\mbox{Applying the result of the partial sums to (1) the O.G.F of sequence}\hspace{0.02in} \sum_{k=1}^{n} H_n \hspace{0.02in} \mbox{is} \hspace{0.02in} \frac{1}{(1-z)}\times {\frac{1}{(1-z)}\ln(\frac{1}{1-z})} = \frac{1}{(1-z)^2}\ln(\frac{1}{(1-z)}) \hspace{0.5in} \mbox{(5)}\\

\mbox{From (4) and (5) the generating functions of both the sequences are same, hence the corresponding terms must be the same}





$$

Simple proof for the O.G.F of partial sums

Let $A(z)$ be the O.G.F of the sequence $S = \{a_0, a_1, a_2 \ldots \}$, then we can derive the O.G.F of the sequence $b_n = \sum_{i=0}^{n} a_i$ (i.e. $B = \{a_0, a_0+a_1, a_0+a_1+a_2 \ldots \}$) as follows.

Consider the sequence $S^1 = \{0 , a_0, a_1, a_2, a_3 \ldots \}$ then O.G.F corresponding to $S^1$ is $0\times z^0 + a_0 \times z + a_1 \times z^2 + a_2 \times z^3 \ldots = z\times A(z)$. Similarly the O.G.F corresponding to the sequence $S^2 = \{0, 0, a_0, a_1, a_3 \ldots \}$ is $z^2 \times A(z)$. Now notice that if we add corresponding terms in the sequences $S^1 , S^2, S^3 \ldots$ then this
results in the sequence $B$. Hence the generating function corresponding to $B$ is $A(z) + z\times A(z) + z^2\times A(z) + z^3\times A(z) \ldots = A(z)\times(1 + z + z^2 + z^3 \ldots) = \frac{A(z)}{1-z}$ $\Box$.

Sunday, June 09, 2013

An iterative method to compute the integer linear combination of Greatest Common Divisor

Euclid's iterative method to compute the GCD is very well known. You might have also heard that $GCD(a,b)$ can be expressed as integer linear combination of $a,b$ (i.e. $GCD(a,b) = \alpha\times a + \beta \times b \,\, \alpha,\beta \in \mathcal{I}$). The following is an iterative method -- I wrote on a flight between AUS -> DFW just for fun -- which computes this integer linear combination. Runs in $O(\log(\max\{a,b\}))$ time.


template
 static inline Unit gcd_integer_linear_combination(const Unit& a, const Unit& b, Unit& alpha, Unit& beta ){
  Unit divd = a, divs = b, remd=0, quot;
  Unit a_divd[2] = {1,0}; //linear combination of the divident//
  Unit a_divs[2] = {0,1}, temp[2] = {0,0}; //linear combination of the divisor//
  bool b_flip = false;
  if(divd < divs){
   divd = b; divs = a;
   b_flip = true;
  }
  //Invariant's: divd = a_divd[0]*a + a_divd[1]*b //
  //      divs = a_divs[0]*a + a_divs[1]*b //
  while(divs){
   remd = divd%divs; quot = (divd-remd)/divs;
   divd = divs; divs = remd;
   temp[0] = a_divs[0]; temp[1] = a_divs[1];
   a_divs[0] = a_divd[0] - (quot * a_divs[0]); 
   a_divs[1] = a_divd[1] - (quot * a_divs[1]);
   a_divd[0] = temp[0]; a_divd[1] = temp[1];
  }
  if(!b_flip){
   alpha = a_divd[0]; beta = a_divd[1];
  }else{
   alpha = a_divd[1]; beta = a_divd[0];
  }
  return divd;
 }

The following are some examples $GCD(a,b) = (\alpha)*a + (\beta)*b$.
PASSED: 2 = (1613)*42 + (-8)*8468 
PASSED: 1 = (-1684)*6335 + (1641)*6501 
PASSED: 5 = (344)*9170 + (-551)*5725 
PASSED: 1 = (-1949)*1479 + (308)*9359 
PASSED: 1 = (967)*6963 + (-1508)*4465 
PASSED: 2 = (858)*5706 + (-601)*8146 
PASSED: 6 = (181)*3282 + (-87)*6828 
PASSED: 2 = (121)*9962 + (-2450)*492 
PASSED: 1 = (-596)*2996 + (919)*1943 
PASSED: 1 = (-2348)*4828 + (2085)*5437 
PASSED: 1 = (283)*2392 + (-147)*4605 
PASSED: 1 = (-61)*3903 + (1546)*154 
PASSED: 1 = (122)*293 + (-15)*2383 
PASSED: 1 = (-1528)*7422 + (1301)*8717 
PASSED: 1 = (615)*9719 + (-604)*9896 
PASSED: 1 = (-304)*5448 + (959)*1727 
PASSED: 1 = (-139)*4772 + (431)*1539 
PASSED: 1 = (493)*1870 + (-93)*9913 
PASSED: 4 = (628)*5668 + (-565)*6300 
PASSED: 1 = (3461)*7036 + (-2461)*9895 
PASSED: 4 = (-60)*8704 + (137)*3812 
PASSED: 1 = (77)*1323 + (-305)*334 
PASSED: 3 = (662)*7674 + (-1089)*4665 
PASSED: 2 = (3)*5142 + (-2)*7712 
PASSED: 1 = (1840)*8254 + (-2211)*6869 
PASSED: 1 = (2027)*5548 + (-1471)*7645 
PASSED: 1 = (929)*2663 + (-897)*2758 
PASSED: 2 = (-301)*38 + (4)*2860 
PASSED: 2 = (-823)*8724 + (737)*9742 
PASSED: 1 = (-3)*7530 + (29)*779 
PASSED: 1 = (-1499)*2317 + (1144)*3036 
PASSED: 1 = (662)*2191 + (-787)*1843 
PASSED: 1 = (10)*289 + (-27)*107 
PASSED: 1 = (4289)*9041 + (-4336)*8943