## Tuesday, July 11, 2017

### A matrix formulation of extended Euclid algorithm.

I have written about Extended Euclid Algorithm in the past. However more recently while solving a problem I had to express it in a matrix form which I thought is something worthwhile.

Recall the problem statement: Given two non-zero integers $a,b$ (assume $a\geq b$), the problem is to determine integers $x,y$ such that $\mbox{gcd}(a,b) = a\times x + b \times y$. Notice that the original Euclid's algorithm can compute the gcd but not the coefficients $x,y$ whose linear combination yields the gcd. The idea here is to use the quotient instead of the remainder as in the original algorithm. We express the operands in each iteration a linear combination of the original inputs and use the following to update as follows. The quotient in $i^{th}$ iteration is denoted by $q_i$ leading to the following formulation.

$$\begin{array}{cc} \mbox{gcd}(a,b) = \mbox{ExGcd}(\left[\begin{array}{c} a \\ b \end{array}\right]) &\\ \left [ \begin{array}{cc} x_{0,0} & y_{0,0}\\ x_{1,0} & y_{1,0} \end{array} \right ] = \left [ \begin{array}{cc}1 & 0\\ 0 & 1\end{array} \right ] & \mbox{initialization } i=0\\ \mbox{ExGcd}(\left [\begin{array}{cc} a & b \end{array}\right ] \left [ \begin{array}{cc} x_{0,i} & y_{0,i} \\ x_{1.i} & y_{1,i} \end{array} \right]) = \mbox{ExGcd}(\left[\begin{array}{cc} a & b \end{array}\right] \left [ \begin{array}{cc} x_{0,i-1} & y_{0,i-1} \\ x_{1,i-1} & y_{1,i-1} \end{array} \right ] \left [ \begin{array}{cc} 0 & 1 \\ 1 & -q_{i-1} \end{array} \right ]) & i > 0 \\ q_i = \mbox{Quotient}(x_{0,i}a+x_{1,i}b, y_{0,i}a+y_{1,i}b) & \\ \end{array}$$

Following is an implementation using std::div:
// Input: a,b
// Output: x, y (y0,y1)
void ExtendedEuclid(long a, long b, long& y0, long& y1)
{
long x0, x1, t0, t1;

if (a < b) { std::swap(a, b); }
// Initialize:
x0 = 1; x1 = 0; // a = 1 * a + 0 * b
y0 = 0; y1 = 1; // b = 0 * a + 1 * b

std::ldiv_t div_res = std::div(a, b);
while (div_res.rem) {
t0 = y0; t1 = y1;

y0 = x0 - (div_res.quot * y0);
y1 = x1 - (div_res.quot * y1);
x0 = t0; x1 = t1;

// Invariant: (x0*a+x1*b >= y0*a+y1*b)
div_res = std::div(x0*a+x1*b, y0*a+y1*b);
}
}


Following are few examples obtained by calling ExtendedEuclid(a,b,x,y):
a=25375,b=667 x=1, y=-38, gcd=29
29 = 25375 * 1 + 667*38

a=17765,b=13481 x=-343, y=452, gcd=17
17 = 17765 * -343 + 13481 * 452

a=19109,b=18321 x=-23, y=24, gcd=197
197 = 19109 * -23 + 18321 * 24



## Sunday, May 28, 2017

### Efficient polygon area computation with less multiplications.

Linear algebra computations typically try to prefer mathematical formulations which preserve numerical precision, reduce multiplications and avoid divisions if possible. More recently I came across an elegant formulation of polygon area computation with half the number of multiplications relative to the standard method. I will try to briefly derive it for posterity.

Standard Formulation: Let $P = [v_0, v_1, v_2, v_3\ldots v_{n-1}, v_{n}=v_0]$ be a polygon with $n$ vertices, $v_i = (x_i,y_i)$. Recall the standard formula for the area of the polygon $A(P) = \sum_{i=0}^{n-1} (x_{i}y_{i+1} - y_{i}x_{i+1})$. Every term $T(i) = (x_{i}y_{i+1} - y_{i}x_{i+1})$ corresponds to the edge $(v_i, v_{i+1})$ of the polygon -- more precisely this is the signed area of the triangle ${(0,0), (x_i,y_i), (x_{i+1}, y_{i+1})}$. Clearly computation of $A(P) = \sum_{i=0}^{n-1} T(i)$ needs exactly $2n$ multiplication operations.

Alternate Formulation: Consider $T(i) = x_{i}y_{i+1} - y_{i}x_{i+1}$, now add $x_iy_i$ and subtract $x_{i+1}y_{i+1}$ resulting in the following:
$$\begin{array}{lll} D(i) &=& x_iy_i + T(i) - x_{i+1}y_{i+1} \\ D(i) + D(i+1) &=& x_iy_i + T(i) - x_{i+1}y_{i+1} + x_{i+1}y_{i+1} + T(i+1) - x_{i+2}y_{i+2} \\ &=& x_iy_i + T(i) + T(i+1) - x_{i+2}y_{i+2}\\ \sum_{i=0}^{n-1}D(i) &=& x_{0}y_{0} + \sum_{i=0}^{n-1} T(i) - x_ny_n\,\, \mbox{notice that}\, (x_0,y_0) = (x_n,y_n)\\ &=& \sum_{i=0}^{n-1} T(i) \end{array}$$

Whats special about $D(i) = x_iy_i + T(i) - x_{i+1}y_{i+1}$ ?. Notice that $D(i)$ can be written as the following:
$$\begin{array}{lll} D(i) &=& x_iy_i + T(i) - x_{i+1}y_{i+1} \\ &=& x_iy_i + (x_iy_{i+1} - x_{i+1}y_i) - x_{i+1}y_{i+1} \\ &=& (x_i + x_{i+1})(y_i - y_{i+1}) \,\, \mbox{notice the only one multiplication}. \end{array}$$

Finally we state that the area of the polygon is equivalent to $A(P) = \sum_{i=0}^{n-1} (x_i+x_{i+1})(y_i - y_{i+1})$ and needs exactly $n$ multiplications instead of $2n$ with standard formulation.

## Friday, January 20, 2017

### An Efficient Algorithm to Determine the Order of a Permutation.

Consider at set $S = \{1,2,3\ldots n\}$ and recall that $S_n$ a set of all permutations of $S$ form a group under composition (<$S_n, *$>). E.g. $S=\{1,2,3,4\}$ , $\pi_1 = [3\,4\,2\,1]$ and $\pi_2 = [4\,1\,3\,2]$. To understand composition operation it may be easier to look at a permutation $\pi \in S_n$ as bijective function $\pi : S \rightarrow S$. The composition $\pi_1 * \pi_2$ would be equivalent to $\pi_1(\pi_2(i)), \forall i\in S$ which results in $\pi_1 * \pi_2 = [1\,3\,2\,4]$. We use the notation $\pi^k = \pi*\pi*\pi\ldots *\pi$ to denote a composition applied $k$ times. Since <$S_n,*$> is a group it must have an identity element $e$ which is a permutation such that $e(i) = i, \forall i\in S$.

Now lets focus on the following question:
Given any permutation $\pi \in S_n$ is there an integer $k \geq 0$ such that $\pi^k = e$ ?

Lets try few examples with $\pi_1 = [3\, 4\, 2\, 1]$, $\pi_1^2 = [2\, 1\, 4\, 3]$, $\pi_1^3 = [4\, 3\, 1\, 2]$, $\pi_1^4 = [1\, 2\, 3\, 4]$. So for this example $\pi_1$ we were able to find $k=4$ such that $\pi_1^4 = e$. Later we will show that finding such a $k$ is always possible for any permutation $\pi$ and such a $k$ is called the order of the permutation.

Recall that every permutation can be decomposed into one or more disjoint cycles. We call a permutation cyclic if it has exactly one cycle in it. E.g. $\pi_c = [3\, 1\, 4\, 2]$ is a cyclic permutation in $S_4$ with cycle being $1 \rightarrow 3 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 1$.

Lemma: If $\sigma \in S_n$ is a cyclic permutation then $\sigma^n = e$

Proof: Since $\sigma$ is cyclic all elements $S$ form a directed cycle with $n$ nodes. Starting at any node $i$ we need to traverse $n$ edges to get back to $i$. This can be algebraically expressed as $\underbrace{\sigma(\sigma(\ldots \sigma(i))\ldots)}_{n\text{-times}} =i$
$\Box$.

The above lemma states that if a permutation is cyclic on $n$ elements then composing it $n$ times on itself results in an identity permutation. Now we are ready to prove our main theorem:

Theorem: $\forall \pi \in S_n$ there exists a $k \geq 0$ such that $\pi^k = e$.

Proof: Let $\pi = \sigma_1\sigma_2\ldots \sigma_m$ be the cycle decomposition of $\pi$. Let $|\sigma_i|$ denote size of cycle. Since each of $\sigma_i$ is cyclic permutation in $S_{|\sigma_i|}$ by the previous lemma $\sigma_i^{|\sigma_i|} = e_{|\sigma_i|}$. Where $e_{|\sigma_i|}$ is an identity permutation in $S_{|\sigma_i|}$. Also notice that any multiple of $|\sigma_i|$ is also going to result in an identity permutation (i.e. for any integer $q \geq 1$ , $\sigma_i^{q|\sigma_i|} = e_{|\sigma_i|}$). Let $t = |\sigma_1|\times |\sigma_2|\times \ldots |\sigma_m|$ then $\pi^t = \sigma_1^t\sigma_2^t\ldots \sigma_m^t$ since $t$ is a multiple of $|\sigma_i|, 1\leq i \leq m$ we have $\pi^t = e_{|\sigma_1|}e_{|\sigma_2|}\ldots e_{|\sigma_m|} = e$
$\Box$.

From the above proof its clear that choosing $k = |\sigma_1|\times |\sigma_2|\ldots |\sigma_m|$ which is a multiple of all cycle lengths guarantees that $\pi^k = e$. However from a computational efficiency purpose we are much better if we let that multiple be the least common multiple of $|\sigma_i|,\, 1\leq i\leq m$. Following is an efficient algorithm -- linear time and takes at most $n$ l.c.m computations -- to find the order of any permutation:

// Finds the order of a valid permutation ( 1 \leq perm[i] \leq n).
size_t FindPermutationOrder(const std::vector< unsigned int > &perm) {
size_t n = perm.size();
size_t perm_order = 1; // This overflows if the l.c.m does not fit in 64-bits.

// Find the disjoint cycle lengths in the permutation.
size_t cycle_index = 0;
while (cycle_index < n) {
// Find the start of the next disjoint cycle.
size_t i = cycle_index;
size_t cycle_length = 0;
do {
i = perm[i] - 1;
cycle_length++;
} while (i != cycle_index);
perm_order = boost::math::lcm(perm_order, cycle_length);
}
cycle_index++;
}
return perm_order;
}

Following are few examples on the algorithm:
Order of [  5  3  1  2  4  ] is 5
Order of [  3  4  2  1  ] is 4
Order of [  1  2  3  4  5  ] is 1
Order of [  7  8  9  10  1  2  4  3  6  5  ] is 5