Tuesday, March 10, 2009

[TECH] Maintaining a Load Factor (α) in Dynamic Data structures with constant (Θ(1)) amortized cost per operation.

Accounting Method is an elegant technique for amortized analysis of the data structures. Most of us should have encountered Dynamic data structures (which need to expand and contract to maintain some invariant) at some point of time. A stack implementation is one example of a dynamic data structure. As we perform pop operation on the stack eventually it become full (over-flow), at this stage we might choose to increase (may be by doubling or some other heuristic) the size of the stack. We also want to handle the under-flows i.e. if the number of elements in the stack is too small we might want to relocate the elements into a new stack ( with smaller size ). In general the dynamic data structures have a parameter called the load factor $\alpha$ and our aim is to maintain that. One simple way to maintain this is to handle under-flows and over-flows on demand. Unfortunately this will not give us a $\Theta(1)$ amortized cost per operation on the stack. However we can still get $\Theta(1)$ amortized cost by avoiding this on demand expansion and compaction.

Let $N_i$ denote the number of elements present on the stack after $i$ operations (including push and pop). Let $S_i$ denote the allocated size of the stack after $i$ operations. We Define a potential energy function $\Phi$ as follows.

\begin{displaymath}
\Phi(i) = 2\times N_i - S_i
\end{displaymath}

Clearly $\Phi(0) = 0$. The amortized cost of the push operation which does not require an expansion of the stack is as follows.

\begin{displaymath}
\begin{array}{rcl}
& = & \mbox{actual cost} + \mbox{potenti...
...s N_i - S_i - (2\times (N_i - 1) - S_i) \\
&=& 3
\end{array}\end{displaymath}

The amortized cost of the push operation which requires doubling of the array is as follows.

\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{cost of copying} + \mbox{poten...
... S_i/2) \\
&=& S_i/2 + (0) - (S_i/2-2) \\
&=& 2
\end{array}\end{displaymath}

The amortized cost of the pop

\begin{displaymath}
\begin{array}{rcl}
c^{3}_{i} &=& 1 + (2\times (N_i - 1) - S_i) - (2\times N_i - S_i) \\
&=& -1
\end{array}\end{displaymath}

From the above we can see that that amortized cost of all the operations is $O(1)$ when we use the doubling heuristic for the stack.

(b)

Let the load factor $\alpha = N_i/S_i$ and let $1-c = 1/2k, k\in \mathbb{R}, c \in (0,1)$. We use the following heuristic to handle expansion and compaction of the array.

"When $\alpha=1$ (i.e. the array is full) increase the size of the array by $k$ times. When $\alpha \leq 1/2k$ (i.e. the array about to under flow) reduce the size of the array by half ($1/2$)".

We define the following potential function and finally show that the amortized complexity of each of the operations push, pop is $O(1)$.

\begin{displaymath}
\begin{array}{rcl}
\Phi(i) &=& \left\{ \begin{array}{rl}
k\...
...1/2k \leq \alpha < 1/k$} \\
\end{array}
\right.
\end{array}\end{displaymath}

We now prove that the amortized cost of each operation is bounded by a constant. From the previous problem (since the potential function is same as the previous problem when $\alpha \geq 1/k$) we can observe that push operation when $\alpha=1$ (expansion) has an amortized cost of $2$. We can also observe that whenever expansion does not occur the amortized cost of the push operation is $3$. So we will concentrate on the other possible situations. We use the notation $\alpha_i$ to indicate the load factor of the array after $i$ operations on the array.

Amortized cost of push when $ 1/2k \leq \alpha_i < 1/k$


\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{actual cost} + \mbox{potential...
...+ (S_i/k - N_i) - (S_{i}/k - (N_{i} -1)) \\
&=& 0
\end{array}\end{displaymath}

Amortized cost of push when $\alpha_i \geq 1/k$, $\alpha_{i-1} < 1/k$


\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{actual cost} + \mbox{potential...
...}{k} S_{i-1} - \frac{k+1}{k} S_{i-1} \\
&<& (k+1)
\end{array}\end{displaymath}

Amortized cost of pop when $1/2k < \alpha_{i},\alpha_{i-1} < 1/k$


\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{actual cost} + \mbox{potential...
...) - (S_{i-1}/k - N_{i-1}) \\
&=& 1 + 1 \\
&=& 2
\end{array}\end{displaymath}

Amortized cost of pop when $\alpha_{i},\alpha_{i-1} \geq 1/k$


\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{actual cost} + \mbox{potential...
...S_{i-1}) - (k\times N_{i-1} - S_{i-1}) \\
&=& 1-k
\end{array}\end{displaymath}

Amortized cost of pop when $1/2k < \alpha_{i} < 1/k$ and $\alpha_{i-1} \geq 1/k$


\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{actual cost} + \mbox{potential...
...ac{k+1}{k} S_{i} - \frac{k+1}{k} S_{i} \\
&<& k+1
\end{array}\end{displaymath}

Amortized cost of pop when $\alpha_{i-1} \leq 1/2k$ with compaction

If the $i^th$ operation is a pop and we find that the load factor $\alpha \leq 1/2k$ then we apply compaction. Compaction reduces the size of the allocated array by $1/2$. So after compaction $\alpha \geq 2(1/2k) = 1/k$. The amortized cost of the pop and compaction is as follows.

\begin{displaymath}
\begin{array}{rcl}
&=& \mbox{copying cost} + \mbox{potentia...
.../2k\times S_{i-1} - k \\
&\leq& -k \\
&<& 1 \\
\end{array}\end{displaymath}

We have proved that the amortized cost of both push and pop with all possible values of the load factor $\alpha$ to be $O(1)$.

No comments: