\chapter[dynamic]{Dynamization}
Depending on which operations are supported, a data structure can be:
\list{o}
\: {\I static} if no operation after building the structure alters the data,
\: {\I semidynamic} if data insertion is possible as an operation,
\: {\I fully dynamic} if deletion of inserted data is allowed along with insertion.
\endlist
In this chapter we will look at techniques of {\I dynamization} --
transformation of a static data structure into a (semi)dynamic data structure.
\section{Structure rebuilding}
Consider a data structure with $n$ elements in which a modification can cause
problems that are too expensive to fix directly. Instead of fixing them, we
give up and rebuild the structure completely anew. If a rebuild takes
$\Theta(n)$ time and we perform it only once per $\Theta(n)$ operations, its
cost amortizes over those operations. Let us look at a few such cases.
An array is a structure with a limited capacity $c$. While it is dynamic (we
can insert or remove elements at the end), we cannot insert new elements
indefinitely. Once we run out of space, we build a new array with capacity $2c$
and copy the elements of the old one into it.
Since we must insert at least $\Theta(n)$ elements to reach the limit again
after a rebuild, and a rebuild takes $\O(n)$ time, this amortizes to $\O(1)$
time per insertion.
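To make the doubling rule concrete, the following Python sketch shows such an
array; the class name, the method names and the initial capacity are
illustrative, not part of the description above.

class DoublingArray:
    """Array with capacity doubling: amortized O(1) appends."""

    def __init__(self):
        self.capacity = 1                  # illustrative initial capacity c
        self.size = 0
        self.items = [None] * self.capacity

    def append(self, x):
        if self.size == self.capacity:
            # Out of space: rebuild with capacity 2c by copying all elements.
            # This costs Theta(n), but happens only after Theta(n) appends.
            self.capacity *= 2
            new_items = [None] * self.capacity
            new_items[:self.size] = self.items
            self.items = new_items
        self.items[self.size] = x
        self.size += 1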
Another example of such a structure is a $y$-fast tree. It is parametrized by
a block size which is required to be $\Theta(\log n)$ for good time complexity.
If we let $n$ change so much that $\log n$ changes asymptotically, everything
breaks. We can save this by rebuilding the tree before that happens; for
$\log n$ to change asymptotically, $n$ must change by a constant factor, which
takes $\Omega(n)$ operations.
Consider a data structure where instead of proper deletion of elements we just
replace them with ``tombstones''. When we run a query, we ignore them. After
enough deletions, most of the structure becomes filled with tombstones, leaving
too little space for proper elements and slowing down the queries. Once again,
the fix is simple -- once at least $n/2$ of the elements are tombstones, we
rebuild the structure without them. To accumulate $n/2$ tombstones we need to
delete $\Theta(n)$ elements, so if a rebuild takes $\Theta(n)$ time, its cost
again amortizes.
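The following Python sketch illustrates lazy deletion with tombstones on a
sorted-array set; the class name, the membership query, and the exact rebuild
threshold are illustrative.

import bisect

class TombstoneSet:
    """Sorted-array set with lazy deletion via tombstones."""

    def __init__(self, items=()):
        self.keys = sorted(items)      # stored keys, live and tombstoned
        self.dead = set()              # keys marked as deleted (tombstones)

    def contains(self, x):
        # Queries simply ignore tombstoned keys.
        i = bisect.bisect_left(self.keys, x)
        return i < len(self.keys) and self.keys[i] == x and x not in self.dead

    def delete(self, x):
        if not self.contains(x):
            return
        self.dead.add(x)
        # Rebuild once at least half of the stored keys are tombstones;
        # this costs Theta(n) but requires Theta(n) deletions to trigger.
        if 2 * len(self.dead) >= len(self.keys):
            self.keys = [k for k in self.keys if k not in self.dead]
            self.dead.clear()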
\subsection{Local rebuilding}
In many cases it is enough to rebuild just a part of the structure to fix
local problems. Once again, if the rebuilt part has size $k$, we want at least
$\Theta(k)$ operations to have taken place since its last rebuild. The cost of
the rebuild can then be amortized over those operations.
One such structure is a binary search tree. We start with a perfectly
balanced tree. As we insert or remove nodes, the tree structure degrades over
time. With a particular choice of operations, we can force the tree to
degenerate into a long vine, having linear depth.
To fix this problem, we define a parameter $1/2 < \alpha < 1$ as a {\I balance
limit}. We use it to determine if a tree is balanced enough.
\defn{
A node $v$ is balanced if for each of its children $c$ we have $s(c) \leq
\alpha s(v)$, where $s(v)$ denotes the number of nodes in the subtree
rooted at $v$. A tree $T$ is balanced if all its nodes are balanced.
}
\lemma{
If a tree with $n$ nodes is balanced, then its height is
$\O(\log_{1/\alpha} n)$.
}
\proof{
Choose an arbitrary path from the root to a leaf and track the node
sizes. The root has size $n$, and each subsequent node on the path has
size at most $\alpha$ times the size of its parent. Once we reach a
leaf, its size is 1. Thus the path can contain at most
$\log_{1/\alpha} n$ edges.
}
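For example, for $\alpha = 2/3$ the height is at most $\log_{3/2} n \approx
1.71 \cdot \log_2 n$, that is, less than twice the height of a perfectly
balanced tree.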
Therefore, we want all nodes to stay balanced as operations are performed. If
any node becomes unbalanced, we take the highest such node $v$ and rebuild its
subtree $T(v)$ into a perfectly balanced tree.
For $\alpha$ close to $1/2$, any balanced tree closely resembles a perfectly
balanced tree, while for $\alpha$ close to 1 the tree can degenerate much
more. The parameter therefore controls the trade-off between the height of the
tree and the frequency of local rebuilds. The trees maintained this way are
called $BB[\alpha]$ trees.
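The following Python sketch shows one possible implementation of insertion
with local rebuilding; the node layout, the helper names and the particular
value of $\alpha$ are illustrative. In this sketch, unbalanced nodes on the
insertion path are rebuilt on the way back up to the root, so the highest
unbalanced node is the one rebuilt last.

ALPHA = 0.7                        # illustrative balance limit, 1/2 < alpha < 1

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)

def size(v):
    return v.size if v is not None else 0

def balanced(v):
    # A node is balanced if each child's subtree has size at most ALPHA * s(v).
    return max(size(v.left), size(v.right)) <= ALPHA * v.size

def rebuild(v):
    # Rebuild the subtree rooted at v into a perfectly balanced tree
    # in Theta(s(v)) time.
    keys = []
    def collect(u):                            # in-order traversal
        if u is not None:
            collect(u.left); keys.append(u.key); collect(u.right)
    collect(v)
    def build(lo, hi):                         # keys[lo:hi] -> balanced subtree
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        return Node(keys[mid], build(lo, mid), build(mid + 1, hi))
    return build(0, len(keys))

def insert(v, key):
    # Standard BST insertion; every unbalanced node met on the way back to
    # the root is rebuilt, the highest unbalanced one last.
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert(v.left, key)
    else:
        v.right = insert(v.right, key)
    v.size += 1
    return v if balanced(v) else rebuild(v)

Inserting keys in increasing order into such a tree, for example, keeps its
height logarithmic, whereas a plain binary search tree would degenerate into a
vine.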
Rebuilding a subtree $T(v)$ takes $\O(s(v))$ time, but we can show that this
happens infrequently enough. Both insertion and deletion change the number of
nodes by one. To unbalance the root of a perfectly balanced tree, and thus
cause a rebuild, we need to add or remove at least $\Theta(n)$ vertices. We
will show this in more detail for insertion.
\theorem{
Amortized time complexity of the \alg{Insert} operation is $\O(\log
n)$, with the constant factor depending on $\alpha$.
}
\proof{
We define the potential as a sum of ``badness'' over all tree nodes. Each node
contributes the difference of the sizes of its left and right child. To make
sure that perfectly balanced subtrees do not contribute, we clamp a difference
of 1 to 0.
$$\eqalign{
\Phi &:= \sum_v \varphi(v), \quad\hbox{where} \cr
\varphi(v) &:= \cases{
\left\vert s(\ell(v)) - s(r(v)) \right\vert & if this difference is at least~2, \cr
0 & otherwise. \cr
} \cr
}$$
When we add a new leaf, the size of each node on the path from the root to the
new leaf increases by 1, so the difference of its children's sizes changes by
at most 1. Due to the clamping, each such node's contribution to the potential
increases by at most 2.
We spend $\O(\log n)$ time on the operation itself. If all nodes stay balanced
and thus no rebuild takes place, the potential increases by $\O(\log n)$,
resulting in amortized time $\O(\log n)$.
Otherwise, consider the highest unbalanced node $v$. Without loss of
generality, the invariant was broken for its left child $\ell(v)$, thus
$s(\ell(v)) > \alpha \cdot s(v)$. Therefore, the size of the other child is
small: $s(r(v)) < (1 - \alpha) \cdot s(v)$. The contribution of $v$ is
therefore $\varphi(v) > (2\alpha - 1) \cdot s(v)$.
After rebuilding $T(v)$, the subtree becomes perfectly balanced. Therefore for
all nodes $u \in T(v)$ the contribution $\varphi(u)$ becomes zero. All other
contributions stay the same. Thus, the potential decreases by at least
$(2\alpha - 1) \cdot s(v) \in \Theta(s(v))$. By multiplying the potential by a
suitable constant, the real cost $\Theta(s(v))$ of rebuild will be fully
compensated by the potential decrease, yielding zero amortized cost.
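For instance, for $\alpha = 2/3$ the potential drops by at least $s(v)/3$, so
if the rebuild costs at most $C \cdot s(v)$ for some constant $C$, multiplying
the potential by $3C$ is enough for the decrease to pay for the whole rebuild.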
}
\endchapter