Commit b34e2144 authored by Václav Končický

Dynamization: Semidynamization and worst-case version

parent 156967af
@@ -73,12 +73,12 @@ limit}. We use it to determine if a tree is balanced enough.
$\O(\log_{1/\alpha} n)$.
}
\proof{
\proof
Choose an arbitrary path from the root to a leaf and track the node
sizes. The root has size $n$. Each subsequent node has size at most
$\alpha$ times the size of its parent. Once we reach a leaf, its size is 1.
Thus the path can contain at most $\log_{1/\alpha} n$ edges.
}
\qed
Therefore, we want to keep the nodes balanced between any operations. If any
node becomes unbalanced, we take the highest such node $v$ and rebuild its
@@ -101,7 +101,7 @@ show this more in detail for insertion.
n)$, with constant factor dependent on $\alpha$.
}
\proof{
\proof
We define a potential as a sum of ``badness'' of all tree nodes. Each node will
contribute by the difference of sizes of its left and right child. To make
sure that perfectly balanced subtrees do not contribute, we clamp the difference of
@@ -132,6 +132,249 @@ contributions stay the same. Thus, the potential decreases by at least
$(2\alpha - 1) \cdot s(v) \in \Theta(s(v))$. By multiplying the potential by a
suitable constant, the real cost $\Theta(s(v))$ of rebuild will be fully
compensated by the potential decrease, yielding zero amortized cost.
\qed
\section{General semidynamization}
Let us have a static data structure $S$. We do not need to know how the data
structure is implemented internally. We would like to use $S$ as a ``black
box'' to build a (semi)dynamic data structure $D$ which supports queries of $S$
but also allows element insertion.
This is not always possible: the data structure needs to support a specific
type of queries, namely those answering {\I decomposable search problems}.
\defn{
A {\I search problem} is a mapping $f: U_Q \times 2^{U_X} \to U_R$ where $U_Q$
is a universe of queries, $U_X$ is a universe of elements and $U_R$ is a set of
possible answers.
}
\defn{
A search problem is {\I decomposable} if there exists an operator $\sqcup: U_R
\times U_R \to U_R$ computable in time $\O(1)$\foot{
The constant time constraint is only needed for a good time complexity of $D$.
If it is not met, the construction will still work correctly. Most practical decomposable
problems meet this condition.}
such that for all $A, B \subseteq U_X$ with $A \cap B = \emptyset$ and for all $q
\in U_Q$: $$ f(q, A \cup B) = f(q, A) \sqcup f(q, B).$$
}
\examples
\list{o}
\: Let $X \subseteq {\cal U}$. Is $q \in X$? This is a classic search problem
where the universes $U_Q, U_X$ are both the set ${\cal U}$ and the possible replies are
$U_R = \{\hbox{true}, \hbox{false}\}$. This search problem is decomposable, the
operator $\sqcup$ is a simple binary \alg{or}.
\: Let $X$ be a set of points in the plane. For a point $q$, what is the distance
of $q$ and the point $x \in X$ closest to $q$? This is a search problem where
$U_Q = U_X = \R^2$ and $U_R = \R^+_0$. It is also decomposable -- $\sqcup$
returns the minimum.
\: Let $X$ be a set of points in the plane. Is $q$ in the convex hull of $X$? This
search problem is not decomposable -- it is enough to choose $X = \{a, b\}$ and
a point $q \notin X$ lying on the segment between $a$ and $b$. If $A = \{a\}$
and $B = \{b\}$, both subqueries answer negatively. However, the answer for
$X$ is positive, because $q$ is a convex combination of $a$ and $b$. No
operator $\sqcup$ can produce it from two negative answers, as a point $q$
outside the segment yields the same two negative answers and a negative
overall answer.
\endlist
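The first two examples can be checked on small inputs. The following is a minimal sketch in Python; the function names {\tt member} and {\tt nearest\_dist} are hypothetical, and $f$ is evaluated by brute force rather than by any particular static structure.

```python
import math

def member(q, xs):
    """f(q, X) = [q in X]; partial answers combine with logical or."""
    return q in xs

def nearest_dist(q, pts):
    """f(q, X) = distance from q to the closest point of X; answers combine with min."""
    return min((math.dist(q, p) for p in pts), default=math.inf)

# Disjoint sets A and B: the answer on the union equals the combined answers.
A, B = {1, 4, 9}, {2, 3}
for q in (4, 7):
    assert member(q, A | B) == (member(q, A) or member(q, B))

P, R = {(0.0, 0.0), (5.0, 5.0)}, {(1.0, 1.0)}
q = (2.0, 2.0)
assert nearest_dist(q, P | R) == min(nearest_dist(q, P), nearest_dist(q, R))
```

The default value {\tt math.inf} for an empty point set is an assumption of this sketch; it acts as a neutral element for the minimum.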
For a decomposable search problem $f$ we can thus split (decompose) any query
into two queries on disjoint element subsets, compute results on them
separately and then combine them in constant time to the final result. We can
further chain the decomposition on each subset, which lets us decompose the query
into an arbitrary number of subsets.
We can therefore use multiple data structures $S$ as blocks, and to answer a
query we simply query all blocks, and then combine their answers using
$\sqcup$. We will show this construction in detail.
\subsection{Construction}
First, let us denote a few parameters for the static and dynamic data
structure.
\nota{For a data structure $S$ containing $n$ elements and answering a
decomposable search problem $f$ and the resulting dynamic data structure $D$:}
\tightlist{o}
\: $B_S(n)$ is time complexity of building $S$,
\: $Q_S(n)$ is time complexity of query on $S$,
\: $S_S(n)$ is the space complexity of $S$.
\medskip
\: $Q_D(n)$ is time complexity of query on $D$,
\: $S_D(n)$ is the space complexity of $D$,
\: $\bar I_D(n)$ is {\I amortized} time complexity of insertion to $D$.
\endlist
We assume that $Q_S(n)$, $B_S(n)/n$, $S_S(n)/n$ are all nondecreasing functions.
We decompose the set $X$ into blocks $B_i$ such that $|B_i| \in \{0, 2^i\}$,
$\bigcup_i B_i = X$ and $B_i \cap B_j = \emptyset$ for all $i \neq
j$. Let $|X| = n$. Since $n = \sum_i n_i 2^i$ with $n_i \in \{0, 1\}$, the
binary representation of $n$ uniquely determines the block structure: $B_i$ is
nonempty exactly when $n_i = 1$. Thus, the total number of blocks is at
most $\lfloor \log n \rfloor + 1$.
For each nonempty block $B_i$ we build a static structure $S$ of size $2^i$.
Since $f$ is decomposable, a query on the structure will run queries on each
nonempty block, and then combine them using $\sqcup$:
$$ f(q, X) = f(q, B_0) \sqcup f(q, B_1) \sqcup \dots \sqcup f(q, B_k),$$
where $k = \lfloor \log n \rfloor$ and empty blocks are left out.
TODO image
\lemma{$Q_D(n) \in \O(Q_S(n) \cdot \log n)$.}
\proof
Let $|X| = n$. Then the block structure is determined and each application of
$\sqcup$ takes constant time, so $Q_D(n) = \sum_{i: B_i \neq \emptyset}
\bigl(Q_S(2^i) + \O(1)\bigr)$. Since $Q_S(x) \leq Q_S(n)$ for all $x \leq n$
and there are $\O(\log n)$ nonempty blocks, the bound follows.
\qed
Now let us calculate the space complexity of $D$.
\lemma{$S_D(n) \in \O(S_S(n))$.}
\proof
For $|X| = n$ let $I = \{i \mid B_i \neq \emptyset\}$. Then for each $i \in I$
we store a static data structure $S$ with $2^i$ elements contained in this
block. Therefore, $S_D(n) = \sum_{i \in I} S_S(2^i)$. Since $S_S(n)/n$ is
assumed to be nondecreasing,
$$
\sum_{i \in I} S_S(2^i)
= \sum_{i \in I} {S_S(2^i) \over 2^i} \cdot 2^i
\leq {S_S(n) \over n} \cdot \sum_{i \in I} 2^i
= {S_S(n) \over n} \cdot n
= S_S(n).
$$
\qed
It might be advantageous to store the elements in each block separately so that
we do not have to inspect the static structure and extract the elements from
it, which may take additional time.
An insertion of $x$ will act like an addition of 1 to a binary number. Let $i$
be the smallest index such that $B_i = \emptyset$. We create a new block $B_i$
with elements $B_0 \cup B_1 \cup \dots \cup B_{i-1} \cup \{x\}$. This new block
has $1 + \sum_{j=0}^{i-1} 2^j = 2^i$ elements, which is the required size for
$B_i$. Finally, we remove all blocks $B_0, \dots, B_{i-1}$ and add $B_i$.
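The query and insertion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class {\tt SemiDynamic}, its parameters {\tt build}, {\tt query}, {\tt combine} and {\tt identity}, and the helper {\tt sorted\_member} are hypothetical names. The parameter {\tt identity} stands for the answer on the empty set, which the construction itself does not require. Each block keeps its elements next to the static structure, as suggested earlier.

```python
from bisect import bisect_left

class SemiDynamic:
    """Binary-counter dynamization; blocks[i] is None or (elements, static structure)."""

    def __init__(self, build, query, combine, identity):
        self.build, self.query = build, query
        self.combine, self.identity = combine, identity
        self.blocks = []

    def insert(self, x):
        carry = [x]
        i = 0
        # Collect B_0, ..., B_{i-1} up to the first empty slot, like a binary carry.
        while i < len(self.blocks) and self.blocks[i] is not None:
            carry.extend(self.blocks[i][0])
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        # The new block B_i holds 2**i elements and is built from scratch.
        self.blocks[i] = (carry, self.build(carry))

    def ask(self, q):
        answer = self.identity
        for block in self.blocks:
            if block is not None:
                answer = self.combine(answer, self.query(block[1], q))
        return answer

def sorted_member(arr, q):
    """Membership query on a sorted list using binary search."""
    i = bisect_left(arr, q)
    return i < len(arr) and arr[i] == q

d = SemiDynamic(build=sorted, query=sorted_member,
                combine=lambda a, b: a or b, identity=False)
for x in [5, 3, 8, 1, 9]:
    d.insert(x)
sizes = [len(b[0]) if b else 0 for b in d.blocks]  # 5 is 101 in binary
```

After the five insertions, the nonempty blocks are $B_0$ and $B_2$, matching the binary representation of 5.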
TODO image
\lemma{$\bar I_D(n) \in \O(B_S(n)/n \cdot \log n)$.}
\proof
Since the last creation of $B_i$ there had to be at least $2^i$
insertions. Amortized over these insertions, the cost of building $B_i$ is
$B_S(2^i) / 2^i$ per element. As this function is nondecreasing, we can bound
it from above by $B_S(n) / n$. However, one element can participate in up to
$\log n$ rebuilds during the life of the structure, since each rebuild moves
it to a block with a higher index. Therefore, each element needs to save up
cost $\log n \cdot B_S(n) / n$ to pay for all rebuilds.
\qed
\theorem{
Let $S$ be a static data structure answering a decomposable search problem $f$.
Then there exists a semidynamic data structure $D$ answering $f$ with parameters
\tightlist{o}
\: $Q_D(n) \in \O(Q_S(n) \cdot \log n)$,
\: $S_D(n) \in \O(S_S(n))$,
\: $\bar I_D(n) \in \O(B_S(n)/n \cdot \log n)$.
\endlist
}
\example
If we use a sorted array with binary search to look up elements in a static
set, we can use this technique to create a semidynamic data structure for general
sets. It will require $\Theta(n)$ space and a query will take $\Theta(\log^2
n)$ time as we need to binary search in each array. Since building a block
requires sorting its elements, it takes $\Theta(n \log n)$ time and insertion thus
costs $\Theta(\log^2 n)$ amortized time.
We can speed up insertion. Instead of sorting the new block from scratch, we
can merge the already sorted blocks in $\Theta(n)$ time, which improves
insertion to $\O(\log n)$ amortized time.
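The merging variant can be sketched as follows. This is a minimal illustration, assuming that each block is stored directly as a sorted Python list; the function name {\tt insert\_sorted} is hypothetical.

```python
from heapq import merge

def insert_sorted(blocks, x):
    """blocks[i] is None or a sorted list of length 2**i; modified in place."""
    carry = [x]
    i = 0
    while i < len(blocks) and blocks[i] is not None:
        # Merging two sorted lists takes linear time, so nothing is re-sorted.
        carry = list(merge(carry, blocks[i]))
        blocks[i] = None
        i += 1
    if i == len(blocks):
        blocks.append(None)
    blocks[i] = carry

blocks = []
for x in [4, 2, 7, 1]:
    insert_sorted(blocks, x)
# Four elements end up in the single block B_2.
```

The carried list has sizes $1, 2, 4, \dots$ before the successive merges, so the total work for creating $B_i$ is linear in $2^i$.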
In general, the bound for insertion is not tight. If $B_S(n) \in
\Theta(n^\varepsilon)$ for some $\varepsilon > 1$, the costs $B_S(2^i)/2^i$
form a geometric sequence, so the logarithmic factor is dominated
and $\bar I_D(n) \in \O(n^{\varepsilon - 1})$.
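To see why the logarithmic factor disappears, here is a short calculation
under the assumption $B_S(m) = m^\varepsilon$ with $\varepsilon > 1$ (this exact
form of $B_S$ is chosen only for illustration). An element pays $B_S(2^i)/2^i$
for the rebuild that moves it into $B_i$, so over all indices
$$
\sum_{i=0}^{\log n} {B_S(2^i) \over 2^i}
= \sum_{i=0}^{\log n} 2^{(\varepsilon-1) i}
= {2^{(\varepsilon-1)(\log n + 1)} - 1 \over 2^{\varepsilon-1} - 1}
\leq {2^{\varepsilon-1} \over 2^{\varepsilon-1} - 1} \cdot n^{\varepsilon-1},
$$
which is $\O(n^{\varepsilon-1}) = \O(B_S(n)/n)$ with no extra logarithm.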
\subsection{Worst-case semidynamization}
So far we have created a data structure that behaves well in the long run, but a
single insertion can take a long time. This may be unsuitable for applications
that require low latency. In such cases, we would like each insertion to be fast
even in the worst case.
Our construction can be deamortized at the price of a more complicated
semidynamic data structure. We do this by not
constructing the block at once, but spreading the construction out so that on
each operation we do a small amount of work on it until eventually the whole
block is constructed.
However, insertion is not the only operation: we can also ask queries, even
during the construction process. Thus we must keep the old structures until the
construction finishes. As a consequence, more than one block of each size may
exist at the same time.
For each rank $i$ let $B_i^0, B_i^1, B_i^2$ be complete blocks participating in
queries. No element is contained in more than one complete block and the union
of all complete blocks is the whole set $X$.
Next let $B_i^*$ be a block in construction. Whenever two blocks $B_i^a, B_i^b$
of the same rank $i$ meet, we will immediately start building $B_{i+1}^*$ using
elements from $B_i^a \cup B_i^b$.
This construction will require $2^{i+1}$
steps until $B_{i+1}^*$ is finished, where each step is allotted
$B_S(2^{i+1})/2^{i+1}$ time so that the steps together cover the whole
construction. Once we finish $B_{i+1}^*$, we add it to the structure as one of
the three complete blocks of rank $i+1$ and finally remove $B_i^a$ and $B_i^b$.
We will show that, using this scheme, this number of blocks is enough to
maintain the structure.
\lemma{
At any point of the structure's life, for each rank $i$, there are at most
three finished blocks and at most one block in construction.
}
\proof
For an empty structure, this certainly holds.
Consider a situation when two blocks $B_i^0$ and $B_i^1$ meet and $B_i^1$ has
just been finalized. Then we start constructing $B_{i+1}^*$. After $2^{i+1}$
steps, $B_{i+1}^*$ is finished and added as a complete block of rank $i+1$,
and blocks $B_i^0$, $B_i^1$ are removed.
A new block $B_i^2$ may appear before that, but at the earliest
$2^i$ steps after $B_i^1$. For a fourth block $B_i^3$ to appear, another $2^i$ steps
are required. The fourth block thus appears no sooner than $2 \cdot 2^i =
2^{i+1}$ steps after the construction started, by which time $B_{i+1}^*$ has
already been finalized, leaving at most two blocks of rank $i$
and no block of rank $i+1$ in construction.
\qed
An insertion is now done by simply creating a new block $B_0$. Next, we
additionally run one step of construction for each $B_j^*$. There may be up to
$\log n$ blocks in construction.
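The block counts can be checked empirically by a small simulation that tracks only the number of blocks per rank, not their contents. The sketch below makes several assumptions that the text leaves open: the function name {\tt simulate} is hypothetical, construction steps are run before the new block $B_0$ is added, steps are processed from the highest rank down, and a construction started during one insertion receives its first step during the next one.

```python
def simulate(n):
    """Run n insertions; return (max finished blocks of one rank, stored elements)."""
    complete = [0]   # complete[i] = number of finished blocks of rank i
    building = {}    # target rank -> remaining construction steps
    worst = 0
    for _ in range(n):
        # One construction step for every block in progress, highest rank first.
        for rank in sorted(building, reverse=True):
            building[rank] -= 1
            if building[rank] == 0:
                del building[rank]
                complete[rank - 1] -= 2   # the two source blocks are removed
                if rank == len(complete):
                    complete.append(0)
                complete[rank] += 1
        complete[0] += 1                  # the inserted element forms a rank-0 block
        # Two finished blocks of the same rank meet: start building the next rank.
        for i, count in enumerate(complete):
            if count >= 2 and i + 1 not in building:
                building[i + 1] = 2 ** (i + 1)
        # The count is sampled once per insertion, after all updates.
        worst = max(worst, max(complete))
    return worst, sum(c * 2 ** i for i, c in enumerate(complete))

worst, stored = simulate(1000)
```

Under these timing assumptions, the simulation never observes more than three finished blocks of the same rank, and the finished blocks always hold all inserted elements.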
\theorem{
Let $S$ be a static data structure answering a decomposable search problem $f$. Then
there exists a semidynamic data structure $D$ answering $f$ with parameters
\tightlist{o}
\: $Q_D(n) \in \O(Q_S(n) \cdot \log n)$,
\: $S_D(n) \in \O(S_S(n))$,
\: $I_D(n) \in \O(B_S(n)/n \cdot \log n)$ worst-case.
\endlist
}
\proof
Since there is now a constant number of blocks of each rank, the query time and
space complexities have increased only by a constant factor compared to the previous
technique.
Each insertion builds a block of size 1 and then runs up to $\log n$
construction steps, each taking $B_S(2^i)/2^i \leq B_S(n)/n$ time. Summing this
together, we get the required upper bound.
\qed
\endchapter