Commit b34e2144, authored 4 years ago by Václav Končický
Dynamization: Semidynamization and worst-case version
parent
156967af
vk-dynamic/dynamic.tex
+250
-7
250 additions, 7 deletions
...
...
@@ -73,12 +73,12 @@ limit}. We use it to determine if a tree is balanced enough.
$\O(\log_{1/\alpha} n)$.}

\proof
Choose an arbitrary path from the root to a leaf and track the node
sizes. The root has size $n$. Each subsequent node has size at most $\alpha$
times the size of its parent. Once we reach a leaf, its size is 1. Thus the
path can contain at most $\log_{1/\alpha} n$ edges.
\qed

Therefore, we want to keep the nodes balanced between operations. If any
node becomes unbalanced, we take the highest such node $v$ and rebuild its
...
...
@@ -101,7 +101,7 @@ show this more in detail for insertion.
n)$, with constant factor dependent on $\alpha$.}

\proof
We define a potential as a sum of ``badness'' of all tree nodes. Each node will
contribute the difference of the sizes of its left and right child. To make
sure that perfectly balanced subtrees do not contribute, we clamp the difference of
...
...
@@ -132,6 +132,249 @@ contributions stay the same. Thus, the potential decreases by at least
$(2\alpha - 1) \cdot s(v) \in \Theta(s(v))$. By multiplying the potential by a
suitable constant, the real cost $\Theta(s(v))$ of the rebuild will be fully
compensated by the potential decrease, yielding zero amortized cost.
\qed
\section{General semidynamization}

Let us have a static data structure $S$. We do not need to know how the data
structure is implemented internally. We would like to use $S$ as a ``black
box'' to build a (semi)dynamic data structure $D$ which supports the queries of
$S$ but also allows element insertion.

This is not always possible: the data structure needs to answer a specific
type of queries, namely {\I decomposable search problems}.
\defn{
A {\I search problem} is a mapping $f: U_Q \times 2^{U_X} \to U_R$ where $U_Q$
is a universe of queries, $U_X$ is a universe of elements and $U_R$ is a set of
possible answers.
}
\defn{
A search problem is {\I decomposable} if there exists an operator
$\sqcup: U_R \times U_R \to U_R$ computable in time $\O(1)$\foot{
The constant time constraint is only needed for a good time complexity of $D$.
If it is not met, the construction will still work correctly. Most practical
decomposable problems meet this condition.}
such that $\forall A, B \subseteq U_X$, $A \cap B = \emptyset$ and
$\forall q \in U_Q$:
$$ f(q, A \cup B) = f(q, A) \sqcup f(q, B). $$
}
\examples

\list{o}
\: Let $X \subseteq {\cal U}$. Is $q \in X$? This is a classic search problem
where the universes $U_Q, U_X$ are both the set ${\cal U}$ and the possible
replies are $U_R = \{\hbox{true}, \hbox{false}\}$. This search problem is
decomposable: the operator $\sqcup$ is a simple binary \alg{or}.

\: Let $X$ be a set of points in the plane. For a point $q$, what is the
distance between $q$ and the point $x \in X$ closest to $q$? This is a search
problem where $U_Q = U_X = \R^2$ and $U_R = \R^+_0$. It is also decomposable --
$\sqcup$ returns the minimum.

\: Let $X$ be a set of points in the plane. Is $q$ in the convex hull of $X$?
This search problem is not decomposable -- it is enough to choose
$X = \{a, b\}$ and $q \notin X$. If $A = \{a\}$ and $B = \{b\}$, both
subqueries answer negatively. However, the correct answer depends on whether
$q$ is a convex combination of $a$ and $b$, which the two negative answers do
not reveal.
\endlist
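The first two examples can be checked numerically. The following Python sketch is only an illustration, not part of the original notes; the function names `member` and `nearest_distance` and the sample points are made up here. It verifies $f(q, A \cup B) = f(q, A) \sqcup f(q, B)$ on one disjoint pair $A$, $B$, with $\sqcup$ being `or` for membership and `min` for the nearest-point distance.

```python
import math

def member(q, xs):
    """f(q, X) for the membership problem: is q in X?"""
    return q in xs

def nearest_distance(q, points):
    """f(q, X) for the nearest-point problem; infinity for an empty set."""
    return min((math.dist(q, p) for p in points), default=math.inf)

# Two disjoint sets of points in the plane (hypothetical sample data).
A = {(0.0, 0.0), (3.0, 4.0)}
B = {(1.0, 1.0)}
q = (1.0, 2.0)

# Membership: the combining operator is logical "or".
whole_member = member(q, A | B)
parts_member = member(q, A) or member(q, B)

# Nearest point: the combining operator is "min".
whole_dist = nearest_distance(q, A | B)
parts_dist = min(nearest_distance(q, A), nearest_distance(q, B))
```

Both pairs of values agree on this input. A single input is of course only a sanity check, not a proof of decomposability.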
For a decomposable search problem $f$ we can thus split (decompose) any query
into two queries on disjoint element subsets, compute results on them
separately and then combine them in constant time into the final result. We can
further chain the decomposition on each subset, which allows us to decompose
the query over an arbitrary number of subsets.

We can therefore use multiple data structures $S$ as blocks, and to answer a
query we simply query all blocks and then combine their answers using
$\sqcup$. We will show this construction in detail.
\subsection{Construction}

First, let us denote a few parameters for the static and dynamic data
structure.

\nota{For a data structure $S$ containing $n$ elements and answering a
decomposable search problem $f$ and the resulting dynamic data structure $D$:}
\tightlist{o}
\: $B_S(n)$ is the time complexity of building $S$,
\: $Q_S(n)$ is the time complexity of a query on $S$,
\: $S_S(n)$ is the space complexity of $S$.
\medskip
\: $Q_D(n)$ is the time complexity of a query on $D$,
\: $S_D(n)$ is the space complexity of $D$,
\: $\bar I_D(n)$ is the {\I amortized} time complexity of an insertion to $D$.
\endlist
We assume that $Q_S(n)$, $B_S(n)/n$, $S_S(n)/n$ are all nondecreasing functions.
We decompose the set $X$ into blocks $B_i$ such that $|B_i| \in \{0, 2^i\}$,
$\bigcup_i B_i = X$ and $B_i \cap B_j = \emptyset$ for all $i \neq j$. Let
$|X| = n$. Since $n = \sum_i n_i 2^i$ with $n_i \in \{0, 1\}$, the binary
representation of $n$ uniquely determines the block structure: $B_i$ is
nonempty exactly when $n_i = 1$. Thus, the total number of blocks is at most
$\lfloor \log n \rfloor + 1$.
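As a small worked illustration of how the binary representation fixes the block structure, the following Python sketch lists the ranks of the nonempty blocks for a given $n$. The helper name `block_ranks` is hypothetical and not taken from the notes.

```python
def block_ranks(n):
    """Ranks i of the nonempty blocks B_i for a set of n elements.

    Block B_i is nonempty exactly when bit i of n is set.
    """
    return [i for i in range(n.bit_length()) if (n >> i) & 1]

# n = 13 is 1101 in binary, so blocks of rank 0, 2 and 3 are nonempty.
ranks = block_ranks(13)
sizes = [2 ** i for i in ranks]
```

Here the block sizes are 1, 4 and 8, which sum to 13 as required.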
For each nonempty block $B_i$ we build a static structure $S$ of size $2^i$.
Since $f$ is decomposable, a query on the structure will run queries on each
nonempty block, and then combine them using $\sqcup$:
$$ f(q, X) = f(q, B_0) \sqcup f(q, B_1) \sqcup \dots \sqcup f(q, B_{\lfloor \log n \rfloor}), $$
where the empty blocks are skipped.
TODO image
\lemma{$Q_D(n) \in \O(Q_S(n) \cdot \log n)$.}
\proof
Let $|X| = n$. Then the block structure is determined, and since $\sqcup$ takes
constant time, $Q_D(n) = \sum_{i: B_i \neq \emptyset} \bigl(Q_S(2^i) + \O(1)\bigr)$.
Since $Q_S(x) \leq Q_S(n)$ for all $x \leq n$ and there are $\O(\log n)$
blocks, the bound holds.
\qed
Now let us calculate the space complexity of $D$.

\lemma{$S_D(n) \in \O(S_S(n))$.}
\proof
For $|X| = n$ let $I = \{i \mid B_i \neq \emptyset\}$. Then for each $i \in I$
we store a static data structure $S$ with the $2^i$ elements contained in this
block. Therefore, $S_D(n) = \sum_{i \in I} S_S(2^i)$. Since $S_S(n)/n$ is
assumed to be nondecreasing,
$$
\sum_{i \in I} S_S(2^i)
= \sum_{i \in I} {S_S(2^i) \over 2^i} \cdot 2^i
\leq {S_S(n) \over n} \cdot \sum_{i \in I} 2^i
= {S_S(n) \over n} \cdot n
= S_S(n).
$$
\qed
It might be advantageous to store the elements in each block separately so that
we do not have to inspect the static structure and extract the elements from
it, which may take additional time.
An insertion of $x$ will act like an addition of 1 to a binary number. Let $i$
be the smallest index such that $B_i = \emptyset$. We create a new block $B_i$
with elements $B_0 \cup B_1 \cup \dots \cup B_{i-1} \cup \{x\}$. This new block
has $1 + \sum_{j=0}^{i-1} 2^j = 2^i$ elements, which is the required size for
$B_i$. Finally, we remove all blocks $B_0, \dots, B_{i-1}$ and add $B_i$.
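The construction so far can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the notes' own implementation: the class names `SortedArray` and `SemiDynamic`, the `build`/`combine`/`empty_answer` parameters and the sample data are all hypothetical. The static structure is assumed to expose a `query` method, `empty_answer` is assumed to be a neutral element of $\sqcup$, and the elements of each block are stored next to the static structure so they can be collected on a rebuild.

```python
from bisect import bisect_left

class SortedArray:
    """Example static structure S: membership via binary search."""
    def __init__(self, elements):
        self.items = sorted(elements)      # building costs O(n log n)

    def query(self, q):
        i = bisect_left(self.items, q)
        return i < len(self.items) and self.items[i] == q

class SemiDynamic:
    """Semidynamic wrapper D around a black-box static structure."""
    def __init__(self, build, combine, empty_answer):
        self.build = build                 # builds S from a list of elements
        self.combine = combine             # the operator "sqcup"
        self.empty_answer = empty_answer   # assumed neutral element of combine
        self.blocks = []                   # blocks[i] is None or (elements, S), 2**i elements

    def insert(self, x):
        # Like adding 1 to a binary number: empty B_0..B_{i-1} into a new B_i.
        carry = [x]
        i = 0
        while i < len(self.blocks) and self.blocks[i] is not None:
            carry.extend(self.blocks[i][0])
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        self.blocks[i] = (carry, self.build(carry))

    def query(self, q):
        # Query every nonempty block and combine the partial answers.
        answer = self.empty_answer
        for block in self.blocks:
            if block is not None:
                answer = self.combine(answer, block[1].query(q))
        return answer

d = SemiDynamic(SortedArray, lambda a, b: a or b, False)
for x in [5, 3, 8, 1, 9]:
    d.insert(x)
block_sizes = [len(b[0]) if b is not None else 0 for b in d.blocks]
```

After five insertions the block sizes are 1, 0 and 4, matching the binary representation 101 of 5.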
TODO image
\lemma{$\bar I_D(n) \in \O(B_S(n)/n \cdot \log n)$.}
\proof
Building $B_i$ costs $B_S(2^i)$. Since the last creation of $B_i$ there had to
be at least $2^i$ insertions. Amortized over these insertions, the cost per
element is $B_S(2^i) / 2^i$. As this function is nondecreasing, we can upper
bound it by $B_S(n) / n$. However, one element can participate in up to
$\log n$ rebuilds during the life of the structure, since each rebuild moves it
to a block of higher rank. Therefore, each element needs to store up cost
$\log n \cdot B_S(n) / n$ to pay off all rebuilds.
\qed
\theorem{
Let $S$ be a static data structure answering a decomposable search problem $f$.
Then there exists a semidynamic data structure $D$ answering $f$ with parameters

\tightlist{o}
\: $Q_D(n) \in \O(Q_S(n) \cdot \log n)$,
\: $S_D(n) \in \O(S_S(n))$,
\: $\bar I_D(n) \in \O(B_S(n)/n \cdot \log n)$.
\endlist
}
\example
If we use a sorted array with binary search to look up elements in a static
set, we can use this technique to create a dynamic data structure for general
sets. It will require $\Theta(n)$ space and the query will take
$\Theta(\log^2 n)$ time in the worst case, as we need to binary search in each
array. Since building requires sorting the array, building one requires
$\Theta(n \log n)$ time and insertion thus costs $\Theta(\log^2 n)$ amortized
time.
We can speed up insertion. Instead of building the array anew, we can merge
the sorted arrays in $\Theta(n)$ time, therefore speeding up insertion to
$\O(\log n)$ amortized.
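One way to obtain the linear-time merge is sketched below in Python, as an illustration only; the function names `merge_two` and `merge_blocks` and the sample data are hypothetical. The blocks are merged from the smallest rank upwards. Before merging $B_j$, the accumulated array has $2^j$ elements, so that step costs $\O(2^j)$ and the total over $j < i$ is $\O(2^i)$.

```python
def merge_two(a, b):
    """Merge two sorted lists in time linear in len(a) + len(b)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def merge_blocks(x, blocks):
    """Merge the new element x with sorted blocks B_0, ..., B_{i-1}.

    blocks[j] is assumed to be a sorted list with 2**j elements.
    """
    merged = [x]
    for block in blocks:
        merged = merge_two(merged, block)
    return merged

# Hypothetical sample: blocks of sizes 1, 2 and 4 plus the new element 4.
new_block = merge_blocks(4, [[7], [2, 9], [1, 3, 5, 8]])
```

The result is a sorted array of $2^3 = 8$ elements that can serve directly as the new block $B_3$.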
In general, the bound for insertion is not tight. If
$B_S(n) = \O(n^\varepsilon)$ for $\varepsilon > 1$, the per-element costs
$B_S(2^i)/2^i$ over the ranks form a geometric series, so the logarithmic
factor is dominated and $\bar I_D(n) \in \O(n^{\varepsilon - 1})$.
\subsection{Worst-case semidynamization}

So far we have created a data structure that acts well in the long run, but one
insertion can take a long time. This may be unsuitable for applications where
we require low latency. In such cases, we would like each insertion to be fast
even in the worst case.

Our construction can be deamortized at the price of making the resulting
semidynamic data structure more complicated. We do this by not constructing
the block at once, but decomposing the construction such that on each
operation we do a small amount of work on it until eventually the whole block
is constructed.

However, insertion is not the only operation: we can also ask queries even
during the construction process. Thus we must keep the old structures until the
construction finishes. As a consequence, more than one block of each size may
exist at the same time.
For each rank $i$ let $B_i^0, B_i^1, B_i^2$ be complete blocks participating in
queries. No element is contained in more than one complete block and the union
of all complete blocks is the whole set $X$.
Next let $B_i^*$ be a block in construction. Whenever two blocks
$B_i^a, B_i^b$ of the same rank $i$ meet, we will immediately start building
$B_{i+1}^*$ using elements from $B_i^a \cup B_i^b$.

This construction will take $2^{i+1}$ steps until $B_{i+1}^*$ is finished, with
enough time allocated for each step. Once we finish $B_{i+1}^*$, we add it to
the structure as one of the three complete blocks and finally remove $B_i^a$
and $B_i^b$.
We will show that, using this scheme, this number of blocks is enough to
bookkeep the structure.

\lemma{
At any point of the structure's life, for each rank $i$, there are at most
three finished blocks and at most one block in construction.
}
\proof
For an empty structure, this certainly holds.

Consider a situation when two blocks $B_i^0$ and $B_i^1$ meet and $B_i^1$ has
just been finalized. Then we start constructing $B_{i+1}^*$. After $2^{i+1}$
steps, $B_{i+1}^*$ is finished and added, and blocks $B_i^0$, $B_i^1$ are
removed.

A new block $B_i^2$ may appear earlier. However, this can only happen $2^i$
steps later. For the fourth block $B_i^3$ to appear, another $2^i$ steps are
required. The earliest time is then $2 \cdot 2^i = 2^{i+1}$ steps later, by
which time $B_{i+1}^*$ has already been finalized, leaving at most two blocks
of rank $i$ and no block of rank $i+1$ in construction.
\qed
An insertion is now done by simply creating a new block $B_0$. Next, we
additionally run one step of construction for each $B_j^*$. There may be up to
$\log n$ blocks in construction.
\theorem{
Let $S$ be a static data structure answering a decomposable search problem
$f$. Then there exists a semidynamic data structure $D$ answering $f$ with
parameters

\tightlist{o}
\: $Q_D(n) \in \O(Q_S(n) \cdot \log n)$,
\: $S_D(n) \in \O(S_S(n))$,
\: $I_D(n) \in \O(B_S(n)/n \cdot \log n)$ worst-case.
\endlist
}
\proof
Since there is now a constant number of blocks of each rank, the query time and
space complexities have increased by a constant factor compared to the previous
technique.

Each insertion builds a block of size 1 and then runs up to $\log n$
construction steps, one for each block $B_i^*$ in construction, each taking
$B_S(2^i)/2^i$ time. Since $B_S(n)/n$ is nondecreasing, each step takes at most
$B_S(n)/n$ time. Summing this over all steps, we get the required upper bound.
\qed
\endchapter