datovky / ds2-notes · Commit 04bd1ec6
authored 3 years ago by Jirka Skrobanek

Proof-reading of dynamize chapter

parent c0f51118 · 1 merge request: !2 Proof-reading of dynamize chapter

Showing 1 changed file: 51-dynamize/dynamize.tex (+59 additions, −130 deletions)
@@ -9,44 +9,45 @@ A data structure can be, depending on what operations are supported:
\tightlist{o}
\: {\I static} if all operations after building the structure do not alter the
data\foot{As a side note regarding this terminology, let us remark on the
distinction between an update of the proper data stored inside a data structure
and an update of some auxiliary data. For example, a splay tree can change
shape even though only queries and no updates happen.},
\: {\I semi-dynamic} if data insertion is possible as an operation,
\: {\I (fully) dynamic} if deletion of data is allowed on top of insertion.
\endlist
Static data structures are often sufficient for many applications where
updates are simply not required.
A sorted array is a typical example of a static data structure to store an
ordered set of $n$ elements. Its supported operations are $\alg{Index}(i)$,
which simply returns the $i$-th smallest element in constant time, and
$\alg{Find}(x)$, which finds $x$ and its index $i$ in the array using binary
search in time $\O(\log n)$.
However, if we wish to insert a new element into an already existing sorted
array, this operation will take $\Omega(n)$ -- we must shift the elements to
keep the sorted order. In order to have fast insertion, we may decide to use a
different dynamic data structure, a binary search tree (BST) for instance. In
that case, the operation \alg{Index} slows down to logarithmic time.

What happened to \alg{Index} is a frequent inconvenience when we modify data
structures to support updates. Oftentimes, making one operation run faster is
only possible by making another operation run slower. One must therefore
strike a careful balance among the complexities of individual operations,
based on how often they are needed. It is the subject of this chapter to show
some efficient techniques of {\I dynamization} -- transformation of a static
data structure into a (semi-)dynamic data structure.

As we have seen with the sorted array, simple and straightforward attempts
often lead to slow operations. Therefore, we want to dynamize data structures
in such a way that the operations stay reasonably fast.
\section{Global rebuilding}
Consider a data structure with $n$ elements such that modifying it may cause
severe problems that are too hard to fix easily. In that case, we give up on
fixing it and rebuild it completely anew.

If initializing the structure with $n$ elements takes $f(n)$ time steps and
we perform the rebuild after $\Theta(n)$ modifying operations, we can amortize
the cost of the rebuild into the operations. This adds an amortized factor
$\O(f(n)/n)$ to their time complexity, given that $n$ does not change
asymptotically between the rebuilds.
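For a concrete feel of this factor, consider a small computation (ours, not
part of the original notes): if the structure is built by sorting, so that
$f(n) = \Theta(n \log n)$, then a rebuild triggered after $\Theta(n)$
modifying operations charges each of them
$$\O(f(n)/n) = \O\bigl((n \log n)/n\bigr) = \O(\log n)$$
extra amortized time.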
\examples
@@ -55,13 +56,13 @@ the rebuilds.
\: An array is a structure with limited capacity $c$. While it is dynamic (we
can insert or remove elements at the end), we cannot insert new elements
indefinitely. Once we run out of space, we build a new structure with capacity
$2c$ and copy to it the elements from the old structure. Since we inserted at
least $\Theta(n)$ elements to reach the limit from a freshly rebuilt
structure, this amortizes to $\O(1)$ time per insertion, as we can rebuild an
array in time $\O(n)$. (This example, together with the tombstone deletions
below, is sketched in code after this list.)
\: Another example of such a structure is a $y$-fast trie. It is parametrized
by the block size, which is required to be $\Theta(\log n)$ for good time
complexity. If we let $n$ change enough such that $\log n$ changes
asymptotically, the proven time complexity no longer holds.
@@ -70,12 +71,11 @@ changes by a constant factor (then $\log n$ changes by a constant additively).
This happens no sooner than after $\Theta(n)$ insertions or deletions.
\: Consider a data structure where instead of proper removal of elements on
deletion, we just replace them with ``tombstones''. When we run a query later,
we just ignore the tombstones. After enough deletions, most of the structure
becomes filled with tombstones, leaving too little space for proper elements
and slowing down the queries. Once again, the idea is simple -- once at least
$n/2$ of the elements are tombstones, we rebuild the structure. To reach $n/2$
tombstones we need to delete $\Theta(n)$ elements.
\endlist
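To make both rebuilding rules above concrete, here is a minimal Python sketch
(ours -- the class and all names are illustrative, not from the notes):
insertion doubles the capacity once it is exhausted, and deletion only plants
tombstones, triggering a full rebuild once they make up half of the structure.

class RebuildingArray:
    def __init__(self):
        self.slots = []        # stored elements; None marks a tombstone
        self.capacity = 1      # capacity of the current "static" structure
        self.tombstones = 0

    def insert(self, x):
        if len(self.slots) == self.capacity:
            # out of space: rebuild into a structure of capacity 2c; the O(n)
            # copy amortizes over the Theta(n) insertions since the last rebuild
            self.capacity *= 2
        self.slots.append(x)

    def delete(self, x):
        self.slots[self.slots.index(x)] = None   # plant a tombstone (assumes x is present)
        self.tombstones += 1
        if 2 * self.tombstones >= len(self.slots):
            # at least n/2 tombstones: rebuild without them in O(n) time,
            # paid for by the Theta(n) deletions that created them
            self.slots = [y for y in self.slots if y is not None]
            self.tombstones = 0

    def contains(self, x):
        return any(y == x for y in self.slots if y is not None)  # skip tombstones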
@@ -83,94 +83,23 @@ elements.
\subsection{Local rebuilding}
In many cases, it is enough to rebuild just a part of the structure to fix
local problems. If a segment of the structure has size $k$, we want to space
out reconstructions at least $\Theta(k)$ operations apart, allowing their cost
to amortize into the other operations.
One such structure is a BST. Imagine starting with a perfectly balanced tree
and then inserting and removing nodes: the tree structure degrades over time.
With a particular choice of operations, we can force the tree to degenerate
into a long vine of linear depth.
To fix this problem, we define a parameter $1/2 < \alpha < 1$ as a
{\I balance limit}. We use it to determine if a tree is balanced enough.
\defn{A node $v$ is balanced, if for each of its children $c$ we have
$s(c) \leq \alpha s(v)$. A tree $T$ is balanced, if all its nodes are
balanced.}
\lemma{If a tree with $n$ nodes is balanced, then its height is
$\O(\log_{1/\alpha} n)$.}
\proof
Choose an arbitrary path from the root to a leaf and track the node sizes. The
root has size $n$, and each subsequent node has size at most $\alpha$ times
the size of its parent, so the $k$-th node of the path has size at most
$\alpha^k n$. Once we reach a leaf, its size is 1, and $\alpha^k n \geq 1$
forces $k \leq \log_{1/\alpha} n$. Thus the path can contain at most
$\log_{1/\alpha} n$ edges.
\qed
Therefore, we want to keep the nodes balanced between any operations. If any
node becomes unbalanced, we take the highest such node $v$ and rebuild its
subtree $T(v)$ into a perfectly balanced tree.
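As an illustration, a minimal Python sketch of this rule (ours; the concrete
value of $\alpha$ and all names are our choices, not the notes'): we insert as
into a plain BST, update subtree sizes along the search path, and rebuild the
subtree of the highest node that became unbalanced.

ALPHA = 0.7          # balance limit, 1/2 < ALPHA < 1

class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.size = key, None, None, 1

def size(v):
    return v.size if v is not None else 0

def rebuild(v):
    """Rebuild the subtree rooted at v into a perfectly balanced tree."""
    keys = []
    def collect(u):                      # in-order walk yields sorted keys
        if u: collect(u.left); keys.append(u.key); collect(u.right)
    collect(v)
    def build(lo, hi):                   # balanced tree from keys[lo:hi]
        if lo >= hi: return None
        mid = (lo + hi) // 2
        u = Node(keys[mid])
        u.left, u.right = build(lo, mid), build(mid + 1, hi)
        u.size = hi - lo
        return u
    return build(0, len(keys))

def insert(root, key):
    """Insert key and return the new root."""
    path, v = [], root
    while v is not None:                 # ordinary BST descent
        path.append(v)
        v = v.left if key < v.key else v.right
    leaf = Node(key)
    if not path:
        return leaf
    if key < path[-1].key: path[-1].left = leaf
    else:                  path[-1].right = leaf
    for u in path:                       # sizes on the path grow by one
        u.size += 1
    for i, u in enumerate(path):         # highest unbalanced node, if any
        if max(size(u.left), size(u.right)) > ALPHA * u.size:
            sub = rebuild(u)             # O(s(u)) local rebuild
            if i == 0:
                return sub
            parent = path[i - 1]
            if parent.left is u: parent.left = sub
            else:                parent.right = sub
            break
    return root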
For $\alpha$ close to $1/2$, any balanced tree closely resembles a perfectly
balanced tree, while with $\alpha$ close to 1 the tree can degenerate much
more. This parameter therefore controls how often we cause local rebuilds and
the tree height. The trees defined by this parameter are called
$BB[\alpha]$ trees.
Rebuilding a subtree $T(v)$ takes $\O(s(v))$ time, but we can show that this
happens infrequently enough. Both insertion and deletion change the number of
nodes by one. To unbalance the root of a perfectly balanced tree, and thus
cause a rebuild, we need to add or remove at least $\Theta(n)$ vertices. We
will show this in more detail for insertion.
\theorem{The amortized time complexity of the \alg{Insert} operation is
$\O(\log n)$, with the constant factor dependent on $\alpha$.}
\proof
We define the potential as the sum of the ``badness'' of all tree nodes. Each
node contributes the difference of the sizes of its left and right children.
To make sure that perfectly balanced subtrees do not contribute, we clamp a
difference of 1 to 0.
$$\eqalign{
\Phi &:= \sum_v \varphi(v), \quad\hbox{where} \cr
\varphi(v) &:= \cases{
	\left\vert s(\ell(v)) - s(r(v)) \right\vert & if at least~2, \cr
	0 & otherwise. \cr
}\cr
}$$
When we add a new leaf, the size of all nodes on the path to the root
increases by 1, so the contribution of each such node to the potential
increases by at most 2. We spend $\O(\log n)$ time on the operation. If all
nodes stay balanced and thus no rebuild takes place, the potential increases
by $\O(\log n)$, resulting in amortized time $\O(\log n)$.
Otherwise, consider the highest unbalanced node $v$. Without loss of
generality, the invariant was broken for its left child $\ell(v)$, thus
$s(\ell(v)) > \alpha \cdot s(v)$. Therefore, the size of the other child is
small: $s(r(v)) < (1 - \alpha) \cdot s(v)$. The contribution of $v$ is
therefore $\varphi(v) > (2\alpha - 1) \cdot s(v)$.
After rebuilding $T(v)$, the subtree becomes perfectly balanced. Therefore,
for all nodes $u \in T(v)$ the contribution $\varphi(u)$ becomes zero. All
other contributions stay the same. Thus, the potential decreases by at least
$(2\alpha - 1) \cdot s(v) \in \Theta(s(v))$. By multiplying the potential by a
suitable constant, the real cost $\Theta(s(v))$ of the rebuild will be fully
compensated by the potential decrease, yielding zero amortized cost.
\qed
Weight-balanced trees, which maintain balance using an algorithm based on this
idea of partial reconstruction, were introduced in the very first chapter of
these lecture notes.
\section{General semi-dynamization}
Let us have a static data structure $S$. We do not need to know how the data
structure is implemented internally. We would like to use $S$ as a ``black
box'' to build a (semi-)dynamic data structure $D$ which supports the queries
of $S$, but also allows element insertion.

This is not always possible; the data structure needs to support a specific
@@ -178,16 +107,16 @@ type of queries answering {\I decomposable search problems}.
\defn{A {\I search problem} is a mapping $f: U_Q \times 2^{U_X} \to U_R$ where
$U_Q$ is a universe of queries, $U_X$ is a universe of elements and $U_R$ is a
set of possible answers.}
\defn{A search problem is {\I decomposable}, if there exists an operator
$\sqcup: U_R \times U_R \to U_R$ computable in time $\O(1)$\foot{The constant
time constraint is only needed for a good time complexity of $D$. Most
practical decomposable problems do meet this condition. If it is not met, the
construction will still work correctly, but the time complexity may increase.}
such that $\forall A, B \subseteq U_X$, $A \cap B = \emptyset$ and
$\forall q \in U_Q$:
$$f(q, A \cup B) = f(q, A) \sqcup f(q, B).$$
}
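As a quick sanity check (a toy Python illustration of ours), membership is
decomposable, with logical {\I or} playing the role of $\sqcup$:

# f(q, A) asks: "is the query element q present in the set A?"
def f(q, A):
    return q in A

A, B = {1, 2, 3}, {7, 9}      # disjoint subsets of the universe of elements
for q in (2, 7, 5):
    # decomposability: the answer on the union combines the partial answers
    assert f(q, A | B) == (f(q, A) or f(q, B))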
@@ -202,7 +131,7 @@ operator $\sqcup$ is a simple binary \alg{or}.
\: Let $X$ be a set of points in the plane. For a point $q$, what is the
distance between $q$ and the point $x \in X$ closest to $q$? This is a search
problem where $U_Q = U_X = \R^2$ and $U_R$ is the set of non-negative reals.
It is also decomposable -- $\sqcup$ returns the minimum.
\: Let $X$ be a set of points in the plane. Is $q$ in the convex hull of $X$?
This
@@ -242,8 +171,8 @@ decomposable search problem $f$ and the resulting dynamic data structure $D$:}
We assume that $Q_S(n)$, $B_S(n)/n$, $S_S(n)/n$ are all non-decreasing
functions.
We cover the set $X$ by pairwise disjoint blocks $B_i$ such that
$|B_i| \in \{0, 2^i\}$.
Let $|X| = n$. Since $n = \sum_i n_i 2^i$ for $n_i \in \{0, 1\}$, its
binary representation uniquely determines the block structure. Thus, the total
number of blocks is at most $\log n$.
@@ -252,7 +181,7 @@ Since $f$ is decomposable, a query on the structure will run queries on each
block, and then combine them using $\sqcup$:
$$f(q, X) = f(q, B_0) \sqcup f(q, B_1) \sqcup \dots \sqcup f(q, B_i).$$
\lemma{$Q_D(n) \in \O(Q_S(n) \cdot \log n)$.}
\proof
Let $|X| = n$. Then the block structure is determined and $\sqcup$ takes
@@ -302,7 +231,7 @@ unchanged.}
\theorem{Let $S$ be a static data structure answering a decomposable search
problem $f$. Then there exists a semi-dynamic data structure $D$ answering $f$
with parameters
\tightlist{o}
\: $Q_D(n) \in \O(Q_S(n) \cdot \log n)$,
@@ -328,17 +257,17 @@ We can speed up insertion time. Instead of building the list anew, we can merge
the lists in $\Theta(n)$ time, therefore speeding up insertion to
$\O(\log n)$ amortized.
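The whole construction fits into a short Python sketch (ours; it instantiates
the black box $S$ as a sorted list answering membership queries, so the
merge-based insertion just described applies; all names are our own):

import bisect, heapq

class SemiDynamic:
    def __init__(self):
        self.blocks = []      # blocks[i]: None, or a sorted list of 2^i elements

    def insert(self, x):
        # like incrementing a binary counter: the carry merges blocks upwards
        carry = [x]
        i = 0
        while i < len(self.blocks) and self.blocks[i] is not None:
            carry = list(heapq.merge(carry, self.blocks[i]))  # linear-time merge
            self.blocks[i] = None
            i += 1
        if i == len(self.blocks):
            self.blocks.append(None)
        self.blocks[i] = carry

    def query(self, q):
        # run the query on every block and combine the answers with "or"
        answer = False
        for b in self.blocks:
            if b is not None:
                j = bisect.bisect_left(b, q)
                answer = answer or (j < len(b) and b[j] == q)
        return answer

Every element takes part in at most $\log n$ merges before it settles, which
is exactly where the $\O(\log n)$ amortized insertion bound comes from.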
\subsection{Worst-case semi-dynamization}
So far we have created a data structure that acts well in the long run, but
one insertion can take a long time. This may be unsuitable for applications
where we require low latency. In such cases, we would like each insertion to
be fast even in the worst case.

Our construction can be deamortized at the price that the resulting
semi-dynamic data structure will be more complicated. We do this by not
constructing a block at once, but decomposing the construction such that on
each operation we do a small amount of work on it until eventually the whole
block is constructed.
However, insertion is not the only operation; we can also ask queries even
@@ -387,7 +316,7 @@ $\log n$ blocks in construction.
\theorem{Let $S$ be a static data structure answering a decomposable problem
$f$. Then there exists a semi-dynamic structure with parameters
\tightlist{o}
\: $Q_D(n) \in \O(Q_S(n) \cdot \log n)$,
@@ -409,15 +338,15 @@ together, we get the required upper bound.
\subsection{Full dynamization}
For our definition of search problems, it is not easy to delete elements, as
any time we wished to delete an element, we would need to take apart and split
a structure into a few smaller ones. This would never amortize to a decent
deletion time.
Instead of that, we will want the underlying static structure to have the
ability to cross out elements. These elements will no longer participate in
queries, but they will count towards the structure size and complexity.

Once we have the ability to cross out elements, we can upgrade the
semi-dynamic data structure to support deletion. We add a binary search tree
or another set structure which maps each element to the block it lives in.
For each element we keep a pointer to its instance in the BST. When we build a
new block, we can