Commit 93cdcbe8, authored 6 years ago by Martin Mareš
Geometry: Range trees (partial)
parent
db3eeb02
07-geom/geom.tex
+82
-7
82 additions, 7 deletions
@@ -29,13 +29,13 @@ Queries are asked about all objects which lie within a~given \em{region}
We might want to \em{enumerate} all objects within the region,
or just \em{count} them without enumerating all of them.
If there is a~value associated with each
object, we can ask for a~sum or a~maximum of values of all objects
within the range --- this is generally called an \em{aggregation}
query.
In this chapter, we will consider the simple case of range queries
on points in~$\R^d$.

\section[range1d]{Range queries in one dimension}
@@ -75,20 +75,20 @@ We can see that the sets have the following properties:
This suggests a~straightforward recursive algorithm for answering range queries.

\algo{RangeQuery}$(v,Q)$
\algin a~root of a~subtree~$v$, query range~$Q$
\:If $v$~is external, return.
\:If $\intr(v) \subseteq Q$, report the whole subtree rooted at~$v$ and return.
\:If $\key(v) \in Q$, report the item at~$v$.
\:$Q_\ell \= Q \cap \intr(\ell(v))$, $Q_r \= Q \cap \intr(r(v))$
\:If $Q_\ell \ne \emptyset$: call $\alg{RangeQuery}(\ell(v), Q_\ell)$
\:If $Q_r \ne \emptyset$: call $\alg{RangeQuery}(r(v), Q_r)$
\endalgo

Let us analyze time complexity of this algorithm now.

\lemma{
If the tree is balanced, \alg{RangeQuery} called on its root visits $\O(\log n)$
nodes and subtrees.
}
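
The recursive algorithm above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the names `Node`, `build_balanced`, `report_subtree` and `range_query` are hypothetical, external nodes are modelled by `None`, the query range is the closed interval `[lo, hi]`, and $\intr(v)$ is passed down as the open interval `(a, b)` instead of being stored in the nodes.

```python
from dataclasses import dataclass
from math import inf
from typing import List, Optional


@dataclass
class Node:
    key: float
    left: Optional["Node"] = None    # None plays the role of an external node
    right: Optional["Node"] = None


def build_balanced(keys: List[float]) -> Optional[Node]:
    """Build a balanced search tree from a sorted list of distinct keys."""
    if not keys:
        return None
    m = len(keys) // 2
    return Node(keys[m], build_balanced(keys[:m]), build_balanced(keys[m + 1:]))


def report_subtree(v: Optional[Node], out: List[float]) -> None:
    """Append every key stored in the subtree rooted at v."""
    if v is None:
        return
    report_subtree(v.left, out)
    out.append(v.key)
    report_subtree(v.right, out)


def range_query(v, lo, hi, out, a=-inf, b=inf):
    """Report all keys in Q = [lo, hi]; (a, b) is the open interval int(v)."""
    if v is None:                    # v is external
        return
    if lo <= a and b <= hi:          # int(v) is a subset of Q
        report_subtree(v, out)
        return
    if lo <= v.key <= hi:            # key(v) lies in Q
        out.append(v.key)
    if lo < v.key:                   # Q meets int(l(v)) = (a, key(v))
        range_query(v.left, lo, hi, out, a, v.key)
    if hi > v.key:                   # Q meets int(r(v)) = (key(v), b)
        range_query(v.right, lo, hi, out, v.key, b)
```

Since the key at~$v$ is reported before the left subtree is visited, the keys arrive in no particular order; sort `out` afterwards if an ordered answer is needed.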
@@ -163,7 +163,7 @@ space.
We can answer 2-d range queries similarly to the 1-d case. To each node~$v$ of the
tree, we can assign a~2-d interval $\intr(v)$ recursively. This again generates a~hierarchy
of nested intervals, so the \alg{RangeQuery} algorithm works there, too.
However, 2-d range queries can be very slow in the worst case:

\lemma{Worst-case time complexity of range queries in a~2-d tree is $\Omega(\sqrt n)$.}
@@ -199,6 +199,81 @@ Dynamization is non-trivial and we will not show it.
\section{Multi-dimensional range trees}

The $k$-dimensional search trees were simple, but slow in the worst case.
There is a~more efficient data structure: the \em{multi-dimensional range tree,}
which has poly-logarithmic query complexity, if we are willing to use
super-linear space.

\subsection{2-dimensional range trees}

For simplicity, we start with a~static 2-dimensional version
and we will assume that no two points have the same $x$~coordinate.

The 2-d range tree will consist of multiple instances of a~1-d range tree,
which we built in section \secref{range1d} --- it can be a~binary search tree
with range queries, but in the static case even a~sorted array suffices.

First we create an~$x$-tree, which is a~1-d range tree over the $x$~coordinates
of all points stored in the structure. Each node contains a~single point.
Its subtree corresponds to an~open interval of $x$~coordinates, that is
a~\em{band} in~$\R^2$ (an~open rectangle which is vertically unbounded).
For every band, we construct a~$y$-tree containing all points in the band
ordered by the $y$~coordinate.

If the $x$-tree is balanced, every node lies in $\O(\log n)$ subtrees.
So every point lies in $\O(\log n)$ bands and the whole structure takes $\O(n\log n)$ space:
$\O(n)$ for the $x$-tree, $\O(n\log n)$ for all $y$-trees.
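
As a~small worked example of this count (an illustration, not part of the original argument), consider a~perfectly balanced $x$-tree on $n=7$ points. The $y$-tree of the root holds all 7~points, the $y$-trees of its two children hold 3~points each, and the $y$-trees of the four leaves hold 1~point each, so all $y$-trees together store
$$ 1\cdot 7 + 2\cdot 3 + 4\cdot 1 = 17 $$
points. Every point is counted once for each of its at most 3~ancestors (including itself), which is the $\O(\log n)$ factor in the bound.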

We can build the 2-d tree recursively. First we create two lists of points:
one sorted by the $x$~coordinate, one by~$y$. Then we construct the $x$-tree.
We find the point with median $x$~coordinate in constant time. This point
becomes the root of the $x$-tree. We recursively construct the left subtree
from the points whose $x$~coordinate is less than the median --- we can construct the
corresponding sub-lists of both the $x$ and~$y$ list in $\O(n)$ time. We
construct the right subtree similarly. Finally, we build the $y$-tree
for the root: it contains all the points and we can build it from the $y$-sorted
array in $\O(n)$ time.

The whole building algorithm requires linear time per sub-problem, which
sums to $\O(n)$ over one level of the $x$-tree. Since the $x$-tree has
logarithmic height, it makes $\O(n\log n)$ for the whole construction.
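
The same bound can be read off a~recurrence. Writing $T(n)$ for the time needed to build the structure for $n$~points from the two pre-sorted lists (the symbol~$T$ is introduced here only for illustration), splitting at the median gives two sub-problems of at most $n/2$ points each plus linear work for the sub-lists and the $y$-tree of the root:
$$ T(n) \le 2\,T(n/2) + \O(n), \qquad T(1) = \O(1), $$
which solves to $T(n) = \O(n\log n)$. Sorting the points by $x$ and by~$y$ at the beginning also takes $\O(n\log n)$ time, so it does not change the total.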

Now we describe how to answer a~range query for $[x_1,x_2] \times [y_1,y_2]$.
First we let the $x$-tree answer a~range query for $[x_1,x_2]$. This gives us
a~union of $\O(\log n)$ points and bands which disjointly cover $[x_1,x_2]$.
For each point, we test if its $y$~coordinate lies in $[y_1,y_2]$. For each
band, we ask the corresponding $y$-tree for points in the range $[y_1,y_2]$.

We spend $\O(\log n)$ time in the $x$-tree, $\O(\log n)$ time to process the
individual points, $\O(\log n)$ in each $y$-tree, and $\O(1)$ per point
reported. Put together, this is $\O(\log^2 n + p)$ if $p$~points are
reported ($p=0$ for a~counting query).
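
The construction and the query described in this subsection can be sketched in Python as follows. It is a~minimal sketch under several assumptions: points are `(x, y)` tuples with pairwise distinct $x$~coordinates, every $y$-tree is a~sorted array searched with the standard `bisect` module (the static case mentioned above), and the names `XNode`, `build`, `build_range_tree` and `query` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from math import inf


class XNode:
    """Node of the x-tree together with the y-tree of its band."""

    def __init__(self, point, left, right, by_y):
        self.point = point                  # the single point stored in this node
        self.left = left
        self.right = right
        self.by_y = by_y                    # all points of the subtree, sorted by y
        self.ykeys = [p[1] for p in by_y]   # their y coordinates, for binary search


def build(by_x, by_y):
    """Build the structure from the same points sorted by x and sorted by y."""
    if not by_x:
        return None
    m = len(by_x) // 2
    median = by_x[m]                        # point with median x, found in O(1)
    # Split the y-sorted list in linear time; the y order is preserved.
    left_y = [p for p in by_y if p[0] < median[0]]
    right_y = [p for p in by_y if p[0] > median[0]]
    return XNode(median,
                 build(by_x[:m], left_y),
                 build(by_x[m + 1:], right_y),
                 by_y)


def build_range_tree(points):
    """Sort once by x and once by y, then build recursively."""
    return build(sorted(points), sorted(points, key=lambda p: p[1]))


def query(v, x1, x2, y1, y2, out, a=-inf, b=inf):
    """Report points in [x1, x2] x [y1, y2]; (a, b) is the band of v."""
    if v is None:
        return
    if x1 <= a and b <= x2:
        # The whole band lies inside [x1, x2]: ask its y-tree for [y1, y2].
        i = bisect_left(v.ykeys, y1)
        j = bisect_right(v.ykeys, y2)
        out.extend(v.by_y[i:j])
        return
    x, y = v.point
    if x1 <= x <= x2 and y1 <= y <= y2:     # individual point: test its y directly
        out.append(v.point)
    if x1 < x:
        query(v.left, x1, x2, y1, y2, out, a, x)
    if x2 > x:
        query(v.right, x1, x2, y1, y2, out, x, b)
```

For a~counting query, the slice `v.by_y[i:j]` can be replaced by the difference `j - i`, which avoids the $\O(p)$ term.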

\subsection{Handling repeated coordinates}

We left aside the case of multiple points with the same $x$~coordinate.
This can be handled by attaching another $y$-tree to each $x$-tree node,
which contains the points sharing the same $x$~coordinate. That is,
$x$-tree nodes correspond to distinct $x$~coordinates and each
has two $y$-trees: one for its own $x$~coordinate, one for the open
interval of $x$~coordinates covering its subtree. This way, we can
perform range queries for both open and closed ranges.

Time complexity of \alg{Build} stays asymptotically the same: the maximum
number of $y$-trees containing a~given point at most doubles, so it is
still $\O(\log n)$. Similarly for range queries: we query at most twice
as many $y$-trees.

\subsection{Multi-dimensional generalization}

TODO

\subsection{Dynamization}

TODO

\subsection{Fractional cascading}

TODO

\endchapter