Commit db3eeb02 authored by Martin Mareš

Geometry: k-d trees (a little bit sketchy)

parent 5f8dc980
......@@ -138,7 +138,64 @@ in $\O(\log n)$ time.
\section{Multi-dimensional search trees (k-d trees)}
TODO
Binary search trees can be extended to $k$-dimensional trees, or simply $k$-d trees.
Keys stored in nodes are points in~$\R^k$; nodes at the $\ell$-th level compare
the $(\ell\bmod k)$-th coordinate. We will temporarily assume that no two points
share the same value of any coordinate, so every point different from the node's
key belongs either to the left subtree or to the right one.
Let us show a~static 2-dimensional example. The root compares the~$x$ coordinate, so
it splits the plane by a~vertical line into two half-planes. The children of the root
split their half-planes by horizontal lines into quarter-planes, and so on.
We can build a~perfectly balanced $k$-d tree recursively. We place the point
with median $x$~coordinate in the root. We split the remaining points into the two half-planes
and recurse on each half-plane, which constructs the left and right subtree.
During the recursion, we alternate coordinates. As the number of points in the
current subproblem decreases by a~factor of two in every recursive call, we obtain
a~tree of height $\lceil\log n\rceil$.
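
The recursive construction can be sketched in Python as follows. This is only an illustrative sketch: the names `Node`, `build` and `levels` are hypothetical, and sorting stands in for linear-time median selection, so this version runs in $\O(n\log^2 n)$ time rather than the bound obtained with a linear-time median. It assumes that no two points share a coordinate.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # key stored in this node
    axis: int                       # coordinate compared at this level
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced k-d tree; coordinates alternate with the depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    # Assumption: sorting replaces linear-time median selection for brevity.
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(
        point=pts[mid],
        axis=axis,
        left=build(pts[:mid], depth + 1),
        right=build(pts[mid + 1:], depth + 1),
    )


def levels(node: Optional[Node]) -> int:
    """Number of levels of the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(levels(node.left), levels(node.right))
```

For seven points in the plane, `build` places the point with median $x$~coordinate in the root and produces a tree with three levels.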
The time complexity of the construction can be analyzed using the recursion tree: since we can find a~median
of~$m$ items in time $\O(m)$, we spend $\O(n)$ time per tree level. We have $\O(\log n)$
levels, so the whole construction runs in $\O(n\log n)$ time. Since every node
of the 2-d tree contains a~different point, the whole tree consumes $\O(n)$
space.
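
The same bound can be written as a~recurrence. The following is a~sketch that ignores rounding; the constant~$c$ is an assumed upper bound on the cost per point of finding the median and splitting.

```latex
\[
  T(n) = 2\,T(n/2) + \O(n), \qquad T(1) = \O(1),
\]
\[
  T(n) \le \sum_{i=0}^{\lceil\log n\rceil} 2^i \cdot c\,\frac{n}{2^i}
       = c\,n\,(\lceil\log n\rceil + 1)
       = \O(n\log n).
\]
```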
We can answer 2-d range queries similarly to the 1-d case. To each node~$v$ of the
tree, we can assign a~2-d interval $\intr(v)$ recursively. This again generates a~hierarchy
of nested intervals, so the \alg{IntQuery} algorithm works there, too.
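
A possible shape of such a~query is sketched below in Python. The details of \alg{IntQuery} are not repeated in this section, so the sketch assumes its usual form: skip a~node whose interval is disjoint from the query, report the whole subtree when the interval lies inside the query, and otherwise test the key and recurse. All names (`Node`, `build`, `range_query`) are hypothetical, boxes are closed and written as `(x_lo, x_hi, y_lo, y_hi)`, and `build` sorts instead of selecting the median in linear time.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]   # (x_lo, x_hi, y_lo, y_hi), closed
WHOLE_PLANE: Box = (-math.inf, math.inf, -math.inf, math.inf)


class Node:
    def __init__(self, point: Point, axis: int, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Balanced 2-d tree; sorting stands in for median selection."""
    if not points:
        return None
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))


def collect(node: Optional[Node], out: List[Point]) -> None:
    """Report every point stored in the subtree."""
    if node is not None:
        out.append(node.point)
        collect(node.left, out)
        collect(node.right, out)


def range_query(node: Optional[Node], query: Box,
                box: Box = WHOLE_PLANE,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Report all points of the subtree that lie in the closed box `query`.

    `box` plays the role of int(v): a box containing all points below `node`.
    """
    if out is None:
        out = []
    if node is None:
        return out
    qx0, qx1, qy0, qy1 = query
    bx0, bx1, by0, by1 = box
    if bx1 < qx0 or qx1 < bx0 or by1 < qy0 or qy1 < by0:
        return out                          # int(v) misses the query
    if qx0 <= bx0 and bx1 <= qx1 and qy0 <= by0 and by1 <= qy1:
        collect(node, out)                  # int(v) lies inside the query
        return out
    x, y = node.point
    if qx0 <= x <= qx1 and qy0 <= y <= qy1:
        out.append(node.point)
    # Split int(v) by the compared coordinate of the key.
    lo, hi = list(box), list(box)
    c = node.point[node.axis]
    lo[2 * node.axis + 1] = c               # left child: coordinate <= c
    hi[2 * node.axis] = c                   # right child: coordinate >= c
    range_query(node.left, query, tuple(lo), out)
    range_query(node.right, query, tuple(hi), out)
    return out
```

For example, `range_query(build(points), (3, 8, 1, 5))` reports the points with $3\le x\le 8$ and $1\le y\le 5$, in no particular order.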
However, 2-d range queries can be very slow in the worst case:
\lemma{Worst-case time complexity of range queries in a~2-d tree is $\Omega(\sqrt n)$.}
\proof
Consider a~tree built for the set of points $\{(i,i) \mid 1\le i\le n\}$ for
$n=2^t-1$. It is a~complete binary tree with $t$~levels. Let us observe what
happens when we query the range $\{0\} \times \R$ (unbounded intervals are not
necessary for our construction; a~very narrow and very high box would work,
too).
On levels where the $x$~coordinate is compared, we always go to the left.
On levels comparing~$y$, both subtrees lie in the query range, so we recurse
on both of them. This means that the number of visited nodes doubles at every
other level, so at level~$t$ we visit $2^{t/2} \approx \sqrt{n}$ nodes.
\qed
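
The counting argument can be checked by a~small experiment. The Python sketch below builds the tree for the diagonal points directly from index ranges and counts the nodes visited by a~plain comparison-based descent for the query $\{0\}\times\R$. The names `build_diagonal`, `count_visited` and `visited_on_diagonal` are hypothetical, and the descent is a~simplified stand-in for the range query, not the \alg{IntQuery} algorithm itself.

```python
import math


def build_diagonal(lo, hi, depth=0):
    """2-d tree for the points (i, i), lo <= i <= hi, as nested tuples
    (key, axis, left, right). The median is the same in both coordinates."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return ((mid, mid), depth % 2,
            build_diagonal(lo, mid - 1, depth + 1),
            build_diagonal(mid + 1, hi, depth + 1))


def count_visited(node, q_lo, q_hi):
    """Count nodes visited when searching the box [q_lo, q_hi]
    (given as lower and upper corner) by comparing with the keys."""
    if node is None:
        return 0
    key, axis, left, right = node
    visited = 1
    if q_lo[axis] < key[axis]:      # the query reaches below the key
        visited += count_visited(left, q_lo, q_hi)
    if q_hi[axis] > key[axis]:      # the query reaches above the key
        visited += count_visited(right, q_lo, q_hi)
    return visited


def visited_on_diagonal(t):
    """Visited nodes for n = 2^t - 1 diagonal points and the query {0} x R."""
    n = 2 ** t - 1
    root = build_diagonal(1, n)
    return count_visited(root, (0, -math.inf), (0, math.inf))
```

For even~$t$, the levels contribute $1, 1, 2, 2, 4, 4, \ldots$ visited nodes, so the total is $2\cdot(2^{t/2}-1)$; for instance, `visited_on_diagonal(10)` returns 62 for $n=1023$, while $\sqrt n\approx 32$.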
\note{
This is the worst that can happen, so query complexity is $\Theta(\sqrt n)$.
General $k$-d trees for arbitrary~$k$ can be built in time $\O(n\log n)$,
they require space $\O(n)$, and they answer range queries in time $\O(n^{1-1/k})$\foot{
Constants hidden in the~$\O$ depend on the dimension.}.
This is quite bad in high dimensions, but there is a~matching lower bound
for structures which fit in linear space.
Repeated values of coordinates can be handled by allowing a~third child
of every node, which is the root of a~$(k-1)$-d subtree. There we store all
points which share the compared coordinate with the parent's key. This extension
does not influence the asymptotic complexity of operations.
Dynamization is non-trivial and we will not show it.
}
\section{Multi-dimensional range trees}
......
......@@ -20,7 +20,7 @@ Caching:
Geometric:
* k-d trees
- Description of k-d trees is quite sketchy and it deserves a picture
* range trees
String:
......