diff --git a/fs-succinct/succinct.tex b/fs-succinct/succinct.tex
index 5a07c66f128c8a0019436032fc2d3546a8be71e4..151e8c862f8dee7438d7496c73aaca2e1d9a19a1 100644
--- a/fs-succinct/succinct.tex
+++ b/fs-succinct/succinct.tex
@@ -11,31 +11,33 @@ sctructures have linear space complexity, which is asymptotically optimal.
 However, in this chapter, we shall use a much more fine-grained
 notion of space efficiency and measure space requirements in bits.
 
-Imagine we have a universe $X$ and we want a data structure to hold a single
-element $x \in X$. For example, $X$ can be the universe of all length-$n$ sequences
-of integers from the range $[m]$ and we want a data structure to hold such
-a sequence (we shall limit ourselves to finite universes only).
+Imagine we have a data structure whose size is parametrized by some parameter
+$n$ (e.g. number of elements). Let us define $X(n)$ as the universe of all possible
+values that a size-$n$ data structure (as a whole) can hold. For example, if we
+have a data structure for storing strings from a fixed alphabet, $X(n)$ may be the
+universe of all length-$n$ strings from this alphabet.
 
-The information-theoretically optimal size of such a data structure is
-$OPT := \lceil\log |X|\rceil$ bits (which is essentilly the entropy of
-a uniform probability distribution over $X$).
+Let us denote by $s(n)$ the number of bits needed to store a size-$n$ data structure.
+The information-theoretical optimum is $OPT(n) := \lceil\log |X(n)|\rceil$
+(which is essentially the entropy of a uniform distribution over $X(n)$).
 
 Now we can define three classes of data structures based on their fine-grained space
 efficiency:
 
-\defn{An {\I implicit data structure} is one that uses at most $OPT + \O(1)$ bits of space.}
+\defn{An {\I implicit data structure} is one that uses at most $OPT(n) + \O(1)$ bits of space.}
 
 A typical implicit data structure contains just its elements in some order and nothing more.
 Examples include sorted arrays and heaps.
 
-\defn{A {\I succinct data structure} is one that uses at most $OPT + {\rm o}(OPT)$ bits of space.}
-\defn{A {\I compact data structure} is one that uses at most $\O(OPT)$ bits of space.}
+\defn{A {\I succinct data structure} is one that uses at most $OPT(n) + {\rm o}(OPT(n))$ bits of space.}
+\defn{A {\I compact data structure} is one that uses at most $\O(OPT(n))$ bits of space.}
 
-Note that some linear-space data structures are not even compact -- because we are counting
-bits now, not words. For example, a binary search tree representing a length-$n$ sequence
-of numbers from range $[m]$ needs $\O(n (\log n + \log m))$ bits, whereas $OPT$ is
-$n \log m$. For $n >> m$, this does not satisfy the requirements for a compact data
-structure.
+Note that some linear-space data structures are not even compact -- because we
+are counting bits now, not words. For example, a linked list representing a
+length-$n$ sequence of numbers from range $[m]$ needs $\O(n (\log n + \log m))$
+bits ($\log n$ bits are used to represent a next-pointer), whereas $OPT(n)$ is $n
+\log m$. For $n \gg m$, this does not satisfy the requirements for a compact
+data structure.
 
 And of course, as with any data structure, we want to be able to perform reasonably
 fast operations on these space-efficient data structures.