diff --git a/fs-succinct/composition.asy b/fs-succinct/composition.asy
index a382b3043279997f9cb221fd31420c591b60db5d..ef955a47192a498e098b4ab9b2678ea292c851fd 100644
--- a/fs-succinct/composition.asy
+++ b/fs-succinct/composition.asy
@@ -5,9 +5,9 @@
 //draw(roundrectangle("g", (1,-1)));
 //draw(roundrectangle("h", (-1,-2)));
-object f1 = draw("$f_1$", roundbox, (0,0), xmargin=0.5, ymargin=0.5);
-object f2 = draw("$f_2$", roundbox, (1cm,-1cm), xmargin=0.5, ymargin=0.5);
-object f3 = draw("$f_3$", roundbox, (-1cm,-1.5cm), xmargin=0.5, ymargin=0.5);
+object f1 = draw("$g_1$", roundbox, (0,0), xmargin=0.5, ymargin=0.5);
+object f2 = draw("$g_2$", roundbox, (1cm,-1cm), xmargin=0.5, ymargin=0.5);
+object f3 = draw("$g_3$", roundbox, (-1cm,-1.5cm), xmargin=0.5, ymargin=0.5);
 // XXX this does not work when setting unitsize
 draw(point(f1, SE) -- point(f2, NW), Arrow);
@@ -20,5 +20,5 @@ draw(point(f2, S) -- (xpart(point(f2, S)), -2.5cm), Arrow);
 draw(point(f3, S) -- (xpart(point(f3, S)), -2.5cm), Arrow);
 draw((xpart(point(f1, N)), 1cm) -- point(f1, N), Arrow);
-label("$f$", (xpart(min(currentpicture)), ypart(max(currentpicture))) + (0.25cm, -0.25cm));
+label("$g$", (xpart(min(currentpicture)), ypart(max(currentpicture))) + (0.25cm, -0.25cm));
diff --git a/fs-succinct/succinct.tex b/fs-succinct/succinct.tex
index b6b1679e5e9c73869bdf7fb505000f1f8e45fee5..406bcacd234ffbdbf5bd046e7950a7d4cd8eaf06 100644
--- a/fs-succinct/succinct.tex
+++ b/fs-succinct/succinct.tex
@@ -265,7 +265,11 @@ number of words):
 }
 All these properties should be evident from the construction.
-\defn{The redundancy of a mixer is $$r(f) := \underbrace{M + \log S}_{\hbox{output entropy}} - \quad \underbrace{(\log X + \log Y)}_{\hbox{input entropy}}.$$}
+\defn{The redundancy of a mixer is $$r(f) := \underbrace{M + \log S}_{\hbox{output entropy}} - \quad \underbrace{(\log X + \log Y)}_{\hbox{input entropy}}.$$
+In general, the redundancy of a mapping (with possibly multiple inputs and multiple outputs) is the sum of the logs of the
+output alphabet sizes, minus the sum of the logs of the input alphabet sizes. Note that there is no rounding (because the inputs and
+outputs can be from arbitrary alphabets, not necessarily binary) and the redundancy can be non-integer. Compare this to the concept
+of redundancy for space-efficient data structures defined above.}
 \subsection{On the existence of certain kinds of mixers}
@@ -287,20 +291,60 @@ $C := \lfloor 2^M / Y \rfloor$.
 Now let us calculate the redundancy. First we shall note that we can compute redundancy for $f_1$ and $f_2$ separately and add them up:
-$$\eqalign{r(f) &= M + \lceil\log S\rceil - \lceil\log X\rceil - \lceil\log Y\rceil \cr
-&= \left(M - \lceil\log C\rceil - \lceil\log Y\rceil\right) + \left(\lceil\log C\rceil + \lceil\log S\rceil - \lceil\log X\rceil\right)\cr
+$$\eqalign{r(f) &= M + \log S - \log X - \log Y \cr
+&= \left(M - \log C - \log Y\right) + \left(\log C + \log S - \log X\right)\cr
 &= r(f_2) + r(f_1)}$$
 }
 This is just a telescopic sum. It works similarly for more complex mapping compositions: as long as each intermediate result is used only once as an input to another mapping, you can just sum the redundancies of all the mappings involved.
-For example, if you have a mapping composition as in fig. \figref{composition},
-you can easily see $r(f) = r(f_1) + r(f_2) + r(f_3)$.
 \figure[composition]{composition.pdf}{}{Mapping composition}
+For example, if you have a mapping composition as in fig. \figref{composition},
+you can easily see $r(g) = r(g_1) + r(g_2) + r(g_3)$. For every edge fully inside
+the composition, the same number is added once and subtracted once.
 First, we shall estimate $r(f_2)$:
 $$\eqalign{r(f_2) &= M - \log(Y\cdot C)= M - \log(\overbrace{Y\cdot \lfloor 2^M / Y \rfloor}^{\ge 2^M - Y})\cr
 r(f_2) &\le M - \log(2^M-Y)= \log{2^M\over 2^M-Y} = \log{1 \over 1-{Y \over 2^M}}}$$
+Now we shall use a well-known inequality from analysis (stated for the natural logarithm;
+here and below, switching to base 2 only changes the constant hidden in $\O$):
+$$\eqalign{
+e^x &\ge 1+x\cr
+x &\ge \log(1+x)}$$
+By substituting $x \rightarrow x/(1-x)$ for $0 \le x < 1$ we get:
+$${x \over 1-x} \ge \log{1 \over 1-x}$$
+Thus, because $2^M - Y \ge (C-1)\cdot Y$, we get for $C \ge 2$:
+$$r(f_2) \le {Y\over 2^M - Y} \le {1 \over C-1} = \O\left({1 \over C}\right)$$
+
+Now to $r(f_1)$:
+$$\eqalign{
+r(f_1) &= \log C + \log S - \log X = \log C + \log \left\lceil {X\over C}\right\rceil - \log X
+= \log\left({C\left\lceil{X \over C}\right\rceil \over X}\right)\cr
+r(f_1) &\le \log\left({X+C \over X}\right) = \log\left(1 + {C\over X}\right) \le {C \over X}\qquad\hbox{(because $\log(x) \le x-1$)}
+}$$
+
+Putting this together:
+$$r(f) = r(f_1) + r(f_2) \le \O\left({1 \over C} + {C \over X}\right)$$
+
+In order to minimize this sum, we should set $C = \Theta\left(\sqrt{X}\right)$. Then
+$r(f) = \O\left({1/\sqrt{X}}\right)$ and $S = \left\lceil{X \over \Theta(\sqrt{X})}\right\rceil = \Theta\left(\sqrt{X}\right)$
+as promised. Note that this holds for any value of $Y$.
+
+However, we cannot freely set $C$, as we have already decided that $C := \lfloor 2^M / Y \rfloor$.
+Instead, we need to set a value for $M$ that gives us the right $C$.
+
+Now we are almost done. The whole mixer parameter selection process could be as follows
+(it may be useful to refer back to fig. \figref{mixer}):
+\tightlist{n.}
+\: We are given $X$, $Y$ as parameters.
+\: Set $M := \left\lceil\log\left(Y\sqrt{X}\right)\right\rceil$.
+\: Set $C := \left\lfloor 2^M / Y \right\rfloor$. This ensures that $2^M \ge C\cdot Y$ and gives us $C = \Theta\left(\sqrt{X}\right)$.
+\: Set $S := \left\lceil X / C \right\rceil$. This ensures that $C\cdot S \ge X$ and gives us $S = \Theta\left(\sqrt{X}\right)$.
+\endlist
+All the inequalities required for mixer existence are satisfied and based on the analysis
+above the parameters satisfy what our lemma promised.
+\qed
+
 \endchapter
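The parameter selection added by this patch can be checked numerically. The following is a minimal sketch, not part of the patch: the function names `mixer_params` and `redundancy` are hypothetical, logarithms are assumed to be base 2, and $M$ is computed with integer arithmetic (the smallest $M$ with $2^{2M} \ge Y^2 X$) to avoid floating-point rounding in the ceiling.

```python
import math


def mixer_params(X, Y):
    """Return (M, C, S) chosen as in the parameter-selection list (hypothetical helper)."""
    # M := ceil(log2(Y * sqrt(X))), i.e. the smallest M with 2^(2M) >= Y^2 * X.
    M = 0
    while (1 << (2 * M)) < Y * Y * X:
        M += 1
    C = (1 << M) // Y   # C := floor(2^M / Y)
    S = -(-X // C)      # S := ceil(X / C), using integer ceiling division
    return M, C, S


def redundancy(X, Y, M, S):
    """r(f) = M + log S - log X - log Y, assuming base-2 logarithms."""
    return M + math.log2(S) - math.log2(X) - math.log2(Y)


X, Y = 10_000, 10
M, C, S = mixer_params(X, Y)
# The two inequalities required for mixer existence.
assert (1 << M) >= C * Y
assert C * S >= X
r = redundancy(X, Y, M, S)
```

For these sample values both $C$ and $S$ come out close to $\sqrt{X} = 100$, and `r` is a small positive fraction of a bit, in line with the $\O(1/\sqrt{X})$ bound.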