Succinct: Mixer parameter selection, lemma proof completed

Commit 680f8888 authored 3 years ago by Filip Stedronsky
--- a/fs-succinct/composition.asy
+++ b/fs-succinct/composition.asy
@@ -5,9 +5,9 @@
 //draw(roundrectangle("g", (1,-1)));
 //draw(roundrectangle("h", (-1,-2)));

-object f1 = draw("$f_1$", roundbox, (0,0),     xmargin=0.5, ymargin=0.5);
-object f2 = draw("$f_2$", roundbox, (1cm,-1cm),    xmargin=0.5, ymargin=0.5);
-object f3 = draw("$f_3$", roundbox, (-1cm,-1.5cm), xmargin=0.5, ymargin=0.5);
+object f1 = draw("$g_1$", roundbox, (0,0),     xmargin=0.5, ymargin=0.5);
+object f2 = draw("$g_2$", roundbox, (1cm,-1cm),    xmargin=0.5, ymargin=0.5);
+object f3 = draw("$g_3$", roundbox, (-1cm,-1.5cm), xmargin=0.5, ymargin=0.5);

 // XXX this does not work when setting unitsize
 draw(point(f1, SE) -- point(f2, NW), Arrow);
@@ -20,5 +20,5 @@ draw(point(f2, S) -- (xpart(point(f2, S)), -2.5cm), Arrow);
 draw(point(f3, S) -- (xpart(point(f3, S)), -2.5cm), Arrow);
 draw((xpart(point(f1, N)), 1cm) -- point(f1, N), Arrow);

-label("$f$", (xpart(min(currentpicture)), ypart(max(currentpicture))) + (0.25cm, -0.25cm));
+label("$g$", (xpart(min(currentpicture)), ypart(max(currentpicture))) + (0.25cm, -0.25cm));

--- a/fs-succinct/succinct.tex
+++ b/fs-succinct/succinct.tex
@@ -265,7 +265,11 @@ number of words):
 }
 All these properties should be evident from the construction.

-\defn{The redundancy of a mixer is $$r(f) := \underbrace{M + \log S}_{\hbox{output entropy}}  - \quad \underbrace{(\log X + \log Y)}_{\hbox{input entropy}}.$$}
+\defn{The redundancy of a mixer is $$r(f) := \underbrace{M + \log S}_{\hbox{output entropy}}  - \quad \underbrace{(\log X + \log Y)}_{\hbox{input entropy}}.$$
+In general, the redundancy of a mapping (with possibly multiple inputs and multiple outputs) is the sum of the logs of the
+output alphabet sizes, minus the sum of the logs of the input alphabet sizes. Note that there is no rounding (because the inputs and
+outputs can be from arbitrary alphabets, not necessarily binary) and the redundancy can be non-integer. Compare this to the concept
+of redundancy for space-efficient data structures defined above.}

 \subsection{On the existence of certain kinds of mixers}

@@ -287,20 +291,60 @@ $C := \lfloor 2^M / Y \rfloor$.

 Now let us calculate the redundancy. First we shall note that we can compute redundancy
 for $f_1$ and $f_2$ separately and add them up:
-$$\eqalign{r(f) &= M + \lceil\log S\rceil - \lceil\log X\rceil - \lceil\log Y\rceil \cr
-&= \left(M - \lceil\log C\rceil - \lceil\log Y\rceil\right) + \left(\lceil\log C\rceil + \lceil\log S\rceil - \lceil\log X\rceil\right)\cr
+$$\eqalign{r(f) &= M + \log S - \log X - \log Y \cr
+&= \left(M - \log C - \log Y\right) + \left(\log C + \log S - \log X\right)\cr
 &= r(f_2) + r(f_1)}$$
 }
 This is just a telescopic sum. It works similarly for more complex mapping compositions:
 as long as each intermediate result is used only once as an input to another mapping, you
 can just sum the redundancies of all the mappings involved.

-For example, if you have a mapping composition as in fig. \figref{composition},
-you can easily see $r(f) = r(f_1) + r(f_2) + r(f_3)$.
 \figure[composition]{composition.pdf}{}{Mapping composition}
+For example, if you have a mapping composition as in fig. \figref{composition},
+you can easily see $r(g) = r(g_1) + r(g_2) + r(g_3)$. For every edge fully inside
+the composition, the same number is added once and subtracted once.

 First, we shall estimate $r(f_2)$:
 $$\eqalign{r(f_2) &= M - \log(Y\cdot C)= M - \log(\overbrace{Y\cdot \lfloor 2^M / Y \rfloor}^{\ge 2^M - Y})\cr 
 r(f_2) &\le M - \log(2^M-Y)= \log{2^M\over 2^M-Y} = \log{1 \over 1-{Y \over 2^M}}}$$
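+Now we shall use a well-known inequality from analysis, valid for all $x > -1$:
+$$\eqalign{
+e^x &\ge 1+x\cr
+x &\ge \log(1+x)}$$
+(For the binary logarithm the bound becomes $x/\ln 2$; the constant factor does not affect the asymptotics.)
+By substituting $x \rightarrow t/(1-t)$ for $0 \le t < 1$ we get:
+$$\log{1 \over 1-t} = \log\left(1 + {t \over 1-t}\right) \le {t \over 1-t}$$
+Thus, setting $t = Y/2^M$ and using $2^M/Y \ge C$, for $C \ge 2$ we have
+$$r(f_2) \le {Y \over 2^M-Y} = {1 \over 2^M/Y - 1} \le {1 \over C-1} = \O\left({1 \over C}\right)$$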
+
+Now to $r(f_1)$:
+$$\eqalign{
+r(f_1) &= \log C + \log S - \log X = \log C + \log \left\lceil {X\over C}\right\rceil - \log X
+= \log\left({C\left\lceil{X \over C}\right\rceil \over X}\right)\cr
+r(f_1) &\le \log\left({X+C \over X}\right) = \log\left(1 + {C\over X}\right) \le {C \over X}\qquad\hbox{(because $\log(x) \le x-1$)}
+}$$
+
+Putting this together:
+$$r(f) = r(f_1) + r(f_2) \le \O\left({1 \over C} + {C \over X}\right)$$
+
+In order to minimize this sum, we should set $C = \Theta\left(\sqrt{X}\right)$. Then
+$r(f) = \O\left({1/\sqrt{X}}\right)$ and $S = \left\lceil{X \over \Theta(\sqrt{X})}\right\rceil = \Theta\left(\sqrt{X}\right)$
+as promised. Note that this holds for any value of $Y$.
+
+However, we cannot freely set $C$, as we have already decided that $C := \lfloor 2^M / Y \rfloor$.
+Instead, we need to set a value for $M$ that gives us the right $C$.
+
+Now we are almost done. The whole mixer parameter selection process could be as follows
+(it may be useful to refer back to fig. \figref{mixer}):
+\tightlist{n.}
+\: We are given $X$, $Y$ as parameters.
+\: Set $M := \left\lceil\log\left(Y\sqrt{X}\right)\right\rceil$.
+\: Set $C := \left\lfloor 2^M / Y \right\rfloor$. This ensures that $2^M \ge C\cdot Y$ and, because $\sqrt{X} \le 2^M/Y < 2\sqrt{X}$, gives us $C = \Theta\left(\sqrt{X}\right)$.
+\: Set $S := \left\lceil X / C \right\rceil$. This ensures that $C\cdot S \ge X$ and gives us $S =  \Theta\left(\sqrt{X}\right)$.
+\endlist
+All the inequalities required for mixer existence are satisfied and based on the analysis
+above the parameters satisfy what our lemma promised.
+\qed
+

 \endchapter
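
The parameter selection added to `succinct.tex` in this commit can be checked numerically. The following is a minimal Python sketch, not part of the repository. The function names `select_mixer_params` and `redundancy` are hypothetical, and $\log$ is assumed to be the binary logarithm, as the use of $2^M$ suggests. It follows the four listed steps and evaluates $r(f) = M + \log S - \log X - \log Y$ for one choice of $X$ and $Y$.

```python
import math


def select_mixer_params(X: int, Y: int) -> tuple[int, int, int]:
    """Return (M, C, S) following the four-step selection in the diff.

    Hypothetical helper for illustration; assumes X >= 1 and Y >= 1.
    """
    if X < 1 or Y < 1:
        raise ValueError("X and Y must be positive integers")
    # M := ceil(log2(Y * sqrt(X))), computed with integers only:
    # the smallest M with 2^M >= Y*sqrt(X), i.e. 4^M >= Y^2 * X.
    target = Y * Y * X
    M = 0
    while (1 << (2 * M)) < target:
        M += 1
    C = (1 << M) // Y   # C := floor(2^M / Y), so 2^M >= C * Y
    S = -(-X // C)      # S := ceil(X / C),    so C * S >= X
    return M, C, S


def redundancy(X: int, Y: int, M: int, S: int) -> float:
    """r(f) = M + log2(S) - log2(X) - log2(Y), with no rounding."""
    return M + math.log2(S) - math.log2(X) - math.log2(Y)


if __name__ == "__main__":
    X, Y = 1000, 10
    M, C, S = select_mixer_params(X, Y)
    print(M, C, S, redundancy(X, Y, M, S))
```

For $X = 1000$, $Y = 10$ this yields $M = 9$, $C = 51$, $S = 20$, so both $C$ and $S$ are within a small constant factor of $\sqrt{X} \approx 31.6$, and the redundancy is $\log_2 1.024$, a few hundredths of a bit.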