diff --git a/streaming/streaming.tex b/streaming/streaming.tex
index 395dab6543c4fd28c545ffdded870cc1dbc1f02e..5cc4b717d4289035397bc3ab8761ce492c698610 100644
--- a/streaming/streaming.tex
+++ b/streaming/streaming.tex
@@ -42,18 +42,16 @@ of each element in a stream of integers. We shall see that it also
 provides us with a small set $C$ containing $F_k$, and hence lets us solve the frequent
 elements problem efficiently.
 
-TODO: Typeset the algorithm better.
-
-\proc{FrequencyEstimate}$(\alpha, k)$
+\algo{FrequencyEstimate} \algalias{Misra/Gries Algorithm}
 \algin the data stream $\alpha$, the target for the estimator $k$
-\:\em{Init}: $A \= \emptyset$. (an empty map)
+\:\em{Init}: $A \= \emptyset$. \cmt{an empty map}
 \:\em{Process}($x$):
-\: If $x \in$ keys($A$), $A[x] \= A[x] + 1$.
-\: Else If $\vert$keys($A$)$\vert < k - 1$, $A[x] \= 1$.
-\: Else
+\::If $x \in$ keys($A$), $A[x] \= A[x] + 1$.
+\::Else If $\vert$keys($A$)$\vert < k - 1$, $A[x] \= 1$.
+\::Else
 \forall $a \in $~keys($A$): $A[a] \= A[a] - 1$, delete $a$ from $A$ if $A[a] = 0$.
-\:\em{Output}: $\hat{f}_a = A[a]$ If $a \in $~keys($A$), and $\hat{f}_a = 0$ otherwise.
+\algout $\hat{f}_a = A[a]$ If $a \in $~keys($A$), and $\hat{f}_a = 0$ otherwise.
 \endalgo
 
 Let us show that $\hat{f}_a$ is a good estimate for the frequency $f_a$.
@@ -95,6 +93,10 @@ $\vert C \vert = \vert$keys($A$)$\vert \leq k - 1$, and
 a key-value pair can be stored in $\O(\log n + \log m)$ bits.
 \qed
 
+\subsection{The Count-Min sketch}
+
+We will now look at a randomized streaming algorithm that performs the same task
+
 \endchapter
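
For anyone checking the retypeset pseudocode against its intended behaviour, here is a minimal Python sketch of the FrequencyEstimate (Misra/Gries) procedure as it reads in the diff. The names `frequency_estimate` and `estimate` are illustrative assumptions and do not appear in the source. The map `A` becomes a plain `dict`, and the stream is assumed to be any iterable of hashable items.

```python
def frequency_estimate(stream, k):
    """Run the Misra/Gries procedure, keeping at most k - 1 counters.

    Returns the map A from the pseudocode (element -> counter).
    """
    counters = {}  # Init: A is an empty map
    for x in stream:  # Process(x)
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # Decrement every counter and drop those that reach zero.
            # Iterate over a copy of the keys because we delete while looping.
            for a in list(counters):
                counters[a] -= 1
                if counters[a] == 0:
                    del counters[a]
    return counters


def estimate(counters, a):
    """Output: the estimate is A[a] if a is a key of A, and 0 otherwise."""
    return counters.get(a, 0)


if __name__ == "__main__":
    A = frequency_estimate([1, 1, 2, 1, 3, 1, 4, 1], k=3)
    print(A, estimate(A, 1), estimate(A, 2))
```

On the stream `1, 1, 2, 1, 3, 1, 4, 1` with `k = 3`, the sketch ends with two counters and estimates 4 for the element `1`, whose true frequency is 5. That is an underestimate, as expected, since counters are only ever incremented on an occurrence of their own key.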