From cdaacc99c1191a2046ff2bd07ae28ec389f68e16 Mon Sep 17 00:00:00 2001
From: Parth Mittal <parth15069@iiitd.ac.in>
Date: Sun, 18 Apr 2021 14:47:37 +0530
Subject: [PATCH] wrote intro to streaming / frequent elements

---
 streaming/Makefile      |  3 +++
 streaming/streaming.tex | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)
 create mode 100644 streaming/Makefile
 create mode 100644 streaming/streaming.tex

diff --git a/streaming/Makefile b/streaming/Makefile
new file mode 100644
index 0000000..ba6c63e
--- /dev/null
+++ b/streaming/Makefile
@@ -0,0 +1,3 @@
+TOP=..
+
+include ../Makerules
diff --git a/streaming/streaming.tex b/streaming/streaming.tex
new file mode 100644
index 0000000..cf635cb
--- /dev/null
+++ b/streaming/streaming.tex
@@ -0,0 +1,41 @@
+\ifx\chapter\undefined
+\input adsmac.tex
+\singlechapter{20}
+\fi
+
+\chapter[streaming]{Streaming Algorithms}
+
+For this chapter, we will consider the streaming model. In this
+setting, the input is presented as a ``stream'' which we can read
+\em{in order}. In particular, at each step, we can do some processing,
+and then move forward one unit in the stream to read the next piece of data.
+We can choose to read the input again after completing a ``pass'' over it.
+
+There are two measures for the performance of algorithms in this setting.
+The first is the number of passes we make over the input, and the second is
+the amount of memory that we consume. Some interesting special cases are:
+\tightlist{o}
+\: 1 pass, and $O(1)$ memory: This is equivalent to computing with a DFA, and
+hence we can recognise only regular languages.
+\: 1 pass, and unbounded memory: We can store the entire stream, and hence this
+is just the traditional computing model.
+\endlist
+
+\section{Frequent Elements}
+
+For this problem, the input is a stream $\alpha[1 \ldots m]$ where each
+$\alpha[i] \in [n]$.
+We define for each $j \in [n]$ the \em{frequency} $f_j$ which counts
+the occurrences of $j$ in $\alpha[1 \ldots m]$. Then the majority problem
+is to find (if it exists) a $j$ such that $f_j > m / 2$.
+
+We consider the more general frequent elements problem, where we want to find
+$F_k = \{ j \mid f_j > m / k \}$. Suppose that we (magically) knew some small set
+$C$ which contains $F_k$. Then we can pass over the input once, counting how many
+times we see each member of $C$, and output exactly those with count above $m / k$.
+The challenge is to find a small $C$, which is precisely what the Misra/Gries
+Algorithm does.
+
+\subsection{Misra/Gries Algorithm}
+
+\endchapter
-- 
GitLab