Commit cdaacc99 authored 4 years ago by Parth Mittal
wrote intro to streaming / frequent elements
parent
66411d7c
streaming/Makefile: 3 additions, 0 deletions
streaming/streaming.tex: 41 additions, 0 deletions
Total: 44 additions and 0 deletions
streaming/Makefile (0 → 100644, +3 −0)
TOP=..
include ../Makerules
streaming/streaming.tex (0 → 100644, +41 −0)
\ifx\chapter\undefined
\input adsmac.tex
\singlechapter{20}
\fi

\chapter[streaming]{Streaming Algorithms}
For this chapter, we will consider the streaming model. In this
setting, the input is presented as a ``stream'' which we can read
\em{in order}. In particular, at each step, we can do some processing,
and then move forward one unit in the stream to read the next piece of data.
We can choose to read the input again after completing a ``pass'' over it.
There are two measures for the performance of algorithms in this setting.
The first is the number of passes we make over the input, and the second is
the amount of memory that we consume. Some interesting special cases are:
\tightlist{o}
\: 1 pass, and $O(1)$ memory: This is equivalent to computing with a DFA, and
hence we can recognise only regular languages.
\: 1 pass, and unbounded memory: We can store the entire stream, and hence this
is just the traditional computing model.
\endlist
\section{Frequent Elements}
For this problem, the input is a stream $\alpha[1 \ldots m]$ where each
$\alpha[i] \in [n]$.
We define for each $j \in [n]$ the \em{frequency} $f_j$ which counts
the occurrences of $j$ in $\alpha[1 \ldots m]$. Then the majority problem
is to find (if it exists) a $j$ such that $f_j > m / 2$.
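As a small illustration (the stream here is our own example, chosen only to
make the definitions concrete), take $n = 3$, $m = 5$ and
$\alpha = (2, 1, 2, 3, 2)$. Then
$$ f_1 = 1, \quad f_2 = 3, \quad f_3 = 1, $$
and since $f_2 = 3 > 5/2$, the element $2$ is the majority. In contrast, the
stream $(1, 2, 1, 2)$ has no majority, because $f_1 = f_2 = 2$ is not strictly
greater than $m/2 = 2$.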
We consider the more general frequent elements problem, where we want to find
$F_k = \{ j \mid f_j > m / k \}$. Suppose that we (magically) knew some small set
$C$ which contains $F_k$. Then we can pass over the input once, keeping track of
how many times we see each member of $C$, and then find $F_k$ easily.
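To make this counting pass concrete, here is a small worked example; the
stream and the candidate set are our own illustrative choices. Let $n = 4$,
$m = 8$, $k = 3$ and $\alpha = (1, 2, 1, 3, 1, 2, 4, 2)$, so the threshold is
$m/k = 8/3$. Suppose we are handed $C = \{1, 2, 3\}$. One pass with a counter
for each member of $C$ yields
$$ f_1 = 3, \quad f_2 = 3, \quad f_3 = 1, $$
and keeping only the counters that exceed $8/3$ gives $F_3 = \{1, 2\}$.
This pass needs $|C|$ counters rather than $n$. Note also that $F_k$ always has
fewer than $k$ elements, since the frequencies sum to $m$, so a small $C$ is at
least conceivable.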
The challenge is to find a small $C$, which is precisely what the Misra/Gries
Algorithm does.

\subsection{Misra/Gries Algorithm}

\endchapter