datovky / ds2-notes
Commit 9882fc7d, authored May 08, 2021 by Parth Mittal
fixed typos, added another todo
Parent: 16df9964
Changes: 1 file
streaming/streaming.tex
...
...
@@ -106,7 +106,7 @@ can be easily combined.
 \:\em{Init}: $C[1 \ldots t][1 \ldots k] \= 0$, where $k \= \lceil 2 / \varepsilon \rceil$ and $t \= \lceil \log(1 / \delta) \rceil$.
-\:: Choose $t$ independent hash functions $h_1, \ldots h_t : [n] \to [k]$, each
+\:: Choose $t$ independent hash functions $h_1, \ldots, h_t : [n] \to [k]$, each
 from a 2-independent family.
 \:\em{Process}($x$):
 \::For $i \in [t]$: $C[i][h_i(x)] \= C[i][h_i(x)] + 1$.
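The Init and Process steps in this hunk can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the notes' own code: items are taken to be non-negative integers, $\log$ is read as base 2, the hash family is a Carter-Wegman-style linear hash reduced mod $k$ (only approximately 2-independent after the final reduction), and the `query` method uses the standard Count-Min rule of taking the minimum over rows, since the query step is not shown in this hunk. The names `CountMinSketch`, `PRIME`, and `seed` are hypothetical.

```python
import math
import random


class CountMinSketch:
    """Sketch of the t-by-k counter table C from the Init/Process steps."""

    # Assumption: a Mersenne prime larger than any item we will hash.
    PRIME = (1 << 61) - 1

    def __init__(self, eps, delta, seed=None):
        self.k = math.ceil(2 / eps)                # k = ceil(2 / eps)
        self.t = math.ceil(math.log2(1 / delta))   # t = ceil(log(1 / delta)), base 2 assumed
        rng = random.Random(seed)
        # One (a, b) pair per row: h_i(x) = ((a*x + b) mod PRIME) mod k.
        self.coeffs = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(self.t)
        ]
        self.C = [[0] * self.k for _ in range(self.t)]

    def _h(self, i, x):
        a, b = self.coeffs[i]
        return ((a * x + b) % self.PRIME) % self.k

    def process(self, x):
        # Process(x): increment one counter in each of the t rows.
        for i in range(self.t):
            self.C[i][self._h(i, x)] += 1

    def query(self, x):
        # Assumed query rule (not part of this hunk): minimum over rows.
        return min(self.C[i][self._h(i, x)] for i in range(self.t))
```

Because every occurrence of `x` increments its own cell in each row, `query(x)` never underestimates the true frequency; collisions can only push it upward.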
...
...
@@ -114,7 +114,7 @@ can be easily combined.
 \endalgo
 Note that the algorithm needs $\O(tk \log m)$ bits to store the table $C$, and
-$\O(t \log n)$ bits to store the hash functions $h_1, \ldots h_t$, and hence
+$\O(t \log n)$ bits to store the hash functions $h_1, \ldots, h_t$, and hence
 uses $\O(1/\varepsilon \cdot \log(1 / \delta) \cdot \log m + \log(1 / \delta) \cdot \log n)$ bits. It remains to show that it computes
 a good estimate.
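To make the table-size term concrete, here is a worked instance with illustrative parameters. The values $\varepsilon = 0.01$, $\delta = 0.001$, and the bound $m < 2^{32}$ on the stream length are assumptions chosen for the example, as is reading $\log$ as base 2; the $\O(\cdot)$ in the text hides constants that this arithmetic fixes arbitrarily.

```latex
% Worked instance of the table-size term O(tk log m).
% Assumptions: log is base 2, and m < 2^{32}, so each counter fits in 32 bits.
% Requires amsmath for align* and \text.
\begin{align*}
k  &= \lceil 2/\varepsilon \rceil = \lceil 2/0.01 \rceil = 200, \\
t  &= \lceil \log_2(1/\delta) \rceil = \lceil \log_2 1000 \rceil = 10, \\
tk &= 2000 \text{ counters}, \qquad
      2000 \cdot 32 = 64\,000 \text{ bits} = 8000 \text{ bytes}.
\end{align*}
```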
...
...
@@ -298,7 +298,7 @@ Recall that $\E[Y_r] = d / 2^r$, so the terms in the first sum can be bounded
 using Chebyshev's inequality. The second sum is equal to the probability of
 the event $[t \geq s]$, that is, the event $Y_{s - 1} \geq c / \varepsilon^2$
 (since $z$ is only increased when $B$ becomes larger than this threshold).
-We will simply use Markov's inequality to bound this event.
+We will use Markov's inequality to bound the probability of this event.
 Putting it all together, we have:
 $$
 \eqalign{
...
...
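The Markov step mentioned in this hunk can be written out as below. This is a sketch of the single inequality only, assuming $Y_{s-1}$ is a non-negative random variable (it is treated as a count here) and using $\E[Y_r] = d / 2^r$ from the hunk header; the full displayed computation is elided in the diff and is not reconstructed.

```latex
% Markov's inequality applied to Y_{s-1} with threshold c / \varepsilon^2.
% \E is the expectation macro used in the source file.
\[
\Pr\left[ Y_{s-1} \geq \frac{c}{\varepsilon^2} \right]
  \;\leq\; \frac{\E[Y_{s-1}]}{c / \varepsilon^2}
  \;=\; \frac{d}{2^{s-1}} \cdot \frac{\varepsilon^2}{c}
  \;=\; \frac{\varepsilon^2 d}{c \, 2^{s-1}}.
\]
```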
@@ -327,7 +327,12 @@ The counter $z$ requires only $\O(\log \log n)$ bits, and $B$ has
 $\O(1 / \varepsilon^2)$ entries, each of which needs $\O(\log n)$ bits.
 Finally, the hash function $h$ needs $\O(\log n)$ bits, so the total space
 used is dominated by $B$, and the algorithm uses $\O(\log n / \varepsilon^2)$
-space.
+space. As before, if we use the median trick, the space used increases to
+$\O(\log\delta \cdot \log n / \varepsilon^2)$.
+(TODO: include the version of this algorithm where we save space by storing
+$(g(a), {\tt tz}(h(a)))$ instead of $(a, {\tt tz}(h(a)))$ in $B$ for some
+hash function $g$ as an exercise?)
 \endchapter
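The median trick referred to in the added lines can be sketched generically: run several independent copies of an estimator and report the median of their outputs. The code below is an illustration under assumptions, not the notes' construction. `run_estimator` is a hypothetical callable that runs one independent copy from a seed, the constant `c` is a hypothetical stand-in for the constant hidden in the $\O(\cdot)$, and $\log$ is read as base 2. Since $\delta < 1$ makes $\log \delta$ negative, the multiplicative factor in the added line is presumably meant as $\log(1/\delta)$, which is what the sketch uses for the number of copies.

```python
import math
import statistics


def median_trick(run_estimator, delta, c=1):
    """Return the median of about c * log2(1/delta) independent estimates.

    run_estimator(seed) is assumed to run one independent copy of the
    underlying streaming algorithm and return its numeric estimate.
    """
    copies = max(1, math.ceil(c * math.log2(1 / delta)))
    estimates = [run_estimator(seed) for seed in range(copies)]
    return statistics.median(estimates)
```

The median is used rather than the mean because a minority of wildly wrong copies cannot move it: as long as more than half of the copies land near the true value, so does the median. Each copy keeps its own state, which is why the space grows by the number of copies.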
...
...