Advanced Algorithms for Massive Datasets

About This Presentation
Title: Advanced Algorithms for Massive Datasets

Description: Advanced Algorithms for Massive Datasets. The power of failing. Prof. Paolo Ferragina, Algoritmi per

Transcript and Presenter's Notes

1
Advanced Algorithms for Massive Datasets
  • The power of failing

7
2
TTT
8
Not perfectly true but...
9
We do have an explicit formula for the optimal k: k = (m/n) ln 2
For m/n = 8, opt k ≈ 5.545
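The optimal k can be checked numerically. The sketch below assumes the usual approximation for the false-positive rate, (1 - e^(-kn/m))^k, with m bits and n keys; the function names are illustrative and not taken from the slides.

```python
import math


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false-positive rate of a Bloom filter: (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_k(m: int, n: int) -> float:
    """Real-valued k that minimizes the approximation above: (m/n) * ln 2."""
    return (m / n) * math.log(2)


if __name__ == "__main__":
    n = 1000          # number of keys (arbitrary example value)
    m = 8 * n         # m/n = 8 bits per key, as on the slide
    print("optimal k:", optimal_k(m, n))
    # k must be an integer in practice, so compare the neighbours of the optimum.
    for k in (4, 5, 6, 7):
        print(k, false_positive_rate(m, n, k))
```

Since k has to be an integer, in practice one compares the two integers around the real-valued optimum; for m/n = 8 both give a false-positive rate of roughly 2% under this approximation.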
12
Other advantage: no key storage
13
Crawling
  • What data structures should we use to keep track
    of the visited URLs of a crawler?
  • URLs are long
  • Check should be very fast
  • Small errors are acceptable (at worst, a page is not crawled)

Bloom Filter over crawled URLs
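A minimal sketch of such a filter, for illustration only. The class name, the use of SHA-256, and the double-hashing trick to derive k positions are assumptions of this sketch, not details from the slides.

```python
import hashlib


class BloomFilter:
    """Bit array of m bits with k hash positions per key; stores no keys."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, key: str):
        # Derive k positions from two 64-bit values (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, key: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


if __name__ == "__main__":
    visited = BloomFilter(m=8 * 1000, k=6)   # sized for about 1000 URLs
    url = "http://example.org/a/very/long/path?with=parameters"
    if url not in visited:
        visited.add(url)                      # crawl the page, then record it
    print(url in visited)
```

The filter keeps only m bits regardless of how long the URLs are, and each check costs k hash positions. A false positive makes the crawler skip a page it has not actually visited, which is the small error the slide accepts.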
14
Anti-virus detection
  • D is a dictionary of virus checksums of some given
    length z. For each position i, check whether T[i, i+z-1] ∈ D
  • Brute-force check: O(|D| |F|) time
  • Trie check: O(z |F|) time
  • Better solution?
  • Build a BF on D.
  • Check T[i, i+z-1] ∈ D: if the BF answers YES,
    then warn the user or explicitly scan D

O(k |F|) or even better...
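The scan can be sketched as follows. This is a sketch under stated assumptions: the helper names, the SHA-256 based positions, and the one-byte-per-bit array are choices made here for brevity. It also hashes every window from scratch, which costs O(z) per position; reaching the O(k |F|) bound would need a rolling hash, which is not shown.

```python
import hashlib


def _positions(chunk: bytes, m: int, k: int):
    """k positions in [0, m) for a byte string, via double hashing (an assumption)."""
    digest = hashlib.sha256(chunk).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


def build_filter(signatures, m: int, k: int) -> bytearray:
    """Build a BF on D; one byte per bit keeps the sketch short."""
    bits = bytearray(m)
    for s in signatures:
        for p in _positions(s, m, k):
            bits[p] = 1
    return bits


def scan(data: bytes, signatures: set, z: int, m: int = 1024, k: int = 6):
    """Return every position i such that data[i:i+z] is in the dictionary."""
    bits = build_filter(signatures, m, k)
    hits = []
    for i in range(len(data) - z + 1):
        window = data[i:i + z]
        # Cheap test first: most windows are rejected by the filter.
        if all(bits[p] for p in _positions(window, m, k)):
            # BF answered YES: check D explicitly to rule out a false positive.
            if window in signatures:
                hits.append(i)
    return hits


if __name__ == "__main__":
    D = {b"EVIL", b"WORM"}                 # toy signatures of length z = 4
    print(scan(b"xxEVILyyWORMzz", D, z=4))
```

The explicit check against D runs only when the filter answers YES, so the expensive lookup is paid on true matches and on the rare false positives.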
16
Upper bounds
17
Upper bounds
18
Recurring minimum for improving the estimate 2
SBF
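A small sketch of the counting idea behind an SBF, assuming the minimum-selection estimate: every counter touched by a key is at least the key's true frequency, so the minimum over its counters is an upper bound. The class and method names are hypothetical, and only the recurrence test is shown; in the recurring-minimum scheme as usually described, keys whose minimum is unique are additionally tracked in a smaller secondary SBF, which is omitted here.

```python
import hashlib


class SpectralBloomFilter:
    """Counters instead of bits; estimates how many times a key was added."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _positions(self, key: str):
        # Distinct positions only, so a repeated position is not
        # mistaken for a recurring minimum.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return sorted({(h1 + i * h2) % self.m for i in range(self.k)})

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.counters[p] += 1

    def estimate(self, key: str) -> int:
        """Minimum counter: never below the true frequency of the key."""
        return min(self.counters[p] for p in self._positions(key))

    def has_recurring_minimum(self, key: str) -> bool:
        """True if the minimum value appears in at least two of the key's counters."""
        values = [self.counters[p] for p in self._positions(key)]
        return values.count(min(values)) >= 2


if __name__ == "__main__":
    sbf = SpectralBloomFilter(m=64, k=4)
    for word in ["a", "b", "a", "c", "a", "b"]:
        sbf.add(word)
    for word in ["a", "b", "c"]:
        print(word, sbf.estimate(word), sbf.has_recurring_minimum(word))
```

When the minimum recurs, the estimate is more likely to be exact, because several independent counters would all have to be inflated by the same amount for it to be wrong.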