Title: User Interfaces and Algorithms for Fighting Phishing
1User Interfaces and Algorithms for Fighting Phishing
Jason I. Hong, Carnegie Mellon University
2Everyday Privacy and Security Problem
3This entire process is known as phishing
4Phishing is a Plague on the Internet
- Estimated 3.5 million people have fallen for phishing
- Estimated $350m-$2b direct losses a year
- 9255 unique phishing sites reported in June 2006
- Easier (and safer) to phish than rob a bank
5Project Supporting Trust Decisions
- Goal: help people make better online trust decisions
- Currently focusing on anti-phishing
- Large multi-disciplinary team project at CMU
- Six faculty, five PhD students, undergrads, staff
- Computer science, human-computer interaction, public policy, social and decision sciences, CERT
6Our Multi-Pronged Approach
- Human side
- Interviews to understand decision-making
- PhishGuru embedded training
- Anti-Phishing Phil game
- Understanding effectiveness of browser warnings
- Computer side
- PILFER email anti-phishing filter
- CANTINA web anti-phishing algorithm
Automate where possible, support where necessary
7Our Multi-Pronged Approach
- Human side
- Interviews to understand decision-making
- PhishGuru embedded training
- Anti-Phishing Phil game
- Understanding effectiveness of browser warnings
- Computer side
- PILFER email anti-phishing filter
- CANTINA web anti-phishing algorithm
What do users know about phishing?
8Interview Study
- Interviewed 40 Internet users (35 non-experts)
- Mental models interviews included email role play and open-ended questions
- Brief overview of results (see paper for details)
- J. Downs, M. Holbrook, and L. Cranor. Decision Strategies and Susceptibility to Phishing. In Proceedings of the 2006 Symposium On Usable Privacy and Security, 12-14 July 2006, Pittsburgh, PA.
9Little Knowledge of Phishing
- Only about half knew meaning of the term "phishing"
- "Something to do with the band Phish, I take it."
10Little Attention Paid to URLs
- Only 55% of participants said they had ever noticed an unexpected or strange-looking URL
- Most did not consider them to be suspicious
11Some Knowledge of Scams
- 55% of participants reported being cautious when email asks for sensitive financial info
- But very few reported being suspicious of email asking for passwords
- Knowledge of financial phish reduced likelihood of falling for these scams
- But did not transfer to other scams, such as an amazon.com password phish
12Naive Evaluation Strategies
- The most frequent strategies don't help much in identifying phish
- "This email appears to be for me"
- "It's normal to hear from companies you do business with"
- "Reputable companies will send emails"
- "I will probably give them the information that they asked for. And I would assume that I had already given them that information at some point so I will feel comfortable giving it to them again."
13Summary of Findings
- People generally not good at identifying scams they haven't specifically seen before
- People don't use good strategies to protect themselves
- Currently running large-scale survey across multiple cities in the US to gather more data
- Amazon also active in looking for fake domain names
14Our Multi-Pronged Approach
- Human side
- Interviews to understand decision-making
- PhishGuru embedded training
- Anti-Phishing Phil game
- Understanding effectiveness of browser warnings
- Computer side
- PILFER email anti-phishing filter
- CANTINA web anti-phishing algorithm
Can we train people not to fall for phish?
15Web Site Training Study
- Laboratory study of 28 non-expert computer users
- Asked participants to evaluate 20 web sites
- Control group evaluated 10 web sites, took 15 min break to read email or play solitaire, evaluated 10 more web sites
- Experimental group same as above, but spent 15 min break reading web-based training materials
- Experimental group performed significantly better identifying phish after training
- Less reliance on "professional-looking" designs
- Looking at and understanding URLs
- Web site asks for too much information
People can learn from web-based training materials, if only we could get them to read them!
16How Do We Get People Trained?
- Most people don't proactively look for training materials on the web
- Companies send "security notice" emails to employees and/or customers
- We hypothesized these tend to be ignored
- Too much to read
- People don't consider them relevant
- People think they already know how to protect themselves
- Led us to idea of embedded training
17Embedded Training
- Can we train people during their normal use of email to avoid phishing attacks?
- Periodically, people get sent a training email
- Training email looks like a phishing attack
- If person falls for it, intervention warns and highlights what cues to look for in succinct and engaging format
- P. Kumaraguru, Y. Rhee, A. Acquisti, L. Cranor, J. Hong, and E. Nunge. Protecting People from Phishing: The Design and Evaluation of an Embedded Training Email System. CHI 2007.
18Embedded training example
Subject: Revision to Your Amazon.com Information
Please login and enter your information
http://www.amazon.com/exec/obidos/sign-in.html
19Intervention 1: Diagram
20Intervention 1: Diagram
Explains why they are seeing this message
21Intervention 1: Diagram
Explains what a phishing scam is
22Intervention 1: Diagram
Explains how to identify a phishing scam
23Intervention 1: Diagram
Explains simple things you can do to protect self
24Intervention 2: Comic Strip
25Embedded Training Evaluation 1
- Lab study comparing our prototypes to standard security notices
- eBay, PayPal notices
- Intervention 1: Diagram that explains phishing
- Intervention 2: Comic strip that tells a story
- 10 participants in each condition (30 total)
- Screened so we only have novices
- Go through 19 emails, 4 phishing attacks scattered throughout, 2 training emails too
- Role play as Bobby Smith at Cognix Inc
26Embedded Training Results
- Existing practice of security notices is ineffective
- Diagram intervention somewhat better
- Comic strip intervention worked best
- Statistically significant
- Combination of less text, graphics, story?
27Evaluation 2
- New questions
- Have to fall for phishing email to be effective?
- How well do people retain knowledge?
- Roughly same experiment as before
- Role play as Bobby Smith at Cognix Inc, go through 16 emails
- Embedded condition means have to fall for our email
- Non-embedded means we just send the comic strip
- Had people come back after 1 week
- Improved design of comic strip intervention
- To appear in APWG eCrime Researchers Summit (Oct 4-5 at CMU)
29Results of Evaluation 2
- Have to fall for phishing email to be effective?
- How well do people retain knowledge after a week?
Correctness
31Anti-Phishing Phil
- A game to teach people not to fall for phish
- Embedded training focuses on email
- Our game focuses on web browser, URLs
- Goals
- How to parse URLs
- Where to look for URLs
- Use search engines for help
- Try the game!
- http://cups.cs.cmu.edu/antiphishing_phil
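A rough illustration of the "how to parse URLs" goal, not taken from the game itself: the sketch below uses Python's standard `urllib.parse` to pull the hostname out of a URL and then guesses the owning domain from the last two labels. The helper name `host_parts` and the last-two-labels rule are assumptions made for this example; the rule is wrong for suffixes such as .co.uk and for IP-address hosts, where a real tool would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def host_parts(url: str) -> tuple[str, str]:
    """Return (full hostname, naive last-two-label domain) for a URL."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Naive assumption: the registered domain is the last two labels.
    # This breaks for suffixes like .co.uk and for IP-address hosts.
    domain = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return host, domain


if __name__ == "__main__":
    for url in (
        "http://www.amazon.com/exec/obidos/sign-in.html",
        "http://www.paypal.update.example.com/update.cgi",
    ):
        print(url, "->", host_parts(url))
```

The second URL shows the cue the game is trying to teach: a familiar brand name appears early in the hostname, but the part that decides who owns the site is at the end.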
32Anti-Phishing Phil
38Evaluation of Anti-Phishing Phil
- Test participants' ability to identify phishing web sites before and after training up to 15 min
- 10 web sites before training, 10 after, randomized order
- Three conditions
- Web-based phishing education
- Printed tutorial of our materials
- Anti-phishing Phil
- 14 participants in each condition
- Screened out security experts
- Younger, college students
39Results
- No statistically significant difference in false negatives among the three groups
- Actually a phish, but participant thinks it's not
- Unsure why, considering a larger online study
- Though game group had fewest false positives
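To make the two error types concrete, here is a minimal sketch of how they could be tallied from a participant's answers, using the definition above (a false negative is a real phish judged not to be one). The function name `count_errors` and the pair-based input format are assumptions for illustration; the study's actual scoring procedure is not described in these slides.

```python
def count_errors(judgments):
    """Count (false negatives, false positives).

    judgments: iterable of (is_phish, judged_phish) boolean pairs,
    one pair per web site a participant evaluated.
    """
    judgments = list(judgments)  # allow generators to be scanned twice
    # False negative: actually a phish, but the participant thinks it is not.
    false_neg = sum(1 for actual, judged in judgments if actual and not judged)
    # False positive: actually legitimate, but the participant calls it a phish.
    false_pos = sum(1 for actual, judged in judgments if not actual and judged)
    return false_neg, false_pos


if __name__ == "__main__":
    answers = [(True, True), (True, False), (False, True), (False, False)]
    print(count_errors(answers))
```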
42Our Multi-Pronged Approach
- Human side
- Interviews to understand decision-making
- PhishGuru embedded training
- Anti-Phishing Phil game
- Understanding effectiveness of browser warnings
- Computer side
- PILFER email anti-phishing filter
- CANTINA web anti-phishing algorithm
Do people see, understand, and believe web
browser warnings?
43Screenshots
Internet Explorer Passive Warning
44Screenshots
Internet Explorer Active Block
45Screenshots
Mozilla Firefox Active Block
46How Effective are these Warnings?
- We tested four conditions
- Firefox Active Block
- IE Active Block
- IE Passive Warning
- Control (no warnings or blocks)
- Shopping Study
- Set up some fake phishing pages and added to blacklists
- Users were phished after purchases
- Real email accounts and personal information
- Spoofing eBay and Amazon (2 phish/user)
- We observed them interact with the warnings
47How Effective are these Warnings?
48Improving Phishing Indicators
- Passive warning failed for many reasons
- Didn't interrupt the main task
- Wasn't clear what the right action was
- Looked too much like other ignorable warnings
- Now looking at science of warnings
- How to create effective security warnings
49Our Multi-Pronged Approach
- Human side
- Interviews to understand decision-making
- PhishGuru embedded training
- Anti-Phishing Phil game
- Understanding effectiveness of browser warnings
- Computer side
- PILFER email anti-phishing filter
- CANTINA web anti-phishing algorithm
Can we automatically detect phish emails?
50PILFER Email Anti-Phishing Filter
- Philosophy: automate where possible, support where necessary
- Goal: Create email filter that detects phishing emails
- Spam filters well-explored, but how good for phishing?
- Can we create a custom filter for phishing?
- I. Fette, N. Sadeh, A. Tomasic. Learning to Detect Phishing Emails. In WWW 2007.
51PILFER Email Anti-Phishing Filter
- Heuristics combined in SVM
- IP addresses in link (http://128.23.34.45/blah)
- Age of linked-to domains (younger domains likely phishing)
- Non-matching URLs (ex. most links point to PayPal)
- "Click here to restore your account"
- HTML email
- Number of links
- Number of domain names in links
- Number of dots in URLs (http://www.paypal.update.example.com/update.cgi)
- JavaScript
- SpamAssassin rating
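A few of the link-based heuristics above can be sketched with the Python standard library alone. This is an illustrative approximation, not PILFER's code: the names `link_features` and `_LinkCollector` are made up for this example, distinct hostnames stand in for "domain names", and features that need outside data (domain age, SpamAssassin rating) or a trained classifier are left out.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _LinkCollector(HTMLParser):
    """Collect href targets of <a> tags and note any <script> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.has_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)
        elif tag == "script":
            self.has_script = True


def _is_ip(host):
    """True if host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def link_features(html_body):
    """Return a dict of simple link features for one HTML email body."""
    parser = _LinkCollector()
    parser.feed(html_body)
    hosts = [urlsplit(h).hostname or "" for h in parser.hrefs]
    return {
        "num_links": len(parser.hrefs),
        # Assumption: distinct hostnames approximate "domain names in links".
        "num_distinct_hosts": len({h for h in hosts if h}),
        # Largest number of dots found in any single link URL.
        "max_dots": max((u.count(".") for u in parser.hrefs), default=0),
        "has_ip_link": any(_is_ip(h) for h in hosts if h),
        "has_javascript": parser.has_script,
    }
```

The resulting dictionary is the kind of feature vector that could then be handed to a classifier such as the SVM mentioned above.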
52PILFER Evaluation
- Ham corpora from SpamAssassin (2002 and 2003)
- 6950 good emails
- Phishingcorpus
- 860 phishing emails
53PILFER Evaluation
54PILFER Evaluation
- PILFER now implemented as SpamAssassin filter
- Alas, Ian has left for Google
55Our Multi-Pronged Approach
- Human side
- Interviews to understand decision-making
- PhishGuru embedded training
- Anti-Phishing Phil game
- Understanding effectiveness of browser warnings
- Computer side
- PILFER email anti-phishing filter
- CANTINA web anti-phishing algorithm
Can we do better in automatically detecting phish web sites?
57Lots of Phish Detection Algorithms
- Dozens of anti-phishing toolbars offered
- Built into security software suites
- Offered by ISPs
- Free downloads
- Built into latest version of popular web browsers
- 132 on download.com
- But how well do they detect phish?
- Short answer: still room for improvement
58Testing the Toolbars
- November 2006: Automated evaluation of 10 toolbars
- Used phishtank.com and APWG as source of phishing URLs
- Evaluated 100 phish and 510 legitimate sites
- Y. Zhang, S. Egelman, L. Cranor, J. Hong. Phinding Phish: An Evaluation of Anti-Phishing Toolbars. NDSS 2007.
59Testbed System Architecture
60Results
PhishTank (chart callouts: 38% false positives; 1% false positives)
61APWG
63Results
- Only one toolbar >90% accuracy (but high false positives)
- Several catch 70-85% of phish with few false positives
- Can we do better?
- Can we use search engines to help find phish?
- Y. Zhang, J. Hong, L. Cranor. CANTINA: A Content-Based Approach to Detecting Phishing Web Sites. In WWW 2007.
64Robust Hyperlinks
- Developed by Phelps and Wilensky to solve "404 not found" problem
- Key idea was to add a lexical signature to URLs that could be fed to a search engine if URL failed
- Ex. http://abc.com/page.html?sig=word1+word2+...+word5
- How to generate signature?
- Found that TF-IDF was fairly effective
- Informal evaluation found five words was sufficient for most web pages
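For reference, one common form of the TF-IDF weight is given below. The slides do not say which variant was used, so this is a standard textbook definition offered as an assumption, with tf(t, d) the number of times term t occurs in page d, N the number of documents in the reference corpus, and df(t) the number of those documents that contain t.

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log\frac{N}{\mathrm{df}(t)}
```

Under this definition a word scores highly when it is frequent on the page but rare across the corpus, which is why a handful of top-weighted words can act as a signature for the page.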
65Adapting TF-IDF for Anti-Phishing
- Can same basic approach be used for anti-phishing?
- Scammers often directly copy web pages
- With Google search engine, fake should have low page rank
Fake
Real
66How CANTINA Works
- Given a web page, calculate TF-IDF score for each word in that page
- Take five words with highest TF-IDF weights
- Feed these five words into a search engine (Google)
- If domain name of current web page is in top N search results, we consider it legitimate
- N = 30 worked well
- No improvement by increasing N
- Later, added some heuristics to reduce false positives
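The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the CANTINA implementation: `top_terms` and `looks_legitimate` are hypothetical names, the search engine is a caller-supplied function `search(query)` returning a list of result URLs (no real search API is called), hostnames are compared in place of registered domain names, the IDF is a smoothed variant, and the later false-positive heuristics are omitted.

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit


def top_terms(page_text, doc_freq, num_docs, k=5):
    """Return the k terms of page_text with the highest TF-IDF weight.

    doc_freq maps a term to the number of corpus documents containing it.
    """
    words = re.findall(r"[a-z]+", page_text.lower())
    tf = Counter(words)

    def weight(term):
        # Smoothed IDF so terms missing from doc_freq do not divide by zero.
        idf = math.log((1 + num_docs) / (1 + doc_freq.get(term, 0)))
        return tf[term] * idf

    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(tf, key=lambda t: (-weight(t), t))[:k]


def looks_legitimate(page_url, page_text, search, doc_freq, num_docs, n=30):
    """Return True if the page's host appears in the top-n search results.

    search: hypothetical callable taking a query string and returning a
    list of result URLs, best match first.
    """
    query = " ".join(top_terms(page_text, doc_freq, num_docs))
    page_host = urlsplit(page_url).hostname
    result_hosts = {urlsplit(u).hostname for u in search(query)[:n]}
    return page_host is not None and page_host in result_hosts
```

Passing the search function in as a parameter keeps the sketch testable offline and avoids assuming anything about a particular search engine's API.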
67Fake
eBay, user, sign, help, forgot
68Real
eBay, user, sign, help, forgot
71Evaluating CANTINA
PhishTank
72Summary
- Whirlwind tour of our work on anti-phishing
- Human side: how people make decisions, training, UIs
- Computer side: better algorithms for detecting phish
- More info about our work at cups.cs.cmu.edu
73Acknowledgments
- Alessandro Acquisti
- Lorrie Cranor
- Sven Dietrich
- Julie Downs
- Mandy Holbrook
- Norman Sadeh
- Anthony Tomasic
- Serge Egelman
- Ian Fette
- Ponnurangam Kumaraguru
- Bryant Magnien
- Elizabeth Nunge
- Yong Rhee
- Steve Sheng
- Yue Zhang
Supported by NSF, ARO, CyLab, Portugal Telecom
74http://cups.cs.cmu.edu/
CMU Usable Privacy and Security Laboratory
76Embedded Training Results
77Is it legitimate    Our label: Yes    Our label: No
Yes                   True positive     False positive
No                    False negative    True negative
81Minimal Knowledge of Lock Icon
- "I think that it means secured, it symbolizes some kind of security, somehow."
- 85% of participants were aware of lock icon
- Only 40% of those knew that it was supposed to be in the browser chrome
- Only 35% had noticed https, and many of those did not know what it meant