Online Data Collection - PowerPoint PPT Presentation

Slides: 35
Provided by: donkr

Transcript and Presenter's Notes


1
Online Data Collection
Web Systems Architecture Protocols, UCSC Santa Cruz Extension
Don Krafft, don_at_donkrafft.com
Jing Luan, Ph.D., jing_at_cabrillo.cc.ca.us
2
Overview
  • Online Data Collection
  • Introduction
  • Voluntary Collection
  • Online Forms
  • Data Gathering
  • Data Manipulation
  • Data Usage
  • Involuntary Collection
  • Cookies
  • Web Bugs
  • Spyware
  • Solutions

3
Introduction
  • The web is a source of information
  • In order to exploit information, we sometimes
    have to provide it
  • Information provided is often the cost of using
    the internet
  • Information is used to provide services
  • Information is used for marketing purposes to
    generate income
  • Information collection is a major part of web
    activity and warrants special study

4
Part I
5
Voluntary Section
  • Voluntary Collection
  • Data Gathering
  • Online Forms
  • Data Manipulation
  • Data Usage
  • Scope of this presentation

6
Modes of Voluntary Data Collection
  • Types of online voluntary data collection
  • e-commerce (Travel, books, eBay)
  • Membership (AOL, Earthlink, Portal sites)
  • Survey (product follow-up, CRM, freebies)
  • Services (DMV, library)
  • Web surfing/FTP
  • others

11
Form Technology
[Diagram: Internet, Data Holding Medium (db)]
  • Web
  • Design Tools
  • FrontPage
  • Dreamweaver
  • ColdFusion
  • Web Server
  • NT
  • Unix
  • CGI
  • FrontPage Extensions

13
Online Data Usage
  • CRM (how did we do?)
  • Profiling (what you are likely to buy)
  • Marketing (what appears in your email and elsewhere)
  • Info. Sharing (beyond mailing list)
  • Tracking (related to profiling and marketing)
  • Legal/Criminal (digital trail)
  • Removal of Barriers (Hello, welcome back!)

14
Part II
15
Involuntary Data Collection
  • Involuntary Data Collection
  • Cookie files
  • Web Bugs
  • Spyware
  • Protecting Yourself
  • Easy
  • More advanced
  • Scope is intrusive but not malicious data
    collection
  • This briefing deals primarily with MS Windows

16
Dubious Connections
  • The data collection game is getting aggressive
  • Online advertising is not lucrative, but personal
    information collection is
  • The data collectors
  • have access to your computer
  • can have access to your system's registry
  • can download and execute software on your
    computer
  • can correlate data collected online with offline
    databases
  • are aggressive
  • Persistence and common sense will provide
    reasonable protection from undue intrusion

17
Involuntary Data Collection
If they do it without telling you about it, then
there's probably a privacy concern (there's a
reason they don't tell you)
18
Cookie Files
  • What is a cookie?
  • A cookie (or persistent cookie) is a small data
    file created by a Web server that, with a
    browser's cooperation, is stored on a user's
    computer
  • Cookies can contain personal information as well
    as users' Web site preferences
  • The cookies contain a range of URLs for which
    they are valid.
  • How are cookies used?
  • Cookies provide a way for a Web site to keep
    track of a user's patterns and preferences
  • A cookie could, for example, save a person from
    typing in the same password and other information
    all over again on subsequent site visits
  • When the browser encounters URLs for which it has
    cookies, it sends those specific cookies to the
    Web server
  • Cookies can also be used to track a user's
    movements on the web and possibly to share
    personal information with third parties

19
Cookie Files
  • How are cookies set?
  • Set-Cookie: NAME=VALUE; expires=DATE; path=PATH;
    domain=DOMAIN_NAME; secure
  • Example: setting and picking up a cookie
  • Client requests a document, and receives in the
    response from the server
  • Set-Cookie: CUSTOMER=WILE_E_COYOTE; path=/;
    expires=Wednesday, 09-Nov-99 23:12:40 GMT
  • When client requests a URL in path "/" on this
    server, the client sends
  • Cookie: CUSTOMER=WILE_E_COYOTE
  • Example from www.netscape.com
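
The set-and-return exchange above can be sketched in a few lines of Python. This is only an illustration: the function names `parse_set_cookie` and `cookie_header` are hypothetical, and the sketch checks the path alone, ignoring the expires, domain, and secure attributes that a real browser would also honor.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes).

    Attributes such as path or expires are returned in a dict keyed by
    lowercase name; a flag with no value (e.g. secure) maps to True.
    """
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")
    attrs = {}
    for part in parts[1:]:
        key, sep, val = part.partition("=")
        attrs[key.strip().lower()] = val.strip() if sep else True
    return name.strip(), value.strip(), attrs


def cookie_header(name, value, attrs, request_path):
    """Return the Cookie header the client would send for request_path.

    Simplification (an assumption of this sketch): only the path
    attribute is checked, using a plain prefix match.
    """
    if request_path.startswith(attrs.get("path", "/")):
        return f"Cookie: {name}={value}"
    return None
```

Feeding the slide's example value, `CUSTOMER=WILE_E_COYOTE; path=/; expires=Wednesday, 09-Nov-99 23:12:40 GMT`, to `parse_set_cookie` and then asking `cookie_header` about any path under "/" yields the `Cookie: CUSTOMER=WILE_E_COYOTE` line shown above.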

20
Cookie Files
  • Scenario 1
  • Client visits Site A
  • Marketing Site B is called to place an ad on a
    page and at that time places a cookie on the
    client for later retrieval
  • Marketing Site B picks up the cookie whenever any
    of its partner sites are visited, thus tracking
    the client
  • Scenario 2
  • Site A collects the client's personal data,
    provided voluntarily
  • Site A places a cookie on the client addressed to
    Site B
  • Called a 3rd-party or illegal cookie (violation
    of the cookie spec)
  • Site B is called to place an ad and picks up the
    cookie containing personal information
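
Scenario 1 can be simulated without any network traffic. The sketch below is a toy model, not a description of any real ad network: the `AdServer` and `Browser` classes, the `visitor-N` identifier format, and the page names are all invented for illustration.

```python
import itertools


class AdServer:
    """Hypothetical stand-in for 'Marketing Site B'."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.history = {}  # visitor id -> pages on which an ad was served

    def serve_ad(self, cookie, referring_page):
        """Handle one ad request and return the cookie value to store.

        A client with no cookie gets a fresh identifier; a returning
        client is recognized by the identifier it sends back.
        """
        visitor_id = cookie if cookie is not None else f"visitor-{next(self._ids)}"
        self.history.setdefault(visitor_id, []).append(referring_page)
        return visitor_id


class Browser:
    """Minimal client that stores a single cookie for the ad server."""

    def __init__(self):
        self.cookie_for_b = None

    def visit(self, page, ad_server):
        # Loading the page triggers a request to the ad server, which
        # receives any existing cookie and may set or refresh it.
        self.cookie_for_b = ad_server.serve_ad(self.cookie_for_b, page)
```

After one `Browser` visits pages on two different partner sites, `AdServer.history` holds both pages under the same visitor identifier, which is the tracking effect the scenario describes.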

21
Index.dat
  • Internet Explorer maintains a Browser Index file
  • It's a binary file that contains the user's
    cookies (or a summary of them) and a history of
    URLs visited
  • This file retains the data after the cookies have
    been deleted
  • This file cannot be deleted from within Windows
    (can be deleted from DOS)
  • Microsoft tech support pages contain no
    information about it
  • I can't determine who has read-access to this file

22
Browser Data
  • What's available to any site from your web
    browser
  • IP address (this may be dynamic)
  • Time
  • Last site visited
  • Next site visited
  • Browser type
  • Operating system
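
Most of the items above reach the server with every request: the IP address comes from the connection itself, while the previous page and the browser and operating system arrive in the standard Referer and User-Agent request headers. A minimal sketch of what a server could record follows; the function name `browser_data` and the shape of the returned dict are assumptions, and the headers are taken as a plain dict with exactly these key spellings. "Next site visited" is left out because it is not carried in the request headers.

```python
from datetime import datetime, timezone


def browser_data(client_ip, headers, when=None):
    """Collect what a single HTTP request reveals about the client.

    client_ip: address of the connecting socket (may be dynamic).
    headers:   request headers as a dict (hypothetical input format).
    when:      time of the request; defaults to the current UTC time.
    """
    when = when or datetime.now(timezone.utc)
    return {
        "ip_address": client_ip,
        "time": when.isoformat(),
        # Referer names the page that linked to or embedded this request.
        "last_site_visited": headers.get("Referer"),
        # User-Agent typically identifies both browser and operating system.
        "browser_and_os": headers.get("User-Agent"),
    }
```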

23
Web Bugs
  • Web bugs are embedded objects that are loaded
    from a third-party source
  • Web bugs can track your movements on the web
  • Each time a graphic or other object is called
    from a web bug site by a web page, the Browser
    information is retrieved
  • A source, such as an advertiser can keep track of
    each user or computer that loads the image
  • Advertisers serving a significant number of the
    popular web sites will have access to your
    movements about the web
  • Much of this information is available from Web
    Server log files
  • But advertisers would need the servers' log files
  • And this is easier and automated
  • A site that collects voluntary information may
    share it with affiliates

24
Web Bugs
  • Examples of web bugs (HTML Source Code)
  • Generic example
  • <IMG SRC=http://ad.doubleclick.net/activity;src=328142;type=mmti;cat=invstr;ord=<Time>?
    WIDTH=1 HEIGHT=1 BORDER=0>
  • www.investorplace.com
  • <IMG SRC="http://ad.doubleclick.net/ad/investorplace.com/;sz=127x155;tile=3;ord=982372051"
    border=0 height="155" width="127"></A>
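
A page can be scanned for images like the generic example above. The sketch below uses Python's standard `html.parser` module and flags IMG tags that are both 1x1 and loaded from a host other than the page's own. That size-plus-host rule is a heuristic chosen for this illustration, not a definition: the 127x155 banner above would not be flagged even though it reports the same browser data. The names `WebBugFinder` and `find_web_bugs` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collect the SRC of 1x1 images served from a third-party host."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        a = dict(attrs)
        host = urlparse(a.get("src") or "").hostname
        # Relative URLs have no hostname and are treated as first-party.
        third_party = host is not None and host != self.page_host
        tiny = a.get("width") == "1" and a.get("height") == "1"
        if third_party and tiny:
            self.suspects.append(a["src"])


def find_web_bugs(html, page_host):
    """Return the SRC URLs of suspected web bugs in an HTML string."""
    finder = WebBugFinder(page_host)
    finder.feed(html)
    return finder.suspects
```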

25
Web Bugs
[Diagram: browser (B), web server WS-1, third-party web server WS-2]
  • B to WS-1: Request for Service; Personal Data
  • WS-1 to B: Service; Reference (SRC=WS-2); WS-1 Cookie
  • WS-1 to WS-2 (?): WS-1 Cookie data? Other data
  • B to WS-2: Request for graphic; Browser Data;
    Tracking Information; Third-party cookie?
  • WS-2 to B: Graphic; WS-2 Cookie

26
Web Bugs
  • Web Bugs can be in any type of document
  • Web pages
  • Email
  • Any document that can have a third-party reference

27
Spyware
  • What is Spyware?
  • Spyware is any software which employs a user's
    Internet connection in the background without the
    user's knowledge or explicit permission
  • Steve Gibson

28
Spyware
  • It is classified as Spyware if it
  • is not explained in advance to the user
  • gathers information not expressly required for a
    requested service
  • introduces insecurities (e.g., Aureate promotes
    their ability to secretly download and execute
    third-party programs)
  • uses data in ways that are not clearly defined in
    the Privacy Statement
  • does not register with the Operating System and
    is not removable with the Add/Remove facility

29
Spyware
Radiate's Privacy Disclosure: "Radiate has
developed and distributed technology which allows
Radiate to send advertising to computers which
have downloaded Radiate's technology as part of
larger programs. Radiate's technology exchanges
information with a user's computer, and as part
of this exchange, Radiate collects certain,
nonspecific data about its users and aggregates
its data. Radiate is very concerned with the
privacy of its users and, therefore, never sells
data it collects, never collects data uniquely
identifiable without the user's knowledge, and
never combines user-specific data with the
generic data it collects. Radiate has taken
steps to ensure that the owners and distributors
of the shareware programs containing Radiate's
codes provide full disclosure of the functioning
of Radiate's codes."
30
Juno's Service Agreement!
  • You expressly permit and authorize Juno (ISP) to
  • (i) download to your computer one or more pieces
    of software designed to perform computations,
    which may be unrelated to the operation of the
    Service
  • (ii) run the Software on your computer to perform
    and store the results of such computations
  • (iii) upload such results to Juno's computers
    during a subsequent connection, whether initiated
    by you in the course of using the Service or by
    the Software
  • ... you agree not to take any action to disable
    or interfere with the operation of the Software
  • you agree to run the computer continuously and
    to pay any phone charges associated with
    uploading results

31
Privacy Statements
  • Privacy Statements
  • explain a site's general data collection scenarios
  • can be changed at will
  • are non-binding
  • can be abandoned
  • When advertising revenue is down
  • When a dot-com goes out of business
  • do not reveal the use of web bugs
  • never reveal that personal information is sold
    for profit
  • sometimes reveal that user data is collected
    and shared and is not secure
  • TRUSTe is an industry group that recommends
    content of Privacy Statements but does not
    discourage the collection and sale of personal
    data
  • Good privacy policy statements are short and
    don't start with
  • "Your privacy is important to us"

32
Purpose of Data Collection
  • Involuntary personal data is collected in order
    to
  • Profile individual users and track movements
    about the Internet
  • Improve site content
  • Provide customized services
  • Billing and shipping
  • Target advertising
  • Develop a marketable personal information
    database
  • Personal preferences
  • Demographic data
  • Profit in a competitive environment
  • Primary Internet business is often not profitable
  • Provide a free service and in the process
  • collect and sell personal data

33
Solutions
  • The easy stuff
  • Open a junk email account (e.g., Hotmail)
  • Install and configure a Layer 7 firewall (e.g.,
    ZoneAlarm)
  • programs phoning home
  • Manage your cookies (e.g., 12Ghosts)
  • and your Index.dat file
  • Scan for spyware (e.g., Ad-aware)
  • Clean your registry (e.g., Jouni Vuorio's RegEdit)
  • Don't read spam
  • Don't click on banner ads
  • Visit newsletter sites
  • rather than receiving email newsletters

34
Solutions
  • Taking it a step further
  • Surf through a proxy service (e.g., websafe)
  • Install a router (e.g., Linksys, Netgear)
  • Hardware Layer 3 firewall
  • Anonymous IP address
  • Also keeps your DSL or cable modem up