Connected Social Network - PowerPoint PPT Presentation

About This Presentation
Title:

Connected Social Network

Description:

by NINI P SURESH PROJECT CO-ORDINATOR Kavitha Murugeshan OUTLINE Introduction Data mining Vs Web mining Web mining subtasks Challenges Taxonomy Web content mining Web ... – PowerPoint PPT presentation

Slides: 32
Provided by: wwwpowerpo106

Transcript and Presenter's Notes

Title: Connected Social Network


1
WEB MINING
by NINI P SURESH
PROJECT CO-ORDINATOR: Kavitha Murugeshan
2
OUTLINE
  • Introduction
  • Data mining Vs Web mining
  • Web mining subtasks
  • Challenges
  • Taxonomy
  • Web content mining
  • Web structure mining
  • Web usage mining
  • Applications

3
INTRODUCTION
Users increasingly need automated tools to find, extract, filter,
and evaluate the information resources they want. Search engines,
by contrast, aim only to discover resources on the web.
4
INTRODUCTION
  • Needs for Web Mining
  • Narrowly searching scope
  • Low precision

5
INTRODUCTION
  • Other Approaches
  • Database approach (DB)
  • Information retrieval
  • Natural language processing (NLP)
  • Web document community

6
WEB MINING DEFINITION
Web mining refers to the overall process of
discovering potentially useful and previously
unknown information or knowledge from the Web
data.
7
DATA MINING VS WEB MINING
  • Data mining: extraction of useful patterns from data sources
    such as databases, texts, the web, images, etc.
  • Web mining: extraction of relevant information hidden in
    Web-related data, such as hypertext documents on the web

8
WEB MINING SUBTASKS
  • Resource finding
  • Information selection and preprocessing
  • Generalization
  • Analysis

9
CHALLENGES
  • Search relevant information on web
  • Create knowledge
  • Personalization of Information
  • Learn patterns
  • Uniformity and standardisation

10
CHALLENGES
  • Redundant Information
  • Noisy web
  • Monitoring changes
  • Sites providing Services
  • Privacy

11
TAXONOMY
Web Mining
  • Web Structure Mining
    • Link Mining
    • URL Mining
    • Internal Structure Mining
  • Web Content Mining
    • Web Text Mining
    • Web Multimedia Mining
  • Web Usage Mining
    • Personalized Usages Track
    • Gen. Access Pattern Track
12
WEB CONTENT MINING
  • Discovers useful information by analysing the content
    of web documents
  • Automatic process beyond keyword extraction
  • Approaches to restructure document content
  • Two groups of mining strategies

13
WEB CONTENT MINING
  • Agent based Approach
  • Intelligent search agents
  • Information filtering/categorization
  • Personalized web agents

14
WEB CONTENT MINING
  • Database Approach
  • Multilevel databases
  • Web query system

15
WEB STRUCTURE MINING
  • Discovering structure information from web
  • Web graph: web pages as nodes, hyperlinks as
    edges
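
The web graph described above can be sketched as a pair of adjacency maps. This is a minimal illustration, not the presenter's implementation; the function name `build_web_graph` and the page names are hypothetical.

```python
from collections import defaultdict


def build_web_graph(links):
    """Build out-link and in-link adjacency sets from (source, target) page pairs."""
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for source, target in links:
        out_links[source].add(target)
        in_links[target].add(source)
        # Touch both endpoints so pages with no out-links or no in-links
        # still appear as nodes in both maps.
        out_links[target]
        in_links[source]
    return dict(out_links), dict(in_links)


# Hypothetical pages, for illustration only.
links = [("a.html", "b.html"), ("a.html", "c.html"), ("b.html", "c.html")]
out_links, in_links = build_web_graph(links)
print(sorted(out_links["a.html"]))  # pages that a.html links to
print(sorted(in_links["c.html"]))   # pages that link to c.html
```

Keeping both directions makes it cheap to ask either "what does this page link to" or "who links to this page", which link-analysis algorithms need.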

16
WEB STRUCTURE MINING
  • Two algorithms for handling of links
  • PageRank
  • HITS

17
WEB STRUCTURE MINING
  • PageRank
  • Metric for ranking hypertext documents
  • Depends on the rank of the pages pointing to it
  • Iterative process

18
WEB STRUCTURE MINING
  • n: number of nodes in the graph
  • Outdegree(q): number of hyperlinks on page q
  • d: damping factor
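
The formula on this slide did not survive in the transcript, so the sketch below assumes the commonly used formulation PR(p) = (1 - d)/n + d * Σ PR(q)/Outdegree(q), summed over the pages q that link to p. The default d = 0.85, the fixed iteration count, and the handling of pages without out-links are assumptions of this example, not taken from the slides.

```python
def pagerank(out_links, d=0.85, iterations=50):
    """Iteratively compute PageRank for a graph given as {page: set of linked pages}.

    Every link target must also be a key of out_links.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {p: 1.0 / n for p in nodes}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread evenly over all pages
        # (an assumption of this sketch) so the ranks keep summing to 1.
        dangling = sum(rank[q] for q in nodes if not out_links[q])
        new_rank = {p: (1.0 - d) / n + d * dangling / n for p in nodes}
        for q in nodes:
            for p in out_links[q]:
                # Page q passes an equal share of its rank to each page it links to.
                new_rank[p] += d * rank[q] / len(out_links[q])
        rank = new_rank
    return rank


# Hypothetical three-page graph.
graph = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
ranks = pagerank(graph)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Page "c" receives links from both other pages, so it ends up ranked above "b", which is linked only from "a".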

19
WEB STRUCTURE MINING
  • HITS
  • Iterative algorithm
  • Identifies topic hubs and authorities
  • Input: search results returned by a traditional
    text indexing technique

20
WEB STRUCTURE MINING
  • Assigns a weight to each hub based on the authoritativeness
    of the pages it points to
  • Outputs the pages with the largest hub and authority
    weights
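
The hub and authority updates can be sketched as below. This is a minimal version of the usual HITS iteration, assuming the input graph is the link structure among the pages returned by the text search; the function name `hits`, the iteration count, and the sample pages are illustrative.

```python
import math


def hits(out_links, iterations=50):
    """Compute hub and authority weights for a graph {page: set of linked pages}.

    Every link target must also be a key of out_links.
    """
    nodes = list(out_links)
    hub = {p: 1.0 for p in nodes}
    auth = {p: 1.0 for p in nodes}
    for _ in range(iterations):
        # Authority weight: sum of the hub weights of the pages pointing to it.
        auth = {p: 0.0 for p in nodes}
        for q in nodes:
            for p in out_links[q]:
                auth[p] += hub[q]
        # Hub weight: sum of the authority weights of the pages it points to.
        hub = {q: sum(auth[p] for p in out_links[q]) for q in nodes}
        # Normalise both weight vectors so the values do not grow without bound.
        for weights in (auth, hub):
            norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
            for p in weights:
                weights[p] /= norm
    return hub, auth


# Hypothetical result set: h1 and h2 are link pages, x and y are content pages.
graph = {"h1": {"x", "y"}, "h2": {"x"}, "x": set(), "y": set()}
hub, auth = hits(graph)
print(max(hub, key=hub.get), max(auth, key=auth.get))
```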

21
WEB USAGE MINING
  • Extracting information from server logs
  • Discover user access patterns of Web pages
  • Decomposed into 3 subtasks

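[Figure: the web usage mining process. Site files and raw logs feed
Preprocessing, which yields the user session file; Mining algorithms
turn it into rules, patterns, and statistics; Pattern Analysis keeps
the interesting rules, patterns, and statistics.]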
22
WEB USAGE MINING
  • Preprocessing
  • Data cleaning
  • User identification
  • User sessions identification
  • Access path supplement
  • Transaction identification
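
As one illustration of the preprocessing steps above, user session identification can be sketched with a simple inactivity timeout. The 30-minute threshold is a common heuristic assumed here, not a value from the slides, and the sketch assumes data cleaning and user identification have already produced (user, timestamp, url) records.

```python
from collections import defaultdict


def identify_sessions(requests, timeout=1800):
    """Split each user's requests into sessions.

    requests: iterable of (user, timestamp_in_seconds, url) tuples.
    A new session starts when the gap between two consecutive requests
    of the same user exceeds `timeout` seconds.
    """
    by_user = defaultdict(list)
    for user, timestamp, url in requests:
        by_user[user].append((timestamp, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0][1]]
        for (prev_time, _), (time, url) in zip(hits, hits[1:]):
            if time - prev_time > timeout:
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions


# Hypothetical cleaned log records.
requests = [("u1", 0, "/"), ("u1", 60, "/a"), ("u1", 4000, "/b"), ("u2", 10, "/")]
sessions = identify_sessions(requests)
for user, pages in sessions:
    print(user, pages)
```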

23
WEB USAGE MINING
  • Pattern discovery
  • Statistical Analysis
  • Association Rules
  • Clustering analysis

24
WEB USAGE MINING
  • Classification analysis
  • Sequential Pattern
  • Dependency Modeling
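
Of the pattern discovery techniques listed, association rules are the easiest to show in a few lines. The sketch below only considers rules between pairs of pages and uses support and confidence thresholds; the function name `pairwise_rules`, the thresholds, and the sample sessions are assumptions for illustration, and a real system would typically use an algorithm such as Apriori to handle larger itemsets.

```python
from itertools import permutations


def pairwise_rules(sessions, min_support=0.5, min_confidence=0.7):
    """Find rules A -> B ("sessions that visit A also visit B").

    Returns a list of (A, B, support, confidence) tuples, where
    support = share of all sessions containing both A and B, and
    confidence = share of sessions containing A that also contain B.
    """
    total = len(sessions)
    page_sets = [set(s) for s in sessions]
    pages = sorted(set().union(*page_sets))
    rules = []
    for a, b in permutations(pages, 2):
        count_a = sum(1 for s in page_sets if a in s)
        count_ab = sum(1 for s in page_sets if a in s and b in s)
        support = count_ab / total
        confidence = count_ab / count_a
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Hypothetical user sessions (lists of visited pages).
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
rules = pairwise_rules(sessions)
print(rules)
```

The thresholds play the role of a first pattern-analysis filter: rules that are rare or unreliable are dropped before anyone has to inspect them.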

25
WEB USAGE MINING
  • Pattern Analysis
  • Eliminates irrelevant rules or patterns
  • Extracts interesting patterns

26
APPLICATIONS
  • Personalized Services
  • Improve website design
  • System Improvement
  • Predicting trends
  • Carry out intelligent business

27
PROS
  • High trade volumes
  • Classify threats and fight against terrorism
  • Establish better customer relationship
  • Increase profitability

28
CONS
  • Invasion of Privacy
  • Discrimination by controversial attributes

29
CONCLUSION
  • Rapidly growing area
  • Promising area of future research

30
REFERENCE

1. http://en.wikipedia.org/wiki/Web_mining
2. http://www.galeas.de/webimining.html
3. Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan, "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data", SIGKDD Explorations, ACM SIGKDD, Jan 2000.
4. Miguel Gomes da Costa Júnior, Zhiguo Gong, "Web Structure Mining: An Introduction", Proceedings of the 2005 IEEE International Conference on Information Acquisition.
5. R. Cooley, B. Mobasher, J. Srivastava, "Web Mining: Information and Pattern Discovery on the World Wide Web", ICTAI '97.
6. Brijendra Singh, Hemant Kumar Singh, "Web Data Mining Research: A Survey", IEEE, 2010.
7. Soumen Chakrabarti, "Mining the Web: Discovering Knowledge from Hypertext Data", Part 2, 2003 edition.
8. Anthony Scime, "Web Mining: Applications and Techniques".
31
WEB MINING
Thank You