Title: Connected Social Network
1 WEB MINING by NINI P SURESH
PROJECT CO-ORDINATOR: Kavitha Murugeshan
2 OUTLINE
- Introduction
- Data mining Vs Web mining
- Web mining subtasks
- Challenges
- Taxonomy
- Web content mining
- Web structure mining
- Web usage mining
- Applications
3 INTRODUCTION
Users now need automated tools to find, extract,
filter, and evaluate the information resources
they want. Search engines only aim to discover
resources on the web.
4 INTRODUCTION
- Need for Web Mining
- Narrow search scope
- Low precision
5 INTRODUCTION
- Other Approaches
- Database approach (DB)
- Information retrieval
- Natural language processing (NLP)
- Web document community
6 WEB MINING DEFINITION
Web mining refers to the overall process of
discovering potentially useful and previously
unknown information or knowledge from the Web
data.
7 DATA MINING VS WEB MINING
- Data mining: extraction of useful patterns from
data sources such as databases, texts, the web, and images
- Web mining: extraction of relevant information hidden
in Web-related data, such as hypertext documents on the web
8 WEB MINING SUBTASKS
- Resource finding
- Information selection and preprocessing
- Generalization
- Analysis
9 CHALLENGES
- Searching for relevant information on the web
- Creating knowledge
- Personalization of information
- Learning patterns
- Uniformity and standardisation
10 CHALLENGES
- Redundant Information
- Noisy web
- Monitoring changes
- Sites providing Services
- Privacy
11 TAXONOMY
- Web Mining
  - Web Structure Mining
    - Link Mining
    - URL Mining
    - Internal Structure Mining
  - Web Content Mining
    - Web Text Mining
    - Web Multimedia Mining
  - Web Usage Mining
    - Personalized Usage Tracking
    - General Access Pattern Tracking
12 WEB CONTENT MINING
- Discovers useful information by analysing page content
- Automatic process that goes beyond keyword extraction
- Approaches to restructure document content
- Two groups of mining strategies
13 WEB CONTENT MINING
- Agent-based Approach
- Intelligent search agents
- Information filtering/categorization
- Personalized web agents
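The information filtering idea can be illustrated with a small sketch. It assumes a user interest profile given as plain text and scores each document by the cosine similarity of raw term frequencies; the function names, the tokenizer, and the 0.2 threshold are illustrative choices, not something the slides specify.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_documents(profile_text, documents, threshold=0.2):
    """Keep documents similar enough to the profile, best match first."""
    profile = term_frequencies(profile_text)
    scored = [(cosine_similarity(profile, term_frequencies(d)), d)
              for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored if score >= threshold]
```

A real agent would typically add stop-word removal and TF-IDF weighting; raw counts are used here only to keep the sketch short.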
14 WEB CONTENT MINING
- Database Approach
- Multilevel databases
- Web query system
15 WEB STRUCTURE MINING
- Discovering structure information from the web
- Web graph: web pages as nodes, hyperlinks as edges
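As a minimal sketch of the web graph, the hyperlinks can be held as adjacency sets keyed by page. The input format, a list of (source page, target page) pairs, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def build_web_graph(links):
    """Build out-link and in-link sets from (source, target) hyperlink pairs."""
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for source, target in links:
        out_links[source].add(target)
        in_links[target].add(source)
        # Register both pages as nodes in both maps, even when a page
        # has no outgoing (or no incoming) hyperlinks.
        out_links.setdefault(target, set())
        in_links.setdefault(source, set())
    return dict(out_links), dict(in_links)
```

Keeping both directions makes it cheap to ask "which pages does q link to" and "which pages point to p", which link-based ranking algorithms need.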
16 WEB STRUCTURE MINING
- Two algorithms for handling links
- PageRank
- HITS
17 WEB STRUCTURE MINING
- PageRank
- Metric for ranking hypertext documents
- Depends on the rank of the pages pointing to it
- Iterative process
18 WEB STRUCTURE MINING
Symbols in the PageRank formula:
- n: number of nodes in the graph
- Outdegree(q): number of hyperlinks on page q
- d: damping factor
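The iterative computation can be sketched as follows. The formula on the original slide is not preserved here, so this sketch assumes the widely used form PR(p) = (1 - d)/n + d * sum of PR(q)/Outdegree(q) over all pages q linking to p, with d as the probability of following a link; some sources swap the roles of d and 1 - d. The function name, the default d = 0.85, the fixed iteration count, and the handling of pages without out-links are illustrative assumptions.

```python
def pagerank(out_links, d=0.85, iterations=50):
    """Iterative PageRank over a dict mapping each page to the set of pages it links to.

    Assumes every link target also appears as a key of out_links.
    """
    pages = list(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the (1 - d) / n share.
        new_rank = {p: (1.0 - d) / n for p in pages}
        for q in pages:
            targets = out_links[q]
            if targets:
                # q passes its rank on in equal parts: PR(q) / Outdegree(q).
                share = d * rank[q] / len(targets)
                for p in targets:
                    new_rank[p] += share
            else:
                # Assumption: a page without hyperlinks spreads its rank evenly.
                share = d * rank[q] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank
```

For example, `pagerank({"a": {"b", "c"}, "b": {"c"}, "c": {"a"}})` gives page c a higher rank than page b, because c is pointed to by both a and b.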
19 WEB STRUCTURE MINING
- HITS
- Iterative algorithm
- Identifies topic hubs and authorities
- Input: search results returned by a traditional
text indexing technique
20 WEB STRUCTURE MINING
- Assigns a weight to each hub based on the
authoritativeness of the pages it links to
- Outputs the pages with the largest hub and
authority weights
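A minimal sketch of the hub and authority iteration is given below. It assumes the search results have already been turned into a link graph (a dict mapping each page to the set of pages it links to, with every target also present as a key); the function name, the fixed iteration count, and the Euclidean normalisation are illustrative choices.

```python
import math


def normalise(scores):
    """Scale scores so that their squares sum to 1 (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {p: v / norm for p, v in scores.items()}


def hits(out_links, iterations=50):
    """Return (hub, authority) weights for the pages in out_links."""
    pages = list(out_links)
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority weight: sum of the hub weights of the pages linking to it.
        auth = {p: 0.0 for p in pages}
        for q in pages:
            for p in out_links[q]:
                auth[p] += hub[q]
        auth = normalise(auth)
        # Hub weight: sum of the authority weights of the pages it links to.
        hub = normalise({q: sum(auth[p] for p in out_links[q]) for q in pages})
    return hub, auth
```

Sorting the pages by either weight and taking the top few gives the hubs and authorities to output.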
21 WEB USAGE MINING
- Extracting information from server logs
- Discover user access patterns of Web pages
- Decomposed into 3 subtasks: preprocessing, pattern discovery, and pattern analysis
Raw logs + Site files -> Preprocessing -> User session file -> Mining algorithms -> Rules, patterns, statistics -> Pattern analysis -> Interesting rules, patterns, statistics
22 WEB USAGE MINING
- Preprocessing
- Data cleaning
- User identification
- User session identification
- Access path supplement
- Transaction identification
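The first three preprocessing steps can be sketched as below. This assumes server logs in Common Log Format, identifies users by host address alone, and starts a new session after 30 minutes of inactivity; the regular expression, the ignored file suffixes, the timeout, and the function names are all illustrative assumptions. Access path supplement and transaction identification are not covered.

```python
import re
from datetime import datetime, timedelta

# Assumed Common Log Format: host ident user [time] "METHOD url PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def parse_and_clean(lines):
    """Data cleaning: keep successful page requests, drop images, styles and errors."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue
        url = match.group("url")
        if url.lower().endswith(IGNORED_SUFFIXES):
            continue
        if not match.group("status").startswith("2"):
            continue
        timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        records.append((match.group("host"), timestamp, url))
    return records


def build_sessions(records, timeout=timedelta(minutes=30)):
    """User and session identification: one user per host, split on long gaps."""
    sessions = []
    last_seen = {}  # host -> (time of last request, current session's page list)
    for host, timestamp, url in sorted(records, key=lambda r: (r[0], r[1])):
        previous = last_seen.get(host)
        if previous is None or timestamp - previous[0] > timeout:
            pages = []
            sessions.append((host, pages))
        else:
            pages = previous[1]
        pages.append(url)
        last_seen[host] = (timestamp, pages)
    return sessions
```

Identifying users by host alone is crude, since proxies and shared machines merge several users; combining the host with the user agent or a cookie is a common refinement.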
23 WEB USAGE MINING
- Pattern discovery
- Statistical Analysis
- Association Rules
- Clustering analysis
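Association rules over user sessions can be illustrated with a small sketch restricted to rules between two pages. It treats each session as a set of visited pages and reports rules "lhs -> rhs" that meet a minimum support and confidence; the function name and both thresholds are illustrative assumptions, and a full miner such as Apriori would also handle larger itemsets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for page pairs seen together in sessions."""
    n = len(sessions)
    page_sets = [set(s) for s in sessions]
    page_counts = Counter(page for s in page_sets for page in s)
    pair_counts = Counter(
        pair for s in page_sets for pair in combinations(sorted(s), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        # Support: fraction of sessions that contain both pages.
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: of the sessions containing lhs, the share also containing rhs.
            confidence = count / page_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

A rule such as "/cart -> /products" with confidence 1.0 says that every session that visited the cart also visited the product page.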
24 WEB USAGE MINING
- Classification analysis
- Sequential Patterns
- Dependency Modeling
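Sequential pattern discovery can be sketched in a simplified form: counting how many sessions contain a given run of consecutively visited pages. Restricting patterns to contiguous runs is a simplification chosen for brevity (general sequential pattern mining also allows gaps between pages), and the function name and thresholds are illustrative assumptions.

```python
from collections import Counter


def frequent_sequences(sessions, length=2, min_count=2):
    """Count page runs of the given length that occur in at least min_count sessions."""
    counts = Counter()
    for pages in sessions:
        seen = set()
        for i in range(len(pages) - length + 1):
            sequence = tuple(pages[i:i + length])
            # Count each distinct run at most once per session.
            if sequence not in seen:
                seen.add(sequence)
                counts[sequence] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}
```

Unlike association rules, order matters here: a visit to /home followed by /products is a different pattern from /products followed by /home.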
25 WEB USAGE MINING
- Pattern Analysis
- Eliminates irrelevant rules or patterns
- Extracts interesting patterns
26 APPLICATIONS
- Personalized Services
- Improve website design
- System Improvement
- Predicting trends
- Carry out intelligent business
27 PROS
- High trade volumes
- Classify threats and fight against terrorism
- Establish better customer relationship
- Increase profitability
28 CONS
- Invasion of Privacy
- Discrimination by controversial attributes
29 CONCLUSION
- Rapidly growing area
- Promising area of future research