Title: UH Showcase Poster
1. Software agent for locating and analyzing virtual
communities on the World Wide Web
PI / Co-PI: Sun-Ki Chai, David Chin, Scott Robertson
Graduate Assistants: Kar-Hai Chu, Aaron Herres,
Dong-Wan Kang
United States Patent 7499965 by Sun-Ki Chai
Sponsored by AFOR
We are searching for communities, not individual
pages
Algorithm
The search uses a patented crawling algorithm
(United States Patent 7499965) based on social
science network theories and content analysis.
It automatically locates a virtual community of
interest to the user, identifies key sites within
the community, and determines the content patterns
that characterize the communications of those key
sites.
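
As a rough illustration of the "locate a community, then identify key sites" idea, the sketch below expands outward from seed sites over a link graph and ranks members by how many other members link to them. This is not the patented algorithm, whose details the poster does not give. The function names, the dictionary-based link graph, and the use of in-link counts as a measure of importance are all assumptions made for the example.

```python
from collections import Counter, deque


def crawl_community(links, seeds, max_depth=2):
    """Breadth-first expansion from seed sites.

    `links` is a hypothetical link graph: site -> list of sites it links to.
    Returns every site reachable from the seeds within `max_depth` hops.
    """
    seen = set(seeds)
    queue = deque((site, 0) for site in seen)
    while queue:
        site, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for target in links.get(site, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


def key_sites(links, members, top_n=3):
    """Rank community members by in-links received from other members.

    In-link count is a simple stand-in for importance, assumed here
    only for illustration.
    """
    indegree = Counter()
    for source in members:
        for target in links.get(source, []):
            if target in members and target != source:
                indegree[target] += 1
    return indegree.most_common(top_n)


# Toy link graph (invented data).
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c", "e"], "x": ["y"]}
members = crawl_community(links, ["a"])
print(sorted(members))
print(key_sites(links, members, top_n=1))
```

A real crawl would also need content-based filtering to decide which linked pages belong to the community; the sketch uses link structure only.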
Figure: links connecting pages within distinct
communities
Possible Applications
- Link-based analysis
- Content-based analysis
- Finding which combination of content, network,
  and offline characteristics is most significant
  in determining a site's popularity and influence
- Determining the characteristics of the community
  that uses a particular online resource
- Identifying opinion leaders and predicting
  their actions
- Developing a reverse search engine that
  identifies the key words and phrases that most
  distinctively identify a particular virtual
  community
- Determining the attitudes and characteristics of
  real-world communities by analyzing their
  corresponding virtual communities
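
The reverse search engine idea in the list above can be sketched as a comparison of word frequencies: terms that are much more common in a community's pages than in a background sample are candidates for its distinctive key words. The poster does not say how distinctiveness is scored, so the smoothed frequency ratio below, and the names `term_counts` and `distinctive_terms`, are assumptions for illustration only.

```python
import re
from collections import Counter


def term_counts(documents):
    """Count lowercase word tokens across a list of text documents."""
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def distinctive_terms(community_docs, background_docs, top_n=5):
    """Return the community terms with the highest smoothed frequency
    ratio against the background sample (an assumed scoring rule)."""
    community = term_counts(community_docs)
    background = term_counts(background_docs)
    vocabulary = set(community) | set(background)
    # Add-one smoothing so terms absent from the background do not divide by zero.
    community_total = sum(community.values()) + len(vocabulary)
    background_total = sum(background.values()) + len(vocabulary)
    scores = {}
    for term in community:
        p_community = (community[term] + 1) / community_total
        p_background = (background[term] + 1) / background_total
        scores[term] = p_community / p_background
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Invented example texts.
community_docs = ["surf report waves surf", "surf forecast waves"]
background_docs = ["weather report forecast", "news report today"]
print(distinctive_terms(community_docs, background_docs, top_n=2))
```

Phrases could be handled the same way by counting word pairs instead of single words.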
Software
Our software builds on the successful open source
crawler Nutch. We use Nutch's crawling and
indexing in conjunction with our customized data
structures and site management to efficiently
find relevant community members.
2. Estimating Groupness of Virtual Communities
Customize Search
- User-Defined Groupness measures
- Combine content and link analysis
Given a virtual community, how can we estimate
its groupness? Through community metrics, content
analysis, and analysis of membership networks.
- Community Metrics
  - Mean Reply Depth (activity level)
  - First response time / Answer rate
  - Non-lurker Rate (participation)
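
The community metrics listed above can be sketched for a small discussion forum as follows. The poster does not give formal definitions, so the ones used here are assumptions: reply depth is the number of hops from a reply back to its thread starter, answer rate is the share of threads that receive at least one direct reply, first response time is the delay until the earliest direct reply, and non-lurker rate is the share of members who have posted at least once. The post fields (`id`, `parent`, `author`, `time`) are hypothetical.

```python
from statistics import mean


def reply_depths(posts):
    """Depth of each reply; thread starters (parent is None) are excluded."""
    by_id = {p["id"]: p for p in posts}

    def depth(post):
        hops = 0
        while post["parent"] is not None:
            post = by_id[post["parent"]]
            hops += 1
        return hops

    return [depth(p) for p in posts if p["parent"] is not None]


def community_metrics(posts, member_count):
    """Compute the four metrics under the assumed definitions above."""
    by_id = {p["id"]: p for p in posts}
    starters = {p["id"] for p in posts if p["parent"] is None}
    first_response = {}  # thread starter id -> delay until earliest direct reply
    for p in posts:
        if p["parent"] in starters:
            delay = p["time"] - by_id[p["parent"]]["time"]
            previous = first_response.get(p["parent"], delay)
            first_response[p["parent"]] = min(delay, previous)
    depths = reply_depths(posts)
    return {
        "mean_reply_depth": mean(depths) if depths else 0.0,
        "answer_rate": len(first_response) / len(starters) if starters else 0.0,
        "mean_first_response_time": (
            mean(list(first_response.values())) if first_response else None
        ),
        "non_lurker_rate": len({p["author"] for p in posts}) / member_count,
    }


# Invented forum data; times are in minutes.
posts = [
    {"id": 1, "parent": None, "author": "ann", "time": 0},
    {"id": 2, "parent": 1, "author": "bob", "time": 10},
    {"id": 3, "parent": 2, "author": "ann", "time": 15},
    {"id": 4, "parent": None, "author": "cy", "time": 20},
    {"id": 5, "parent": 1, "author": "bob", "time": 30},
]
print(community_metrics(posts, member_count=6))
```

Here `member_count` is the number of registered members, including those who never post, and has to come from outside the post data.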
Communities Found: Watch as communities are
discovered in real time
- Content Analysis
  - Pronoun usage (1st / 3rd person plural pronouns)
- Analysis of Membership Networks
  - Centrality Measurement
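
The content and network measures above can be illustrated with two small functions. The first computes the share of first-person plural pronouns ("we") among all first- and third-person plural pronouns ("we" plus "they") in a set of messages. The second computes normalized degree centrality on an undirected membership network. Both are assumptions about how these measures might be computed: the poster names pronoun usage and centrality but does not specify the word lists, the ratio, or the type of centrality.

```python
import re
from collections import defaultdict

# Assumed word lists for the two pronoun classes.
FIRST_PLURAL = {"we", "us", "our", "ours", "ourselves"}
THIRD_PLURAL = {"they", "them", "their", "theirs", "themselves"}


def pronoun_share(texts):
    """Share of first-person plural pronouns among first- and
    third-person plural pronouns; None if neither class occurs."""
    words = [w for text in texts for w in re.findall(r"[a-z]+", text.lower())]
    first = sum(w in FIRST_PLURAL for w in words)
    third = sum(w in THIRD_PLURAL for w in words)
    total = first + third
    return first / total if total else None


def degree_centrality(edges):
    """Normalized degree centrality: neighbors / (n - 1), treating
    each (a, b) pair as an undirected tie between two members."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-ties
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Invented example data.
share = pronoun_share(["We love our club.", "They left."])
centrality = degree_centrality([("ann", "bob"), ("ann", "cy"), ("ann", "dee")])
print(share)
print(centrality)
```

In the toy network every tie runs through "ann", so that member receives the maximum centrality score of 1.0.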
Graphic View: Discover and visualize complex
community relationships