Exploring patterns in website content structure

1
Exploring patterns in website content structure
  • Svetlana Symonenko
  • School of Information Studies, Syracuse
    University
  • IA Summit 2006
  • Vancouver, March 25, 2006

2
Project Background (How Come..)
  • looking for patterns
  • self-guided tour preferred..

3
Project background
  • Problem space
  • 16% of the global population went online
    (Internet World Statistics, 2005)
  • 21% of users find the information >80% of the
    time (Feldman, 2004)
  • search: most common mode of finding information,
    but most natural?
  • Current state of research
  • notion of conventionalization of websites'
    content organization, driven by industry
    practices
  • empirical evidence
  • users developed expectations of sites' external
    design and content structure
  • research focus
  • on websites' visual design, with insufficient
    attention to the organization of websites'
    content

4
Scope of the study
  • Signs of conventionalization in the observable
    structure of website content
  • Website - an integrated information package
    created by the site sponsor with a particular
    audience in mind
  • sponsor + audience = site's discourse community,
    shaping the site as their communication artifact
    (genre theory, Orlikowski and Yates, 1994;
    Swales, 1990)
  • Two perspectives / study phases

5
Data Collection & Dataset
  • 15 websites of 3 types
  • universities (EDU) - U.S. colleges
  • business (COM) - telecommunications companies
  • government (GOV) - state portals
  • Perl-based offline browser
  • collected elements of sites' content structure
  • Page Title, Link URL, Link Label, Level
  • traversed sites in a breadth-first manner
  • restrictions applied
  • top 3 levels only
  • within the original domain
  • excluded links to dynamically generated pages
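The collection procedure on this slide could be sketched roughly as follows. This is a minimal Python illustration, not the Perl-based offline browser used in the study. Several points are assumptions: the `get_links` callback, the `is_dynamic` heuristic (query string or script extension), counting the home page as level 1, and recording off-domain links without following them. The tiny `SITE` dictionary is made-up data standing in for real pages.

```python
from collections import deque
from urllib.parse import urlparse


def is_dynamic(url):
    """Assumed heuristic: a query string or script extension marks a dynamic page."""
    parsed = urlparse(url)
    scripts = (".php", ".asp", ".cgi", ".jsp")
    return bool(parsed.query) or parsed.path.lower().endswith(scripts)


def crawl(start_url, get_links, max_level=3):
    """Breadth-first traversal recording (page_title, link_url, link_label, level).

    get_links(url) is a hypothetical callback returning
    (page_title, [(link_url, link_label), ...]) for one page.
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 1)])  # assumption: the home page is level 1
    records = []
    while queue:
        url, level = queue.popleft()
        title, links = get_links(url)
        for link_url, label in links:
            if is_dynamic(link_url):
                continue  # links to dynamically generated pages are excluded
            records.append((title, link_url, label, level))
            if level >= max_level:
                continue  # top levels only: do not descend further
            if urlparse(link_url).netloc != domain:
                continue  # stay within the original domain
            if link_url in seen:
                continue  # each page is visited once
            seen.add(link_url)
            queue.append((link_url, level + 1))
    return records


# Made-up site used only to exercise the sketch.
SITE = {
    "http://example.edu/": ("Home", [
        ("http://example.edu/about", "About"),
        ("http://other.org/", "Partner"),
        ("http://example.edu/search?q=x", "Search"),
    ]),
    "http://example.edu/about": ("About", [
        ("http://example.edu/", "Home"),
        ("http://example.edu/history", "History"),
    ]),
    "http://example.edu/history": ("History", [
        ("http://example.edu/deep", "Deep"),
    ]),
    "http://example.edu/deep": ("Deep", [
        ("http://example.edu/deeper", "Deeper"),
    ]),
}

records = crawl("http://example.edu/", lambda url: SITE.get(url, ("", [])))
for record in records:
    print(record)
```

A queue (rather than recursion) is what makes the traversal breadth-first, so every page at one level is processed before any page at the next.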

6
Data Collection & Dataset
  • Resulting dataset

7
Data Analysis
  • units of analysis - Page Title, Link URL, Link
    Label, Level
  • analytic induction approach
  • focus on distribution of content categories
  • at different levels within the site
  • for particular kinds of links (navigational) and
    pages (Home, About)
  • conventionalization degrees (Nielsen, 2004)
    applied to content categories
  • standard category - present on >80% of websites
  • conventional category - on 50-79% of websites
  • unconventional category - on <49% of websites
  • also looked at naming conventions for particular
    categories
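The conventionalization degrees above can be illustrated with a small Python sketch. The thresholds come from the slide; how the boundary values (exactly 80%, and the gap between 49% and 50%) are handled is an assumption made here, as are the function names and the sample data.

```python
from collections import Counter


def conventionalization(share):
    """Map the share of sites carrying a category (0.0-1.0) to a degree.

    Assumption: boundaries are closed from below (>= 0.80, >= 0.50),
    since the slide does not say how exact boundary values are treated.
    """
    if share >= 0.80:
        return "standard"
    if share >= 0.50:
        return "conventional"
    return "unconventional"


def classify_categories(site_categories):
    """site_categories maps a site name to the set of categories found on it."""
    total = len(site_categories)
    counts = Counter(
        category
        for categories in site_categories.values()
        for category in set(categories)
    )
    return {
        category: conventionalization(count / total)
        for category, count in counts.items()
    }


# Hypothetical data for five sites, for illustration only.
sample = {
    "site1": {"About", "Careers", "Blog"},
    "site2": {"About", "Careers"},
    "site3": {"About", "Careers"},
    "site4": {"About"},
    "site5": {"About"},
}

degrees = classify_categories(sample)
print(degrees["About"])    # standard (5 of 5 sites)
print(degrees["Careers"])  # conventional (3 of 5 sites)
print(degrees["Blog"])     # unconventional (1 of 5 sites)
```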

8
Results: genre-related patterns in the sites'
content structure?
  • Navigational links
  • about 1/5 of all links, regardless of the site type
  • Home is the most common link across all site types
  • genre-specific importance of some categories

9
More patterns: Homepage patterns
  • EDU and GOV vs. COM >> diverse content
  • universal and type-dependent categories

10
Content structure of selected pages: About, EDU
vs. COM
  • pages of the same kind, but on sites of different
    types
  • About page, standard categories

11
Visit page: EDU vs. GOV, standard categories

12
Type-dependent profile of site content
structure: EDU

13
Category-specific lexicons: About
  • About
  • About Us
  • About Entity_Name

14
Category-specific lexicons: Employment_information
  • Employment (Career, Job) Opportunities
  • Careers
  • Jobs
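One way to apply such lexicons is to map each link label to a content category with a small set of patterns. The sketch below is a hedged Python illustration built only from the label variants listed on these two slides; the `LEXICON` table, the regular expressions, and the `categorize` function are assumptions, not the coding scheme used in the study.

```python
import re

# Patterns derived from the label variants on the slides (an assumption):
#   About: "About", "About Us", "About Entity_Name"
#   Employment_information: "Employment/Career/Job Opportunities", "Careers", "Jobs"
LEXICON = {
    "About": re.compile(r"^about(\s+.+)?$", re.IGNORECASE),
    "Employment_information": re.compile(
        r"^((employment|career|job)\s+opportunities|careers|jobs)$",
        re.IGNORECASE,
    ),
}


def categorize(label):
    """Return the category whose pattern matches the link label, else None."""
    text = " ".join(label.split())  # collapse runs of whitespace
    for category, pattern in LEXICON.items():
        if pattern.match(text):
            return category
    return None


for label in ["About Us", "About Syracuse University", "Job Opportunities",
              "Careers", "Contact"]:
    print(label, "->", categorize(label))
```

Labels that match no pattern return `None`, which leaves them available for manual coding.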

15
Type-related features in overall site architecture
  • EDU and GOV sites - portals to sites of
    individual departments
  • institutional identity at the top 2-3 levels,
    then breaks into kaleidoscope of departments
  • COM sites
  • much more uniform look throughout the site
  • product- or customer-focused architecture
  • very few links leading off the site
  • relatively shallow (at most, 3-4 levels deep)

16
Type-related features in overall site architecture
  • EDU, GOV much more verbose than COM
  • average number of links per page, by site type
  • observations confirmed on a larger data set (123
    EDU, 73 COM)
  • <0.5 MB - 89% of COM sites, 7% of EDU sites
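The "average number of links per page, by site type" measure could be computed along these lines. This is a minimal Python sketch under stated assumptions: the record layout `(site_type, page_url, link_url)` and the sample values are invented for illustration and do not reproduce the study's data.

```python
from collections import defaultdict


def avg_links_per_page(records):
    """Average number of links per page for each site type.

    records: iterable of (site_type, page_url, link_url) tuples.
    Pages with no links do not appear in such records, so they are
    not counted here (a limitation of this sketch).
    """
    pages = defaultdict(set)
    link_counts = defaultdict(int)
    for site_type, page_url, _link_url in records:
        pages[site_type].add(page_url)
        link_counts[site_type] += 1
    return {t: link_counts[t] / len(pages[t]) for t in pages}


# Invented records: two EDU pages with 3 and 1 links, one COM page with 1 link.
sample_links = [
    ("EDU", "edu/home", "edu/about"),
    ("EDU", "edu/home", "edu/visit"),
    ("EDU", "edu/home", "edu/apply"),
    ("EDU", "edu/about", "edu/home"),
    ("COM", "com/home", "com/products"),
]

averages = avg_links_per_page(sample_links)
print(averages)  # {'EDU': 2.0, 'COM': 1.0}
```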

17
Future research
  • Analysis of content structure of a larger sample
    of EDU and COM sites
  • Navigational links: EDU vs. COM
  • [Charts: 5 EDU and 5 COM (Pilot); 28 EDU and 27 COM]

18
Future research
  • Homepage: EDU vs. COM
  • [Charts: 5 EDU and 5 COM (Pilot); 28 EDU and 27 COM]

19
Yet More Future?
  • User Study

20
Thank You!