Reverse Engineering to Achieve Maintainable WWW Sites

Description:

Web is a huge information base and storage medium ... Large web sites exhibit many of the problems of large scale software systems ... – PowerPoint PPT presentation

Slides: 13
Provided by: janetl3
Transcript and Presenter's Notes


1
Reverse Engineering to Achieve Maintainable WWW Sites
  • Cornelia Boldyreff and Richard Kewish
  • RISE
  • Department of Computer Science
  • University of Durham
  • cornelia.boldyreff_at_durham.ac.uk
  • www.dur.ac.uk/cornelia.boldyreff

2
Overview
  • Background
  • Maintenance Problems Associated with the WWW
  • Reverse Engineering of Web Sites
  • Results Obtained
  • So what and where next...

3
Background
  • Web is a huge information base and storage medium
  • Rigorous development and maintenance principles
    have not always been applied to web sites
  • Speed and breadth of its growth
  • Its inherent complexity
  • Support where available is focused on development
    rather than maintenance

4
Maintenance Problems with the WWW
  • Large web sites exhibit many of the problems of
    large scale software systems - as well as
    incorporating legacy systems, the web has its own
    legacy aspects.
  • Specific problems with web sites:
    • broken links
    • incorrect or out-of-date data
    • inconsistent information
    • inconsistent style
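
A minimal sketch of how the first of these problems, broken links, might be detected. The `find_broken_links` helper and the in-memory `pages` dictionary (site-relative path to HTML text) are assumptions made for illustration and are not part of the presentation; only links inside the site are checked, and external links are skipped.

```python
from html.parser import HTMLParser
from posixpath import dirname, join, normpath
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href targets of all <a> tags in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(pages):
    """Return sorted (page, href) pairs whose internal target is missing.

    `pages` maps a site-relative path to its HTML text (hypothetical layout).
    """
    broken = []
    for path, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            parts = urlsplit(href)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link or pure fragment: not checked here
            if parts.path.startswith("/"):
                target = normpath(parts.path.lstrip("/"))  # site-root link
            else:
                target = normpath(join(dirname(path), parts.path))
            if target not in pages:
                broken.append((path, href))
    return sorted(broken)


# Made-up two-page site: "news.html" is linked but does not exist.
pages = {
    "index.html": '<a href="staff/list.html">Staff</a> <a href="news.html">News</a>',
    "staff/list.html": '<a href="../index.html">Home</a> <a href="http://www.dur.ac.uk/">Durham</a>',
}
broken = find_broken_links(pages)
```

Here `broken` holds the single pair `("index.html", "news.html")`. A real checker would also need to fetch external URLs and follow links across servers, which this sketch leaves out.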

5
Root Causes of Problems
  • Forced duplication of files and/or data
    • HTML lacks an include directive - the designer must
      choose between linking pages or copying data
  • The file structure of web sites
    • Directories can be used to modularise a site,
      but this is difficult to achieve as a site grows
      and files become interlinked and distributed over
      servers.
  • HTML blurs the distinction between data and code

6
Reverse Engineering of Web Sites
  • Databases can be used in conjunction with the Web
    • to incorporate existing company data
    • to overcome some of the problems cited, e.g. for
      link storage
  • Considerable redevelopment of existing pages may
    be necessary
  • Database maintenance and web site maintenance
    must be co-ordinated

7
Why and How - RE
  • Older sites are likely to require RE to ensure
    replicated elements within a site are identified,
    abstracted, and moved to a DB or subsumed into a
    scripting language program.
  • The approach taken here involves parsing and
    analysing web page content and style; a
    rationalised copy of each unique HTML data
    element is stored in a repository, either directly
    or by reference.
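
The idea of keeping one rationalised copy of each unique element can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: an "element" is taken to be a text node, "rationalised" is taken to mean whitespace-collapsed, and the `repository` and `page_index` dictionaries and the SHA-1 keys are hypothetical stand-ins for the repository.

```python
import hashlib
from html.parser import HTMLParser


class TextElementParser(HTMLParser):
    """Collect the non-empty text nodes of a page, in document order."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_data(self, data):
        text = " ".join(data.split())  # "rationalise": collapse whitespace
        if text:
            self.elements.append(text)


def store_page(repository, page_index, path, html):
    """Store each unique element once; the page keeps only references."""
    parser = TextElementParser()
    parser.feed(html)
    parser.close()
    refs = []
    for text in parser.elements:
        key = hashlib.sha1(text.encode("utf-8")).hexdigest()
        repository.setdefault(key, text)  # an existing copy is left in place
        refs.append(key)
    page_index[path] = refs


# Two made-up pages sharing one replicated element (the contact line).
repository, page_index = {}, {}
store_page(repository, page_index, "a.html",
           "<h1>RISE</h1><p>Contact:   the web team</p>")
store_page(repository, page_index, "b.html",
           "<h1>News</h1><p>Contact: the\nweb team</p>")
```

After both calls the repository holds three elements rather than four, and both pages reference the same key for the contact line, so a later correction needs to be made in one place only.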

8
System Overview
  • 1 - HTML File Parser and Parser Tree Analyser
  • 2 - Head Analyser
  • 3 - Style Sheet Analyser
  • 4 - Body Analyser
  • 5 - Data Storage and Dynamic Page Generation
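
As a rough illustration of components 1 to 4, the sketch below parses a page once and routes what it finds to separate head, style sheet and body summaries. The `PageSplitter` class, the `analyse` function and the shape of the returned summary are assumptions for illustration; the presentation does not describe the analysers at this level of detail.

```python
from collections import Counter
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Route parsed content to head, style sheet and body summaries."""

    def __init__(self):
        super().__init__()
        self.section = None   # "head" or "body" while inside one
        self.current = None   # open <title> or <style> tag, if any
        self.title = ""
        self.style_text = ""
        self.body_tags = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in ("head", "body"):
            self.section = tag
        elif self.section == "head" and tag in ("title", "style"):
            self.current = tag
        elif self.section == "body":
            self.body_tags[tag] += 1

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None
        elif tag == self.section:
            self.section = None

    def handle_data(self, data):
        if self.current == "title":
            self.title += data.strip()
        elif self.current == "style":
            self.style_text += data


def analyse(html):
    """Return a small summary of the head, style sheet and body of a page."""
    splitter = PageSplitter()
    splitter.feed(html)
    splitter.close()
    # Very rough style analysis: one selector per "selector { ... }" rule.
    selectors = [rule.split("{")[0].strip()
                 for rule in splitter.style_text.split("}") if "{" in rule]
    return {"title": splitter.title, "selectors": selectors,
            "body_tags": dict(splitter.body_tags)}


summary = analyse(
    "<html><head><title>Staff</title>"
    "<style>h1 { color: navy } p { margin: 0 }</style></head>"
    "<body><h1>Staff</h1><p>One</p><p>Two</p></body></html>"
)
```

Comparing such summaries across pages is one way inconsistent style could be spotted, for example pages whose style rules differ for the same selector.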

9
Implementation Details
  • Various DB tables are used - large items are stored
    by reference to files
  • Extra data added - date of last update and
    person responsible
  • Any changes to DB are recorded in audit files
  • Log files detailing web page parsing, analysis
    and storage are produced along with an overall
    analysis summary
  • Each web page is dynamically recreated from a new
    web page file whose tags contain embedded script
    statements
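
The storage and regeneration steps above can be sketched with an in-memory SQLite database. Everything named here is hypothetical: the `element` and `audit` tables, the `set_element` and `render` helpers, and the `<?element name?>` placeholder syntax standing in for the embedded script statements. The presentation records changes in audit files; an audit table is used here only to keep the sketch self-contained, and the dates and user names are made-up sample data.

```python
import re
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE element (
        name TEXT PRIMARY KEY,      -- hypothetical key for a stored item
        content TEXT NOT NULL,
        last_updated TEXT NOT NULL, -- date of last update
        owner TEXT NOT NULL         -- person responsible
    );
    CREATE TABLE audit (
        name TEXT, old TEXT, new TEXT, changed_on TEXT, changed_by TEXT
    );
""")


def set_element(name, content, when, who):
    """Insert or update an element and record the change in the audit table."""
    row = db.execute(
        "SELECT content FROM element WHERE name = ?", (name,)
    ).fetchone()
    old = row[0] if row else None
    db.execute("INSERT OR REPLACE INTO element VALUES (?, ?, ?, ?)",
               (name, content, when, who))
    db.execute("INSERT INTO audit VALUES (?, ?, ?, ?, ?)",
               (name, old, content, when, who))
    db.commit()


def render(template):
    """Recreate a page by replacing each <?element name?> statement."""
    def lookup(match):
        row = db.execute(
            "SELECT content FROM element WHERE name = ?", (match.group(1),)
        ).fetchone()
        return row[0] if row else ""
    return re.sub(r"<\?element (\w+)\?>", lookup, template)


set_element("contact", "Contact: the web team", "2000-01-10", "editor1")
set_element("contact", "Contact: webmaster", "2000-02-01", "editor2")
page = render("<html><body><h1>Staff</h1><p><?element contact?></p></body></html>")
audit_rows = db.execute("SELECT old, new FROM audit ORDER BY rowid").fetchall()
```

Every page whose template contains the `contact` placeholder picks up the second value on its next rendering, while the audit table keeps both the old and the new content.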

10
Some Results Obtained

11
So what and where next...
  • There is duplication in web sites and it can be
    identified and removed
  • A more rational re-structuring and storage of web
    site contents is achievable
  • In principle this can facilitate future
    maintenance - but more studies are needed
  • Integration with a popular development tool could
    lead to more widespread usage and allow more
    in-depth evaluation
  • More sophisticated clone analysis can be
    considered, building on this experimental system

12
Key Points
  • Web developers and maintainers can learn from
    Software Engineering.
  • Techniques from Reverse Engineering can be
    applied to web sites.
  • Better abstractions and structuring principles
    are needed for web site development in the future
    - but there will remain a role for RE as
    described here.