Title: Promoting Your Project Web Site
1 Promoting Your Project Web Site
- Brian Kelly
- UK Web Focus
- UKOLN
- University of Bath
- Bath, BA2 7AY
- England
Email: B.Kelly_at_ukoln.ac.uk
URL: http://www.ukoln.ac.uk/
Project Manager for Exploit Interactive web magazine: http://www.exploit-lib.org/
UKOLN is funded by the Library and Information
Commission, the Joint Information Systems
Committee (JISC) of the Higher Education Funding
Councils, as well as by project funding from the
JISC and the European Union. UKOLN also
receives support from the University of Bath
where it is based.
2 Approaches
- What approaches can we take to raising the profile of our web site?
  - Tell our friends and colleagues (at conferences in exotic places)
  - Give away pens and bags
  - Let it happen automatically
  - Submitting resources
  - Perhaps giving parts of our web site away?
3 Automated Indexing
- Many users use search engines such as AltaVista, HotBot, Northern Light, etc. to find resources.
- Issues
  - Will my site be indexed?
  - Will it be near the top of a sensible search query?
  - How can I improve things?
4 Problems in Being Indexed
- Size of Index
  - Search engines are failing to keep up with the growth of the web
  - Not all pages on a web site will be indexed
  - Typically a 500 page sample will be indexed
- Frames (and "splash screens")
  - Many indexing robots can't access framed web sites or web sites which use "splash screens"
5 Improving Indexing of Key Resources
- How to ensure that quality pages are indexed
  - Don't publish non-work pages on the server
  - Move from a single large institutional server to multiple (real or virtual) servers
    - Instead of <www.ukoln.ac.uk/exploit/> use <exploit.ukoln.ac.uk/> or (even better) <exploit-lib.org/>
  - Avoid use of frames (or provide link to alternative entry point)
- These approaches will improve chances of more complete indexing of the web site
6 Improving Indexing (2)
- Do you know if your project web site uses the Robot Exclusion Protocol (REP) - a /robots.txt file?
- Use the REP to
  - Prevent junk (old or draft versions, experimentation, etc.) from being indexed
- Check your /robots.txt file to
  - Ensure that your web site can be indexed
User-agent: *          # Following apply to all robots
Disallow: /cgi-bin/    # Don't index /cgi-bin directory
Disallow: /tmp/        # Don't index /tmp directory
Tools are available to help you manage the robots.txt file. For example RoboGen <http://www.rietta.com/robogen/>
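A quick way to check that a /robots.txt file keeps junk out without blocking the pages you want found is to test a few URLs against it. The following is a minimal sketch using Python's standard `urllib.robotparser`; the rules mirror the example on this slide, while the helper name `can_index` and the tested paths are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Rules mirroring the slide's example. They are parsed from text,
# so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
"""


def can_index(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical paths chosen for illustration.
    for path in ("/issue3/nfp-websites/", "/cgi-bin/search", "/tmp/draft.html"):
        print(path, can_index(ROBOTS_TXT, "http://www.exploit-lib.org" + path))
```

Running the same check against the live file on your own server would show whether a key entry point has been excluded by mistake.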
7 Improving Indexing (3)
- Updating the /robots.txt file may be difficult.
- The (new) <META> feature allows HTML authors to control robots.
- Use this in key menu pages for resources you don't want indexed.
<META NAME="robots" CONTENT="noindex, nofollow">
(Diagram labels: reports, deliverables, draft, personal)
See <http://info.webcrawler.com/mak/projects/robots/meta-user.html> and <http://www.kollar.com/robots.html>
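To audit which menu pages already carry the robots META tag, the page source can be scanned for it. This is a rough sketch using Python's standard `html.parser`; the class and function names are assumptions made for illustration, and it only reads the `content` of a `<META NAME="robots">` element as shown on the slide.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of any <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


PAGE = '<html><head><META NAME="robots" CONTENT="noindex, nofollow"></head></html>'

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

An empty result for a page in a draft or personal area would suggest the tag still needs to be added there.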
8 Some Solutions (3)
- Getting Your Web Site Indexed (cont)
  - Several search engines allow URLs to be submitted
Bulk Submissions
Turnaround time ranges from a few days to several months. And what about bulk submission services?
9 Some Solutions (4)
Some Submission Engines
http://www.webposition.com/
http://www.netsubmitter.com/
http://www.registerpro.com/
http://www.pegasoweb.com/engenius/
http://www.exploit.com/wizard/
- There are products for submitting sites to multiple search engines (and analysing your pages, reporting on your position in search engines, etc.)
- But
  - How good are they?
  - How ethical are they?
  - How cost-effective are they?
10 Has It Worked?
- How do you know if robots are visiting your web site?
- The free BotWatch Perl program will analyse your log files and generate a report on visits by robots.
BotWatch is available at <http://www.tardis.ed.ac.uk/sxw/robots/botwatch.html>
See also <http://www.botspot.com/>
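The idea behind this kind of log analysis can be sketched in a few lines. The following is not BotWatch itself, only an illustration in Python: it assumes the server writes the Combined Log Format, treats any client that requests /robots.txt as a robot, and uses a short, made-up list of user-agent substrings (`ROBOT_HINTS`) that would need tuning for a real site. The sample log lines are invented.

```python
import re
from collections import Counter

# Illustrative substrings only; an assumption, not BotWatch's own list.
ROBOT_HINTS = ("scooter", "slurp", "crawler", "spider", "robot")

# Combined Log Format: host ident user [date] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_robot_visits(lines):
    """Count requests per user agent for clients that look like robots."""
    visits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines in an unexpected format
        agent = match.group("agent")
        parts = match.group("request").split()
        asked_for_robots_txt = len(parts) >= 2 and parts[1] == "/robots.txt"
        if asked_for_robots_txt or any(h in agent.lower() for h in ROBOT_HINTS):
            visits[agent] += 1
    return visits


# Invented sample entries for illustration.
SAMPLE = [
    'crawl.example.com - - [18/Nov/1999:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Scooter/2.0"',
    'crawl.example.com - - [18/Nov/1999:10:00:05 +0000] "GET /issue3/ HTTP/1.0" 200 5120 "-" "Scooter/2.0"',
    'pc1.example.org - - [18/Nov/1999:10:01:00 +0000] "GET /issue3/ HTTP/1.0" 200 5120 "http://www.foo.fr/goodstuff/" "Mozilla/4.7 [en] (WinNT; I)"',
]

if __name__ == "__main__":
    for agent, hits in count_robot_visits(SAMPLE).most_common():
        print(agent, hits)
```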
11 Problems in Ranking
- Typically large numbers of hits are obtained.
- Metadata may help
  - <meta name="keywords" content="exploit, web magazine, TAP, telematics">
  - <meta name="description" content="Exploit Interactive is a ..">
  - <meta name="DC.Title" content="Exploit ..">
- But
  - "AltaVista" and Dublin Core metadata are not supported by all (many?) search engines
  - Issues about maintenance of metadata
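One way to ease the maintenance issue is to keep the metadata in a single place and generate the tags from it, rather than editing them by hand in every page. The sketch below, in Python, shows the idea; the function name `meta_tags` and the description text are hypothetical, and only the keywords come from the slide.

```python
from html import escape


def meta_tags(fields: dict) -> str:
    """Render one <meta> element per name/content pair, escaping both values."""
    return "\n".join(
        '<meta name="%s" content="%s">'
        % (escape(name, quote=True), escape(content, quote=True))
        for name, content in fields.items()
    )


# Keywords as on the slide; the description is a placeholder for illustration.
EXAMPLE_FIELDS = {
    "keywords": "exploit, web magazine, TAP, telematics",
    "description": "A placeholder description of the magazine",
}

if __name__ == "__main__":
    print(meta_tags(EXAMPLE_FIELDS))
```

The same dictionary could feed both the "AltaVista" style tags and Dublin Core names such as DC.Title, so the two sets stay in step.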
12 Some Solutions
- Use of "AltaVista" metadata is a must for key pages
- Use of Dublin Core
  - Could be used in specialist applications (domain-specific search engines, current awareness services, B2B, etc.)
  - Think about additional benefits to you (e.g. local searching, auditing)
  - Scope for discussions with search engine vendors?
  - Need to think about deployment and maintenance
The Exploit Interactive web magazine uses Dublin Core metadata to enhance local searching. The metadata can also be used by 3rd parties.
13 Analysis of NFP Web Sites
- Report of an analysis of NFP (National Focal Point) web sites published in Exploit Interactive issue 3. Of the 10 web sites
  - No significant use of metadata on main entry point
  - Six made no use of REP, one disallowed all robots and three made sensible use
  - No use of separate domain names
  - One framed site
http://www.exploit-lib.org/issue3/nfp-websites/
14 Web Directories
- Web directories (e.g. Yahoo!) provide manually-compiled classifications of the web
- Benefits to Projects
  - Additional place to be found
  - "61% reach in UK Search engine market"
  - Can be sensibly classified e.g. Ariadne magazine is in <http://www.yahoo.co.uk/Reference/Libraries/Professional_Resources/Internet_in_Libraries/>
- Problems
  - Time-consuming for cataloguers
  - Entries can be submitted, but this can be time-consuming
  - "..sub-domains have difficulties in getting into Yahoo!"
Compare
- www.ukoln.ac.uk/projects/eu/exploit/
- www.ukoln.ac.uk/exploit/
- www.exploit-lib.org
- www.ukoln-exploit.ac.uk
15 Submission to Web Directories
- It might be worth submitting to web directories such as Yahoo!
- Remember that the information will be processed by humans.
- See <http://www.searchenginewatch.com/webmasters/>
16 Give Your Web Site Away
- Another way to promote your web site is to give it away!
- You could give away
  - Parts of the site to robots (e.g. metadata)
  - Parts of the interface
  - The entire site
- You could give away the interface to
  - your local indexer
  - a remote indexing service e.g. HotBot
- See <www.ariadne.ac.uk/issue21/webwatch/>
Search interface embedded in Exploit Interactive article at <http://www.exploit-lib.org/issue3/nfp-websites/>
17 Give Part of Your Site Away
- OMNI gives an example of a site hosting remote search interfaces.
- Enhances remote interface, but several issues.
- See article at <http://www.ariadne.ac.uk/issue21/webwatch/> for discussion
http://www.omni.ac.uk/other-search/
18 Give Your Web Site Away
- Why not have your web site mirrored? Mirrors in, say, USA and Australia will help to promote your service.
- Is your web site easily mirrored?
  - Are relative URLs used?
  - Do you use directory structures to delineate areas of your web site?
  - If you use server-side scripting for management purposes, do you hide unusual URLs?
    - /issue1/mag-features.asp (problems)
    - /issue1/mag-features/default.asp
    - /issue1/mag-features/ (usable on Unix)
    - (also techniques such as Apache rewrites)
Issues: If your web site can't be mirrored, can it be preserved?
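Whether relative URLs are used can be checked mechanically. The sketch below, in Python with only the standard library, lists links in a page that name the site's own host absolutely, since these would still point back at the original server from a mirror. The names `LinkCollector` and `absolute_internal_links` and the sample page are illustrative assumptions, and only `href` and `src` attributes are examined.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes."""

    URL_ATTRS = {"href", "src"}

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value:
                self.links.append(value)


def absolute_internal_links(html: str, own_host: str) -> list:
    """Return links that hard-code `own_host` instead of being relative."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        link
        for link in collector.links
        if urlsplit(link).netloc.lower() == own_host.lower()
    ]


# Invented page fragment for illustration.
PAGE = (
    '<a href="http://www.exploit-lib.org/issue1/">Issue 1</a>'
    '<a href="../issue2/">Issue 2</a>'
    '<img src="logo.gif">'
    '<a href="http://www.ukoln.ac.uk/">UKOLN</a>'
)

if __name__ == "__main__":
    print(absolute_internal_links(PAGE, "www.exploit-lib.org"))
```

Links to other sites are left alone, as they behave the same from a mirror as from the original.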
19 Citation
- Is your project web site address easy to remember?
- Issues
  - Short domain names are a winner
  - Short URLs are desirable (try to avoid org. structure)
  - Try to cite directories (shorter and less ambiguous)
    - www.exploit.org/issue1/pride/article.htm (article.html, article.asp)
    - www.exploit.org/issue1/pride/ pride/default.asp
  - Very important for web site home page
  - Try to avoid use of tilde (~)
  - Avoid citing binary files (inaccessible, lack of metadata, alternative versions, etc.)
"Promoting Web Site": talk given on 18 Nov 1999. Slides: HTML, PowerPoint
20 Let's Not Forget Publications
http://www.exploit-lib.org/issue3/
- Getting published in a web magazine (such as Exploit Interactive) can have many benefits
  - Visibility to (variety of) readers
  - Web magazine may submit its pages to search services
  - Links in web magazine may be harvested
  - Web magazine may be made available on CD ROM, free text system, etc.
Magazine articles may also be cited e.g. see <http://sunsite.berkeley.edu/CurrentCites/>
21 Measuring Your Success
- LinkPopularity.com lets you check on the number
of sites linking to your web site
Link popularity is growing in importance as
search engines make use of citation analysis
("this site is best, as there are lots of links
to it" or "this site is linked to by important
sites").
"I tried LinkPopularity.com, pointing out to a
potential advertiser that EEVL had, according to
HotBot, 1099 sites linking to it, whilst there
were only 18 sites linking to their site, and
suggested that what they needed was more
exposure. It seems to have worked, as they have
agreed to buy an ad on the soon to be released
new design EEVL site." Roddy McLeod, EEVL
(posting to lis-elib list)
22 Don't Forget Your Stats
- You will produce graphs of your web statistics for project reports
- Do the graphs indicate
  - A healthy growth
  - Growth in the number of robots
  - Growth in the wrong community
- Look beneath the surface
- Think about "enterprise analysis packages"
referer ""  (entered directly)
referer "www.foo.fr/goodstuff/"  (followed link)
If you record the referrer field you will be able to see the links users follow to arrive at your web site. This may help to inform dissemination strategies.
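Once the referrer field is recorded, a simple tally by referring host shows where visitors come from. The following Python sketch assumes the referrer values have already been pulled out of the log; the function name `referring_sites` and the sample values are illustrative, and an empty value or "-" is taken to mean the address was entered directly.

```python
from collections import Counter
from urllib.parse import urlsplit

DIRECT = "(entered directly)"


def referring_sites(referers):
    """Tally visits by referring host; empty or '-' counts as a direct visit."""
    counts = Counter()
    for ref in referers:
        ref = ref.strip()
        if ref in ("", "-"):
            counts[DIRECT] += 1
            continue
        # Accept both full URLs and the host-first form shown on the slide.
        host = urlsplit(ref if "://" in ref else "//" + ref).netloc
        counts[host.lower()] += 1
    return counts


# Invented sample values for illustration.
SAMPLE_REFERERS = ["", "www.foo.fr/goodstuff/", "http://www.foo.fr/other/", "-"]

if __name__ == "__main__":
    for site, hits in referring_sites(SAMPLE_REFERERS).most_common():
        print(site, hits)
```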
23 Universal Design
- Many of the guidelines provided will have additional benefits
  - Robots and people with disabilities (e.g. blind users) have similar characteristics i.e. can't follow images, may not be able to access framed sites, etc.
  - Indexing programs may index ALT attributes in <IMG> elements
  - Sensibly-structured web sites can be more easily archived and mirrored.
  - Metadata for general resource discovery can be reused for other applications (e.g. current awareness services).
24 Conclusions
- To conclude
  - There are approaches to the web site architectural design which can help in promoting your project web site, including
    - Project-specific domains
    - Short URLs
    - Use of the robots.txt file
    - Metadata
    - Accessible web design
  - Once you have the correct architecture, you can assist in the promotion process through various submission tools
  - Many of the solutions will have additional benefits
  - Ideally the solutions will be implemented at the start of the project!
  - Dialogue with your server administrator is important
25 Further Information
Book Reviews
- <http://www.hw.ac.uk/libWWW/irn/irn58/irn58d.html#recent>
- <http://www.hw.ac.uk/libWWW/irn/irn59/irn59d.html#recent>
- Search Engine Watch
  - <http://www.searchenginewatch.com/>
- Deadlock
  - <http://www.deadlock.com/promote/>
- Did-it
  - <http://www.did-it.com/>
- VirtualPromote
  - <http://www.virtualpromote.com/promotea.html>
- Pegasoweb
  - <http://www.pegasoweb.com/>
Yahoo! <http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Information_and_Documentation/Site_Announcement_and_Promotion/>
Broadcaster URL submission service <http://www.broadcaster.co.uk/>
Submit-it URL submission service <http://www.submit-it.com/>