mcSearch: Scientific Article Search

Provided by: DIT96

1
mcSearch: Scientific Article Search
  • Maxym Mykhalchuk
  • Information Retrieval and Search Engines

2
Prerequisites
  • NetBeans IDE 4.1: has built-in Tomcat and makes web
    development easy
  • Lucene: the demo code runs (almost) perfectly
  • Test data: text, with lots of trash

3
What was done (1): Indexer
  • Some modifications to extract a title and an
    abstract
  • Title: everything before "Abstract"
  • Abstract: everything between "Abstract" and
    "Introduction"
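
The extraction rule above can be sketched in plain Java. This is a minimal illustration, not the presenter's indexer code: the class name `TitleAbstractExtractor`, the `Extracted` holder, and the case-insensitive whole-word matching of "Abstract" and "Introduction" are all assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of Lucene or of the original indexer. */
final class TitleAbstractExtractor {

    // Assumption: the headings appear as whole words, in any letter case.
    private static final Pattern ABSTRACT =
            Pattern.compile("\\babstract\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern INTRODUCTION =
            Pattern.compile("\\bintroduction\\b", Pattern.CASE_INSENSITIVE);

    /** Extracted fields; an empty string means the field was not detected. */
    static final class Extracted {
        final String title;
        final String articleAbstract;

        Extracted(String title, String articleAbstract) {
            this.title = title;
            this.articleAbstract = articleAbstract;
        }
    }

    static Extracted extract(String text) {
        Matcher abstractMatch = ABSTRACT.matcher(text);
        if (!abstractMatch.find()) {
            // No "Abstract" marker: neither field can be located.
            return new Extracted("", "");
        }
        // Title: everything before the first "Abstract".
        String title = text.substring(0, abstractMatch.start()).trim();

        // Abstract: everything between "Abstract" and the next "Introduction".
        Matcher introMatch = INTRODUCTION.matcher(text);
        String articleAbstract = introMatch.find(abstractMatch.end())
                ? text.substring(abstractMatch.end(), introMatch.start()).trim()
                : "";
        return new Extracted(title, articleAbstract);
    }

    public static void main(String[] args) {
        Extracted e = extract(
                "Fast Search\nAbstract\nWe index papers.\nIntroduction\nBody text");
        System.out.println("title=" + e.title);
        System.out.println("abstract=" + e.articleAbstract);
    }
}
```

A rule this simple misfires when the title itself contains the word "abstract" or when the introduction heading carries a section number, which may be one reason some files end up with no usable title or abstract.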

4
What was done (2): Search
  • Search design was ripped off from the Yandex
    search engine (http://ya.ru)
  • A filter was created to allow/disallow searching
    trash files
  • Trash: files in which no title/abstract was
    detected
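
One way to picture the trash filter is as a post-processing step over the hit list. The sketch below is an assumption-laden illustration, not the presenter's code: `TrashFilter`, `Article`, and the `allowTrash` flag are hypothetical names, and a file is treated as trash when either the title or the abstract is missing, which is one possible reading of "no title/abstract detected".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical result filter; the slides do not show the real one. */
final class TrashFilter {

    /** A search hit with the fields extracted at indexing time. */
    static final class Article {
        final String path;
        final String title;
        final String articleAbstract;

        Article(String path, String title, String articleAbstract) {
            this.path = path;
            this.title = title;
            this.articleAbstract = articleAbstract;
        }

        // Assumption: missing title OR missing abstract marks the file as trash.
        boolean isTrash() {
            return title.isEmpty() || articleAbstract.isEmpty();
        }
    }

    /** Keeps every hit when allowTrash is true, otherwise drops trash hits. */
    static List<Article> filter(List<Article> hits, boolean allowTrash) {
        List<Article> kept = new ArrayList<>();
        for (Article hit : hits) {
            if (allowTrash || !hit.isTrash()) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Article> hits = Arrays.asList(
                new Article("a.txt", "Fast Search", "We index papers."),
                new Article("b.txt", "", ""));
        System.out.println(filter(hits, false).size() + " hit(s) without trash");
        System.out.println(filter(hits, true).size() + " hit(s) with trash");
    }
}
```

Storing the trash flag as an indexed field and adding it to the query would avoid fetching hits that are then discarded; the post-filter form is used here only because it needs no search library.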

5
That's all
  • Creating a search engine with Lucene is easy
  • Updating the index is slow: one first has to
    delete the document, then add it again
  • Not nice for the fast-evolving Web
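
The delete-then-add update pattern can be illustrated with a toy in-memory inverted index. This is not Lucene's API or its on-disk behavior; `ToyIndex` and its methods are hypothetical and only show why an update costs a full delete plus a full re-add.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy inverted index (term -> document ids); illustration only. */
final class ToyIndex {

    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Tokenizes on non-word characters and records docId under each term. */
    void add(String docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
            }
        }
    }

    /** Removes docId from every posting list, then drops empty lists. */
    void delete(String docId) {
        postings.values().forEach(ids -> ids.remove(docId));
        postings.values().removeIf(Set::isEmpty);
    }

    /** There is no in-place edit: an update is a delete followed by an add. */
    void update(String docId, String newText) {
        delete(docId);
        add(docId, newText);
    }

    Set<String> search(String term) {
        Set<String> ids = postings.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.emptySet());
        return Collections.unmodifiableSet(ids);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("d1", "Lucene is easy");
        index.update("d1", "Updating is slow");
        System.out.println("lucene -> " + index.search("lucene"));
        System.out.println("slow -> " + index.search("slow"));
    }
}
```

In this toy version the delete step scans every posting list, so a single changed document touches the whole index, which mirrors the complaint above about frequently changing pages.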