Hot News - PowerPoint PPT Presentation

About This Presentation
Title:

Hot News

Description:

Title: Slide 1 Author: FANTOM Last modified by: user Created Date: 8/16/2006 12:00:00 AM Document presentation format: On-screen Show (4:3) Other titles – PowerPoint PPT presentation


Transcript and Presenter's Notes

Title: Hot News


1
Hot News
  • Reporter
  • Hossein Kamyar
  • Asef Poormasoomi

Supervisor: Dr. Mohsen Kahani
2
Tehran University
  • Database Research Group
  • Natural Language and Text Processing Group

3
Database Research Group
http://ece.ut.ac.ir/dbrg
  • Members
  • Faculty Staff: 8
  • Students: 9
  • Alumni: 17

Dr. Caro Lucas, Dr. Behzad Moshiri, Dr. Rohani Rankouhi
4
Database Research Group
  • Research Project
  • Modernization Of Systems
  • Information Retrieval
  • Data Mining
  • Data Management

Project Title | Supervisor
Question Answering with Human Plausible Reasoning | Dr. Farhad Oroumchian
Improving XML Information Retrieval by Means of Human Plausible Reasoning | Dr. Farhad Oroumchian
Question Answering with Dynamic Functions and Plausible Inferences | Dr. Farhad Oroumchian
Distributed Information Retrieval on the Web | Dr. Farhad Oroumchian
Concept-based Searching in a Semantic Web Environment | Dr. Farhad Oroumchian
5
Database Research Group
  • Research Project
  • Modernization Of Systems
  • Information Retrieval
  • Data Mining
  • Data Management

Project Title | Supervisor
XML Data Mining | Dr. Masoud Rahgozar
Mining for Conceptual Associations in Unstructured and Semi-structured Text for Reasoning with Human Plausible Reasoning | Dr. Masoud Rahgozar
Spatial Data Mining and its Application in Bank Business Intelligence | Dr. Masoud Rahgozar
XML Mining by Frequent Tree Patterns | Dr. Masoud Rahgozar
Bioinformatic Database Integration Using Data Fusion Approach | Dr. Behzad Moshiri
6
Database Research Group
  • Research Project
  • Modernization Of Systems
  • Information Retrieval
  • Data Mining
  • Data Management

Project Title | Supervisor
An Efficient Framework for XML Data Management | Dr. Masoud Rahgozar
XML Query Processing and Optimization | Dr. Fatemi
7
Database Research Group
  • Industrial Project

Project Title | Organization
Iranian Welfare and Social Security Database Analysis | Ministry of Welfare and Social Security
MAVA-Vista Advanced Digital Library System | ICT Department of MUT
Business Intelligence System | Bank Mellat
Geographical Information System | Statistics Center of Iran
Chizar Digital Archive | Management and Planning Organization
8
Database Research Group
  • Related Course
  • 1. Introduction to Database Systems
  • 2. Advanced Database Systems
  • 3. Special Topics in Database Systems
  • 4. Database Laboratory
  • 5. Data Mining
  • 6. Information Retrieval
  • 7. Natural Language Processing

9
Database Research Group
  • Persian Corpus
  • Hamshahri Corpus
  • ???? 1 ???? ?????? ?????? ???? ?????????????
    CLEF ??????? ? ????? ??????. ??? ?????? ??
    CLEF2008 ? CLEF2009 ??????? ??? ??? ? 100
    ???????? ????.
  • ???? 2 ?????? ?????? ?? ??? 1388 ???? ??????
    UTIRE ?? ???? ???????? ?????? ???? ??????? ?????
    ? ?? ???? ????????? TREC ???? ??? ???.

Criteria | Version 1 | Version 2
Size (Unicode CLEF XML Format) | 700 MB | 1400 MB
Number of Documents | 160,000 | 318,000
Documents Time Span, From | 1996/4/23 | 1996/4/23
Documents Time Span, To | 2003/2/11 | 2007/5/13
Documents Category | Yes | Yes
Link to Images | No | Yes
Link to Original Webpages | No | Yes
Query Relevance Judgments | Yes | Yes
10
Database Research Group
  • Persian Corpus
  • Bijankhan Corpus

The Bijankhan corpus is a tagged corpus that is
suitable for natural language processing research
on the Persian (Farsi) language. The collection
is gathered from daily news and common texts. All
documents in the collection are categorized by
subject, such as political, cultural and so on.
In total, there are 4300 different subjects. The
Bijankhan collection contains about 2.6 million
manually tagged words, with a tag set that
contains 40 Persian POS tags.
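
A first look at a tagged corpus like this one is usually the tag distribution. The following is a minimal sketch in Python, assuming a hypothetical file layout of one "word TAG" pair per line; the actual Bijankhan file format is not described on the slide, and the sample tokens and tag names are toy data.

```python
from collections import Counter

def tag_distribution(lines):
    """Count POS tags in an iterable of 'word TAG' lines.

    Assumed layout (hypothetical): the tag is the last
    whitespace-separated field on each line.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        counts[parts[-1]] += 1
    return counts

# Toy input, not taken from the corpus.
sample = ["ketab N", "khub ADJ", "ast V", "", "ketab N"]
dist = tag_distribution(sample)
for tag, n in dist.most_common():
    print(tag, n)
```

With the real corpus, the number of distinct keys in the counter could be compared against the 40 tags mentioned above.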
11
Database Research Group
  • Persian Corpus
  • ?????? ??? ?? dotIR
  • ??? ?????? ???? ?? ??? ?? ?? ???? .ir ???? ??
    ?????? ??? ????? ??. ??? ?? ??????? ?? ?????????
    ?????? UTIRE ????? 50 ???????? ???? 25 ?????
    ????? ????. ??? ??????????? ???? ?????? ??????
    ???? ??????? ???? ?????? ? ????? ??????? ????
    ???? ????? 18424 ??? (???? ????? 369 ??? ???? ??
    ????????)? ???? ???? 25 ????? ???? ????? ????
    ??????. ???? ????? ????? ????? ?? ?? ????????
    ???? ?????.
  • ?????? ???? ????? ? ?????? ???????????? ?????????
    ?? ??????? ????? ????? 56 ????? ?? ????? ???????
    ??? ???? ?? ???????? ?? ???? ????????? LETOR
    (????? ??? ???? Microsoft Research Asia ) ???????
    ????. ?????? ????? ????????? ?? ???????? ?????
    ?????? ?????? ???? ?????? ???????????? ????????
    ??? ???? ????????? ? ?? ????? ? ????? ???????????
    ??? ?????.
  • ??? ????? ???? ???? ??????? ??????? ????? ?
    ????????? ?????? ???? ??????? ????? ???????? ???
    ???.

12
Natural Language and Text Processing Group
  • Members
  • 10 members

Heshaam Faili, Assistant Professor; Ph.D. in
Artificial Intelligence from Sharif University of
Technology
13
Natural Language and Text Processing Group
Project Title
English-Persian Statistical Machine Translation
English-Persian Rule-based Machine Translation
Statistical Word Sense Disambiguation
Automatic Persian WordNet Construction
Parallel and Comparable Corpus Construction
Monolingual Corpus Construction
Spell, Grammatical and Real-word Error Detection and Correction
Grammar Induction
Semantic Role Labeling
Statistical Parsing
Text Classification using Neural Networks
  • Research Project
  • More than 23 papers

14
Natural Language and Text Processing Group
  • Industrial Project

Project Title | Organization
Vafa Spell Checker | Software and Information Technology Group, Information Technology Research Center, Iran Telecommunication Research Center
  • ????? ? ????? ?????? ?????? ?????? ? ??????
  • ?????? ??? ?? ??? ???????? ?????? word
  • ?????? ??????? ? ?????? ?????? ?? ???? ??????
  • ???? ? ??????
  • ??????

15
Natural Language and Text Processing Group
  • Persian Corpus
  • 1. TEP: Tehran English-Persian Parallel Corpus
  • First free English-Persian corpus
  • 4 million tokens on each side
  • Sentence aligned
  • 2. TMC: Tehran Monolingual Corpus
  • Largest freely available monolingual corpus for
    the Persian language
  • Tokenized
  • Suitable for language modeling
  • 3. Mutual Information
  • http://ece.ut.ac.ir/nlp/resources.html
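
To illustrate what "suitable for language modeling" means for a tokenized monolingual corpus, here is a minimal sketch of a bigram model with add-one smoothing. It is an illustration only, not the group's method; the function names, the `<s>` and `</s>` boundary markers, and the toy sentences are assumptions.

```python
from collections import Counter

def train_bigram(sentences):
    """Collect history and bigram counts from tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[:-1])            # tokens that act as a history
        bigrams.update(zip(padded, padded[1:]))
    # Words that can be predicted: every corpus token plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return unigrams, bigrams, len(vocab)

def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

# Toy corpus; a real run would read tokenized lines from a corpus file.
toy = [["a", "b"], ["a", "c"]]
unigrams, bigrams, v = train_bigram(toy)
p_b_given_a = bigram_prob("a", "b", unigrams, bigrams, v)
```

Add-one smoothing is chosen here only because it is short; on a corpus of this size a stronger smoothing method would normally be preferred.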

16
Natural Language and Text Processing Group
  • Related Course
  • Introduction to Natural Language Processing, Dr.
    Heshaam Faili Advanced Database Systems

17
Shahid Beheshti University
  • The Natural Language Processing research laboratory
    was founded by Dr. Mehrnoush Shamsfard at the
    beginning of 2006 in the Computer Engineering
    Department of Shahid Beheshti University
  • More than 25 members.
  • More than 92 papers.
  • http://nlp.sbu.ac.ir/

18
Research Project
  • A. Developing Linguistic resources
  • Developing Semantic annotated corpus
  • Developing chunked corpus
  • Developing parallel corpus
  • Developing Persian Verbs database
  • Semi-automatic Lexicon Acquisition 
  • Start: 2006
  • Researchers: Maliheh Monshizadeh, Elham Fekri

19
Research Project
  • B. Fundamental Persian text processing tools
  • Standard Text Preparation for Persian
  • Stemmer /Morphological analyzer / lemmatizer
  • Tokenizer
  • POS Tagger
  • Spell checker
  • chunker
  • Syntax parser
  • Persian Named Entity Recognition - SBUNER
  • Persian Anaphora resolution
  • Semantic Role Labelling
  • Start: 2006
  • Researchers: Samira Noferesti, Rana Forsati,
    Pooneh Mortazavi, Hoda Sadat Jafari

20
Research Project
  • C. NLP Applications
  • Machine translation: PenTrans project
  • English to Persian Translation System
  • Persian to English Translation System
  • Machine translation evaluation toolkit
  • Persian text summarization: PARSUMIST
  • Question Answering
  • Persian
  • English: SBUQA
  • Information Extraction: Mersad
  • Text understanding
  • Conversion between Persian sentences and first
    order logic
  • Text generation
  • Start: 2006
  • Researchers: Chakaveh Saedi, Yasaman Motazedi,
    Mostafa Nazari

21
Research Project
  • D. Ontology engineering
  • Ontology development   
  • Development of CMMI-ACQ ontology
  • Collaborative development of ontology of computer
    science and engineering (COMON)
  • Fuzzy ontologies
  • Ontology Learning
  • Ontology learning from text
  • Ontology learning from web
  • Relation extraction
  • Ontology mapping   
  • Evolutionary ontology matching
  • A linguistic-Structural Approach to Bilingual
    Ontology Mapping
  • Ontology population and instantiation
  • Start: 2006
  • Researchers: Aynaz Taheri, Hakimeh Fadaei, Tara
    Akhavan, Rahim Dehkharghani, Valeh Montaghami,
    Bahareh Sarrafzadeh, Amir Sharifloo, Rana Forsati

22
Research Project
  • E. Semantic Web
  • Semantic Annotation of documents    
  • Converting web documents into semantic web
    resources   
  • Semantic search   
  • Semantic web service discovery and composition
  • Start: 2006
  • Researchers: Bahareh Sarrafzadeh, Hoda Mirzaie,
    Maryam Haghollahi, Homan Farrokhzad

23
Research Project
  • F. Hybrids
  • Application of fuzzy ontologies in qualitative
    reasoning
  • E-learning
  • Ontology based Content Rearrangement for
    Intelligent Tutoring Systems (OCRITS)
  • Intelligent Content Management
  • Start: 2006
  • Researchers: Hamzeh Motahari, Marzieh Shariati

24
Courseware
  • Ontology Engineering
  • Natural Language Processing
  • Semantic Web
  • Advanced Natural Language Processing, Fall 2005,
    by Regina Barzilay and Michael Collins

Columbia University
MIT University
25
Tools
  • FarsNet: the first Persian WordNet
  • STeP-1: Standard Text Preparation for Persian
  • Tokenizer
  • Stemmer
  • POS tagger
  • Spell checker

26
Sharif University
  • Natural Language Processing
  • Web Intelligence Laboratory

27
Natural Language Processing
  • Dr. Ghasem Sani
  • Dr. Hesham Faili
  • Since 2003 after three inactivity
  • Eliza
  • POS Tagger
  • Unsupervised Natural Grammar Induction

28
Web Intelligence Laboratory
  • Supervisor
  • Dr Abolhasani
  • with 28 members

29
Web Intelligence Laboratory
  • Advanced Researches
  • Semantic Search Engines
  • Semantic Web Services
  • Semantic web for pervasive computing
  • Annotation
  • Semantic Grids
  • Social Networks Analysis
  • Ontology Alignment and Learning
  • Web Clustering
  • Business Intelligence

30
Web Intelligence Laboratory
  • New Researches
  • Composite Web Service Execution Framework.
  • Tracking news to find hot topics.
  • Semantic Programming.
  • Trust model in Semantic Web.
  • New models for recommender systems.
  • Using web to create a lecture for a subject.
  • A Farsi framework for Information Retrieval.
  • A semantic based framework for business
    intelligence applications.

31
Science and Technology University
  • Unknown Laboratory
  • but Online POS Tagger
  • ?? ?????? ????? ? ???? ??? ??????? ????? ????
    ????? ?????
  • http://persianp.ir/index.php?option=com_wrapper&view=wrapper&Itemid=7
  • http://www.prosody.ir

32
Conferences
  • The Cross-Language Evaluation Forum (CLEF)
  • (i) developing an infrastructure for the testing,
    tuning and evaluation of information retrieval
    systems operating on European languages in both
    monolingual and cross-language contexts
  • (ii) creating test-suites of reusable data which
    can be employed by system developers for
    benchmarking purposes.
  • CLEF conferences have been held since 2000
  • CLEF2011 will be held by Amsterdam University
  • Computational Approaches to Arabic Script-based
    Languages (CAASL)
  • CAASL2011 will be held in Geneva
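
Benchmarking a retrieval system against a reusable test suite typically means scoring ranked results against relevance judgments. The sketch below computes average precision and its mean over queries; it is a generic illustration, not CLEF's official tooling, and the dictionary layout and the query and document identifiers are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list.

    ranked_ids: document ids in the order the system returned them.
    relevant_ids: ids judged relevant for the query.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)

def mean_average_precision(runs, qrels):
    """Mean of average precision over all judged queries."""
    scores = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0

# Toy data with hypothetical ids.
runs = {"q1": ["d1", "d2", "d3"], "q2": ["d9"]}
qrels = {"q1": ["d1", "d3"], "q2": ["d4"]}
ap_q1 = average_precision(runs["q1"], qrels["q1"])
map_score = mean_average_precision(runs, qrels)
```

Queries with judgments but no returned documents score zero here, so a system is not rewarded for skipping hard queries.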

33
Corporation
  • ??? ???? ?????
  • ??????? ??????? ????? n-gram ???? ???? ?????
  • ??????? ????? ???? ?????
  • ???? ?????? ?????? ???? ?????
  • ??????? ????? ???????? ???? ????? ?? ????? ??????
  • ????? ??? ?? ??? ?????
  • ??? ??????? ????? ???? ??????? ??????? ?
    ??????????? ???? ???????? ????? ? ???????
  • ?????? ?????? GPSG ???? ???? ?????
  • ????? ???? ???????
  • ???????? ????? ??? ?????
  • ?????? ???? ???? ?????

34
We do ...