1
CLAIMS

CLassification Automated InforMation System
Computer-Assisted Categorisation of Patent
Documents in the International Patent
Classification
Patrick Fiévet, CLAIMS Project Manager, WIPO
Caspar J. Fall, CLAIMS Consultant, ELCA
ICIC03, Nîmes, 22 October 2003
2
Agenda
  1. Introduction to CLAIMS project (PF)
  2. Computer-assisted categorization prototypes
    (CJF)
  3. CLAIMS Categorizer perspectives (PF)

3
1. Introduction to CLAIMS Project
4
1.1 CLAIMS Context
World Intellectual Property Organization (WIPO)
International Patent Classification (IPC)
Classification Automated Information System
(CLAIMS)
5
1.2 CLAIMS Project Objectives
IT support for the promotion of the IPC
  • IPC Reform and revision support
  • IPC Tutorials
  • Translation and Natural Language Search in the
    IPC
  • IPC Categorization assistance to Patent Offices

6
2. Computer-assisted Categorization
7
2.1 Objectives
  • Develop a solution for predicting International
    Patent Classification (IPC) codes
  • Facilitate accurate classification in small and
    medium patent offices
  • Support for documents in multiple languages
  • Categorization assistance tool
  • Open questions
    • Depth of computer-assisted categorization
    • What accuracy?

8
2.1 Key issues
  • Survey of automated categorization research
  • Patent categorization
  • The IPC is a hierarchical classification
  • 120 classes, 628 subclasses, 69,000 groups
  • Patents have secondary IPC codes
  • The categories are modified over time
  • Vocabulary very diverse and technical
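
The hierarchy named on this slide can be illustrated by splitting an IPC symbol into its levels. The following is a minimal sketch, not part of the CLAIMS tools: the function name `split_ipc` and the regular expression are assumptions made for illustration, and the pattern only covers full symbols written like "A01B 1/02".

```python
import re

# Assumed layout of a full IPC symbol such as "A01B 1/02":
# section letter, two-digit class, subclass letter, main group "/" subgroup.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def split_ipc(symbol):
    """Split a full IPC symbol into its hierarchical levels (illustrative helper)."""
    match = IPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    section, cls, subclass, main, sub = match.groups()
    prefix = section + cls + subclass
    return {
        "section": section,
        "class": section + cls,
        "subclass": prefix,
        # A main group is conventionally written with "/00" as its subgroup part.
        "main_group": f"{prefix} {main}/00",
        "group": f"{prefix} {main}/{sub}",
    }


levels = split_ipc("A01B 1/02")
print(levels["class"], levels["subclass"], levels["main_group"])
```

A categorizer working at class, subclass or main group level would compare predictions only on the corresponding entry of this dictionary.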

9
2.1 Patent categorization approach
  • Machine-learning method to recognize categories
  • Statistical distribution of words
  • Establish training data
  • Training documents with good IPC codes
  • 210,000 to 830,000 documents
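
To make "statistical distribution of words" concrete, here is a minimal sketch of one common technique, a multinomial naive Bayes classifier with Laplace smoothing. The slides do not say which algorithm the prototype uses, so this is an assumed stand-in; the function names and the four toy training documents are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Count words per category from (text, ipc_code) pairs with trusted codes."""
    word_counts = defaultdict(Counter)  # per-category word frequencies
    doc_counts = Counter()              # number of training documents per category
    vocabulary = set()
    for text, code in documents:
        words = text.lower().split()
        word_counts[code].update(words)
        doc_counts[code] += 1
        vocabulary.update(words)
    return word_counts, doc_counts, vocabulary


def rank_categories(text, model):
    """Return the categories sorted from most to least likely for the text."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for code in doc_counts:
        total_words = sum(word_counts[code].values())
        score = math.log(doc_counts[code] / total_docs)  # category prior
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Laplace smoothing avoids zero probabilities for unseen pairs.
            smoothed = (word_counts[code][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[code] = score
    return sorted(scores, key=scores.get, reverse=True)


# Invented toy training data; a real training set would hold hundreds of
# thousands of documents with reliable IPC codes.
training = [
    ("plough blade for soil working", "A01B"),
    ("harrow frame for soil tillage", "A01B"),
    ("packet switching network node", "H04L"),
    ("data packet transmission protocol", "H04L"),
]
model = train(training)
print(rank_categories("soil tillage blade", model))
```

Returning a ranked list rather than a single code fits the assistance role described earlier: the user sees several candidate categories and makes the final choice.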

10
2.2 Prototype
  • Custom development
  • State-of-the-art algorithm
  • Language independent
  • Measure categorization success
  • Compare the predictions with other manually
    classified documents
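
Comparing predictions with manually classified documents can be sketched as follows. The three measures below are assumptions chosen for illustration (the slides do not define the prototype's exact measures); the third one takes into account that patents carry secondary IPC codes. All data are invented.

```python
def top_prediction_accuracy(predictions, main_codes):
    """Share of documents whose first prediction equals the main IPC code."""
    hits = sum(
        1 for ranked, main in zip(predictions, main_codes)
        if ranked and ranked[0] == main
    )
    return hits / len(main_codes)


def top_k_accuracy(predictions, main_codes, k=3):
    """Share of documents whose main IPC code is among the first k predictions."""
    hits = sum(
        1 for ranked, main in zip(predictions, main_codes)
        if main in ranked[:k]
    )
    return hits / len(main_codes)


def any_code_accuracy(predictions, all_codes):
    """Share of documents whose first prediction matches the main or a secondary code."""
    hits = sum(
        1 for ranked, codes in zip(predictions, all_codes)
        if ranked and ranked[0] in codes
    )
    return hits / len(all_codes)


# Invented example: ranked predictions for four held-out documents.
predictions = [
    ["A01B", "A01C", "B62D"],
    ["H04L", "H04Q", "G06F"],
    ["G06F", "H04L", "H04Q"],
    ["B62D", "A01B", "A01C"],
]
main_codes = ["A01B", "H04Q", "H04L", "A01C"]
all_codes = [{"A01B"}, {"H04Q", "H04L"}, {"H04L", "G06F"}, {"A01C"}]

print(top_prediction_accuracy(predictions, main_codes))  # 0.25
print(top_k_accuracy(predictions, main_codes, k=3))      # 1.0
print(any_code_accuracy(predictions, all_codes))         # 0.75
```

The held-out documents must not appear in the training set, otherwise the measured accuracy overstates what a user would see on new applications.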

11
2.2 Prototype results
12
2.2 Improving accuracy with category refining
[Figure: two category-refining scenarios (Scenario 1, Scenario 2); diagram labels: validate, direct]
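
The general idea of refining a coarse category into a finer one can be sketched as below. This is a guess at the principle only, since the slide's diagram is not reproduced in the transcript: the function `refine_to_main_group`, the keyword-overlap scoring and the keyword sets are all hypothetical, and a real system would use a trained model restricted to the chosen subclass.

```python
def refine_to_main_group(text, subclass, group_keywords):
    """Rank the main groups of one subclass by keyword overlap with the text.

    group_keywords maps main group symbols to sets of indicative words
    (invented here purely for illustration).
    """
    words = set(text.lower().split())
    candidates = {
        group: len(words & keywords)
        for group, keywords in group_keywords.items()
        if group.startswith(subclass + " ")  # only groups inside the subclass
    }
    # Highest overlap first; ties are broken alphabetically for stable output.
    return sorted(candidates, key=lambda g: (-candidates[g], g))


group_keywords = {
    "A01B 1/00": {"hand", "spade", "hoe"},
    "A01B 3/00": {"plough", "fixed", "share"},
    "H04L 12/00": {"network", "switching", "packet"},
}

# The subclass may come from the user or from a first categorization step.
ranked = refine_to_main_group("hand spade with steel blade", "A01B", group_keywords)
print(ranked)
```

Restricting the candidates to one subclass keeps the second step small: instead of choosing among all groups of the IPC, the refinement only ranks the main groups below the subclass already selected.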
13
2.3 Conclusions
  • It works well!
  • Useful user assistance
  • Direct categorization at subclass level possible
  • IPC codes can be refined accurately to main group
    level
  • To get accurate results, one needs
    • Large datasets
    • Good category coverage
    • Accurate IPC codes
  • Read the proceedings for more details
  • Demonstration available after the presentation

14
3. IPCCAT
15
3.1 CLAIMS Categorizer Perspectives
1. Implementation of IPCCAT
2. Training sets for IPC categorization: English,
French, Spanish, Russian, German and possibly
Chinese
3. IPC data sets improvement and categorizer
retraining
16
3.2 CLAIMS Categorizer Perspectives
4. Improve integration of the IPC Categorizer
with other CLAIMS tools
5. CLAIMS policy for distribution of data sets in
various Languages
17
3.2 Access to IPCCAT for PCT
Login IBGST01
Password clobterib
18
Questions / Answers
Patrick Fiévet patrick.fievet_at_wipo.in
19
CLAIMS
CLassification Automated InforMation System
Thank you for your attention