Title: CLAIMS
CLAIMS
CLassification Automated InforMation System
Computer-Assisted Categorisation of Patent
Documents in the International Patent
Classification
Patrick Fiévet, CLAIMS Project Manager, WIPO
Caspar J. Fall, CLAIMS Consultant, ELCA
ICIC03, Nîmes, 22 October 2003
Agenda
- Introduction to CLAIMS project (PF)
- Computer-assisted categorization prototypes (CJF)
- CLAIMS Categorizer perspectives (PF)
1. Introduction to CLAIMS Project
1.1 CLAIMS Context
World Intellectual Property Organization (WIPO)
International Patent Classification (IPC)
Classification Automated Information System
(CLAIMS)
1.2 CLAIMS Project Objectives
IT support for the promotion of the IPC
- IPC Reform and revision support
- Translation and Natural Language Search in the
IPC
- IPC Categorization assistance to Patent Offices
2. Computer-assisted Categorization
2.1 Objectives
- Develop a solution for predicting International Patent Classification (IPC) codes
- Facilitate accurate classification in small and medium patent offices
- Support for documents in multiple languages
- Categorization assistance tool
- Open questions
- Depth of computer-assisted categorization
- What accuracy?
2.1 Key issues
- Survey of automated categorization research
- Patent categorization
- The IPC is a hierarchical classification
- 120 classes, 628 subclasses, 69000 groups
- Patents have secondary IPC codes
- The categories are modified over time
- Vocabulary very diverse and technical
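To make the hierarchy concrete, the sketch below splits a full IPC symbol into its levels (section, class, subclass, main group, group). The `IpcCode` type, the `parse_ipc` function and the regular expression are illustrative assumptions, not part of the CLAIMS tools; the pattern follows the usual written form of IPC symbols such as "A01B 1/02".

```python
import re
from typing import NamedTuple


class IpcCode(NamedTuple):
    """Hypothetical container for the hierarchical levels of one IPC symbol."""
    section: str      # one letter, e.g. "A"
    ipc_class: str    # section + two digits, e.g. "A01"
    subclass: str     # class + one letter, e.g. "A01B"
    main_group: str   # subclass + main group number + "/00", e.g. "A01B 1/00"
    group: str        # full symbol, e.g. "A01B 1/02"


# Assumed layout: section letter A-H, two-digit class, subclass letter,
# main group number, slash, subgroup number.
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(code: str) -> IpcCode:
    """Split an IPC symbol into its hierarchical levels."""
    match = _IPC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable IPC symbol: {code!r}")
    section, class_digits, subclass_letter, main, sub = match.groups()
    ipc_class = section + class_digits
    subclass = ipc_class + subclass_letter
    main_number = int(main)  # drop any leading zeros
    return IpcCode(
        section=section,
        ipc_class=ipc_class,
        subclass=subclass,
        main_group=f"{subclass} {main_number}/00",
        group=f"{subclass} {main_number}/{sub}",
    )
```

For example, `parse_ipc("A01B 1/02")` yields subclass "A01B" and main group "A01B 1/00", which are the two levels at which the later slides discuss categorization.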
2.1 Patent categorization approach
- Machine-learning method to recognize categories
- Statistical distribution of words
- Establish training data
- Training documents with good IPC codes
- 210000 to 830000 documents
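As an illustration of learning categories from the statistical distribution of words, here is a minimal sketch using a multinomial naive Bayes classifier. This technique is an assumption chosen for brevity: the slides do not name the algorithm used in the prototype. The class `NaiveBayesCategorizer`, the `tokenize` helper and the four toy training texts are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case word tokens; \\w+ also matches non-English letters."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesCategorizer:
    """Toy multinomial naive Bayes over word counts (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training documents
        self.vocabulary = set()

    def train(self, documents):
        """documents: iterable of (text, ipc_label) pairs with trusted labels."""
        for text, label in documents:
            words = tokenize(text)
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def rank(self, text):
        """Return all known labels, most probable first."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                # Add-one smoothing so unseen words do not zero out a label.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[label] = score
        return sorted(scores, key=scores.get, reverse=True)


# Toy training data (made up for this sketch), labelled at subclass level.
categorizer = NaiveBayesCategorizer()
categorizer.train([
    ("plough for tilling soil", "A01B"),
    ("soil cultivation harrow implement", "A01B"),
    ("digital data processing computer", "G06F"),
    ("computer memory processing unit", "G06F"),
])
ranking = categorizer.rank("a harrow for working the soil")
```

Returning a ranked list rather than a single label fits the assistance use case: a classifier in a patent office can be shown a few candidate codes to choose from.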
2.2 Prototype
- Custom development
- State-of-the-art algorithm
- Language independent
- Measure categorization success
- Compare the predictions with other manually
classified documents
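One simple way to compare predictions with manually classified documents is a top-k accuracy: a prediction counts as a success if the manually assigned code appears among the first k suggestions. The sketch below assumes this measure; the function name `top_k_accuracy` and the sample data are hypothetical, and the slides do not state which measure the prototype used.

```python
def top_k_accuracy(predictions, reference_codes, k=1):
    """Fraction of documents whose manual code is among the top k predictions.

    predictions: list of ranked code lists, one per document (best first).
    reference_codes: list of manually assigned codes, one per document.
    """
    if len(predictions) != len(reference_codes):
        raise ValueError("predictions and reference_codes must have equal length")
    if not predictions:
        return 0.0
    hits = sum(
        1
        for ranked, reference in zip(predictions, reference_codes)
        if reference in ranked[:k]
    )
    return hits / len(predictions)


# Made-up example: three test documents with ranked subclass predictions.
predicted = [["A01B", "G06F"], ["G06F", "A01B"], ["A01B", "G06F"]]
manual = ["A01B", "A01B", "G06F"]
top1 = top_k_accuracy(predicted, manual, k=1)
top2 = top_k_accuracy(predicted, manual, k=2)
```

Because patents also carry secondary IPC codes, a variant could count a hit when any of a document's manual codes is predicted; that choice changes the reported figures and should be stated alongside them.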
2.2 Prototype results
2.2 Improving accuracy with category refining
[Diagram: Scenario 1 and Scenario 2; labels: validate, direct]
2.3 Conclusions
- It works well!
- Useful user assistance
- Direct categorization at subclass level possible
- IPC codes can be refined accurately to main group level
- To get accurate results, one needs
- Large datasets
- Good category coverage
- Accurate IPC codes
- Read the proceedings for more details
- Demonstration available after the presentation
3. IPCCAT
3.1 CLAIMS Categorizer Perspectives
1. Implementation of IPCCAT
2. Training sets for IPC Categorization: English, French, Spanish and Russian, German, possibly Chinese
3. IPC data sets improvement, Categorizer retraining
3.2 CLAIMS Categorizer Perspectives
4. Improve integration of the IPC Categorizer with other CLAIMS tools
5. CLAIMS policy for distribution of data sets in various languages
3.2 Access to IPCCAT for PCT
Login IBGST01
Password clobterib
Questions / Answers
Patrick Fiévet patrick.fievet_at_wipo.in
CLAIMS
CLassification Automated InforMation System
Thank you for your attention