Title: Representing a Computer Science Research Organization on the ACM Computing Classification System
1. Representing a Computer Science Research Organization on the ACM Computing Classification System
Boris Mirkin, School of Computer Science and Information Systems, Birkbeck College, University of London, United Kingdom
Susana Nascimento and Luís Moniz Pereira, Computer Science Department and Centre for Artificial Intelligence (CENTRIA), Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal
2. Motivation: an Objective Portrayal of a Research Organisation as a Whole
- Overview the structure of scientific subjects being developed in the organisation.
- Position the organisation over the ACM-CCS ontology.
- Assess scientific subjects that do not fit well into ACM-CCS
  - these are potentially the growth points or other breakthrough developments.
- Plan research restructuring and investment.
- Overview the scientific fields being developed in a country, with a quantitative assessment of controversial areas
  - e.g. the level of activity is not sufficient, or the level of activity exceeds the level of results.
3. ACM-CCS Classification 1998: level 1
- A. General Literature
- B. Hardware
- C. Computer Systems Organization
- D. Software
- E. Data
- F. Theory of Computation
- G. Mathematics of Computing
- H. Information Systems
- I. Computing Methodologies
- J. Computer Applications
- K. Computing Milieux
4. Cluster-Lift Method
- Express Research Activities of CS Organization (RAO) as a set of CLUSTERS of ACM-CCS Subjects
  - Captures RAO in a straightforward way
  - No information is given away about individual members or teams
  - Can be implemented on different levels of the taxonomy
  - Needs good clustering techniques
- MAP individual clusters to ACM-CCS and GENERALISE them
  - A new approach
  - Extendable to other ontologies and activities
5. Electronic Survey Tool for Data Collection
6. Generic Survey Output: fuzzy memberships over all subjects in the 3rd layer of ACM-CCS
7. Fuzzy Similarity between ACM-CCS Subjects
- Contribution by a respondent
  - f(i): membership vector over all subjects i in the 3rd layer of ACM-CCS, taken from the survey.
  - A(i,j) = f(i)·f(j), the product, for all ACM-CCS 3rd layer subjects i and j.
- Matrices A(i,j) are summed up over all individuals, weighted according to their span ranges.
- Fuzzy similarity measure between two ACM-CCS subjects
  - the measure is proportional to the number and importance of research activities in both subjects (details can be presented).
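The summation above can be sketched in a few lines of Python. This is only an illustration: the function name `similarity_matrix`, the toy membership vectors, and the per-respondent `weights` are assumptions, since the slides do not spell out how the span-range weights are computed.

```python
import numpy as np

def similarity_matrix(memberships, weights=None):
    """Sum the weighted outer products f f^T over all respondents.

    memberships: one row per respondent, one column per subject
                 (fuzzy membership values f(i)).
    weights:     hypothetical per-respondent weights; the slides only say
                 they depend on the respondent's span range.
    """
    F = np.asarray(memberships, dtype=float)
    if weights is None:
        weights = np.ones(F.shape[0])
    w = np.asarray(weights, dtype=float)
    # A[i, j] = sum over respondents k of w[k] * F[k, i] * F[k, j]
    return (F * w[:, None]).T @ F

# Toy data: two respondents, three subjects (values are made up).
memberships = [
    [0.6, 0.4, 0.0],
    [0.0, 0.5, 0.5],
]
A = similarity_matrix(memberships, weights=[1.0, 2.0])
```

Two subjects get a high similarity only when the same respondents report substantial membership in both, which matches the "number and importance" reading given on the slide.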
8. Building Overlapping Subject Clusters
- Additive Clustering with Iterative Extraction (ADDI-S)
  - Given the similarity matrix, the additive clustering problem is that of finding, one by one, K clusters and their intensity weights that minimize the sum of squared errors.
  - Interpretable parameters: the cluster intensity and its contribution to the explanation of the data scatter.
- Leads to tight clusters
  - A subject i belongs to a cluster S if its average similarity to S is higher than half of the average similarity within the cluster S
  - Subject i is also well separated from the rest, because for each entity j ∉ S, its average similarity with S is less than that.
- Computationally feasible.
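The tightness rule above can be illustrated with a small local search. This is a simplified sketch of the membership rule only, not the ADDI-S algorithm itself: the seeding strategy, the exclusion of the diagonal, and the name `extract_cluster` are assumptions made for the example.

```python
import numpy as np

def extract_cluster(similarity, seed, max_iter=100):
    """Grow one cluster around `seed` using the half-average rule.

    A subject joins S when its average similarity to S exceeds half of
    the average within-cluster similarity, and leaves S when it falls
    below that threshold. Diagonal entries are ignored (an assumption).
    """
    A = np.array(similarity, dtype=float)   # work on a copy
    np.fill_diagonal(A, 0.0)
    n = A.shape[0]
    row = A[seed].copy()
    row[seed] = -np.inf                     # the seed cannot partner itself
    members = {seed, int(np.argmax(row))}

    def within_average(idx):
        m = len(idx)
        return A[np.ix_(idx, idx)].sum() / (m * (m - 1))

    for _ in range(max_iter):
        idx = sorted(members)
        threshold = within_average(idx) / 2.0
        changed = False
        for i in range(n):
            others = [j for j in idx if j != i]
            avg_to_cluster = A[i, others].mean()
            if i not in members and avg_to_cluster > threshold:
                members.add(i)
                changed = True
                break
            if i in members and len(idx) > 2 and avg_to_cluster < threshold:
                members.remove(i)
                changed = True
                break
        if not changed:
            break
    idx = sorted(members)
    return idx, within_average(idx)

# Toy similarity matrix with one strong block {0, 1, 2} and one weak pair {3, 4}.
S = [
    [0, 4, 4, 0, 0],
    [4, 0, 4, 0, 0],
    [4, 4, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
cluster, intensity = extract_cluster(S, seed=0)
```

In an iterative-extraction scheme, the found cluster's contribution would then be subtracted from the matrix before looking for the next cluster; that step is left out here.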
9. Generalising Subject Clusters Mapped onto ACM-CCS: good and bad cases
- Blue cluster is tight, all topics are in one ACM-CCS subject.
- Red cluster is dispersed over many ACM-CCS subjects.
10. Lifting a Subject Cluster onto the Ontology
- Elementary Structures
- The set of subject clusters, their head subjects, gaps and offshoots constitutes what can be called the profile of the organization under study.
- The total count of head subjects, gaps, and offshoots, each type weighted accordingly, can be used for scoring the extent of the fit between a research grouping and the ontology.
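The weighted count mentioned above might look like the following. The function name `fit_penalty`, the weight values, and the convention that a lower total means a better fit are all assumptions chosen for illustration; the slides do not give the actual weights.

```python
def fit_penalty(n_heads, n_gaps, n_offshoots,
                w_head=1.0, w_gap=0.25, w_offshoot=0.5):
    """Weighted count of profile elements (hypothetical weights).

    Under the assumed convention, a smaller value means the research
    grouping is described by the ontology with fewer elements.
    """
    return w_head * n_heads + w_gap * n_gaps + w_offshoot * n_offshoots

# Made-up profile: 2 head subjects, 1 gap, 3 offshoots.
penalty = fit_penalty(n_heads=2, n_gaps=1, n_offshoots=3)
```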
11. Parsimonious Lifting of a Subject Cluster onto ACM-CCS
- Plural solutions: which one is better?
- Mapping (B) is better than (A) if gaps are much cheaper than additional head subjects.
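The trade-off between gaps and head subjects can be made concrete with a small recursive sketch. It is one possible way to pick the cheaper mapping, not necessarily the procedure used by the authors: the cost values, the toy taxonomy fragment, and the names `covers`, `find_gaps` and `lift` are assumptions.

```python
def covers(tree, node, cluster):
    """True if some leaf under `node` belongs to the cluster."""
    children = tree.get(node, [])
    if not children:
        return node in cluster
    return any(covers(tree, c, cluster) for c in children)

def find_gaps(tree, node, cluster):
    """Maximal nodes under `node` that contain no cluster topic."""
    if not covers(tree, node, cluster):
        return [node]
    return [g for c in tree.get(node, []) for g in find_gaps(tree, c, cluster)]

def lift(tree, node, cluster, head_cost, gap_cost):
    """Return (cost, head_subjects, gaps) for the cheaper of two options:
    make `node` a head subject and pay for its gaps, or keep the best
    solutions of its children as separate head subjects."""
    if not covers(tree, node, cluster):
        return 0.0, [], []
    children = tree.get(node, [])
    if not children:                      # a cluster topic that is a leaf
        return head_cost, [node], []
    parts = [lift(tree, c, cluster, head_cost, gap_cost) for c in children]
    split_cost = sum(p[0] for p in parts)
    gaps = find_gaps(tree, node, cluster)
    lifted_cost = head_cost + gap_cost * len(gaps)
    if lifted_cost < split_cost:
        return lifted_cost, [node], gaps
    heads = [h for p in parts for h in p[1]]
    sub_gaps = [g for p in parts for g in p[2]]
    return split_cost, heads, sub_gaps

# Toy fragment of the taxonomy (assumed, not the full ACM-CCS subtree).
tree = {"F": ["F1", "F2", "F3", "F4"]}
cluster = {"F1", "F3", "F4"}

cheap_gaps = lift(tree, "F", cluster, head_cost=1.0, gap_cost=0.25)
costly_gaps = lift(tree, "F", cluster, head_cost=1.0, gap_cost=3.0)
```

With cheap gaps the cluster is lifted to the single head subject F with one gap (F2); with costly gaps the three topics stay as separate head subjects.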
12. Real Case Study: 2006 Survey of CS at FCT-Universidade Nova de Lisboa
- Survey conducted in our Department in 2006
- Participation: 30 individuals
- Each one supplied three ACM-CCS 2nd level topics
- 26 of 59 topics at ACM-CCS 2nd level are covered
- Additive clustering algorithm ADDI-S
- Six subject clusters found
  - cl1: F1, F3, F4, D3 (contribution 27.08)
  - cl2: C2, D1, D2, D3, D4, F3, F4, H2, H3, H5, I2, I6 (contribution 17.34)
  - cl3: C2, C3, C4 (contribution 5.13)
  - cl4: F4, G1, H2, I2, I3, I4, I5, I6, I7 (contribution 4.42)
  - cl6: C4, D1, D2, D4, K6 (contribution 4.00)
  - cl5: E1, F2, H2, H3, H4 (contribution 4.03)
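To see how the six clusters spread over the first level of ACM-CCS, the listed topics can be grouped by their leading letter. The sketch below only rearranges the figures reported above; the helper name `by_top_level` is an assumption.

```python
from collections import defaultdict

# Clusters and contributions exactly as listed in the survey results.
clusters = {
    "cl1": ["F1", "F3", "F4", "D3"],
    "cl2": ["C2", "D1", "D2", "D3", "D4", "F3", "F4",
            "H2", "H3", "H5", "I2", "I6"],
    "cl3": ["C2", "C3", "C4"],
    "cl4": ["F4", "G1", "H2", "I2", "I3", "I4", "I5", "I6", "I7"],
    "cl5": ["E1", "F2", "H2", "H3", "H4"],
    "cl6": ["C4", "D1", "D2", "D4", "K6"],
}
contributions = {"cl1": 27.08, "cl2": 17.34, "cl3": 5.13,
                 "cl4": 4.42, "cl5": 4.03, "cl6": 4.00}

def by_top_level(topics):
    """Group 2nd-level topics such as 'F3' under their 1st-level letter."""
    groups = defaultdict(list)
    for topic in topics:
        groups[topic[0]].append(topic)
    return dict(groups)

spread = {name: by_top_level(topics) for name, topics in clusters.items()}
clustered_topics = set().union(*clusters.values())
total_contribution = round(sum(contributions.values()), 2)

for name, groups in spread.items():
    print(name, sorted(groups), contributions[name])
```

On this data cl3 stays entirely within C, while cl1 touches F and D; clusters that span many letters are the ones that need gaps or offshoots when lifted.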
13. Profile of DI-FCT-UNL (2006 Survey)
14. Analysis
- The most contributing cluster, with head subject F. Theory of Computation, comprises a very tight group
- The next contributing cluster has two head subjects, D. Software and H. Information Systems, and several offshoots among the other head subjects, indicating that this cluster should be the structure underlying a certain unity of the department
- There are only 3 offshoots outside the department's head subjects.
  - E1. Data Structures from H. Information Systems
  - G1. Numerical Analysis from I. Computing Methodologies
  - K6. Management of Computing and Information Systems from D. Software
- As all of them seem natural, they could potentially be added to the list of collateral links of the ACM ontology.