Parity Computing, Inc. - PowerPoint PPT Presentation


1
Parity Computing, Inc.
  • Chris Rosin, President crosin_at_paritycomputing.com
  • Mark Land, VP m.land_at_paritycomputing.com
  • Mohan Paturi, Chairman paturi_at_paritycomputing.com
  • Headquarters in San Diego, CA
  • Providing intelligent automation for large-scale
    collections of text and metadata
  • Clients are primarily scholarly publishers
  • Commercial software and services
  • Duplicate detection, consensus record creation
  • Author merging and disambiguation
  • Affiliation parsing and institution
    standardization
  • Reference extraction and linking

2
Identifying Authors
  • Parity Computing takes a data integration view
  • Collect related semantic entities together
  • For authors: author merging and disambiguation
  • Manual versus automated
  • A very detailed task
  • For each occurrence of each author name, examine
    evidence of author identity
  • Unless the data set is very small, automation is
    a requirement
  • Parity's automated solution
  • Merge occurrences into profiles
  • Use available context in the input data as
    evidence for merging
  • Bias: only merge when there is sufficient evidence
    to be confident of identity
  • What can automation achieve?
  • Very high accuracy, based on the available
    evidence in the input data
  • But it is not by itself a complete solution
  • Some occurrences lack evidence to be confidently
    merged
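
The merging approach on this slide can be illustrated with a minimal sketch. It is not Parity's actual method, which the slides do not describe. The Occurrence fields, the evidence weights, and the threshold value are all hypothetical choices made for illustration. The sketch scores pairs of same-name occurrences on shared context and merges them only when the score reaches the threshold, so occurrences without evidence stay unmerged.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Occurrence:
    """One appearance of an author name on one record (hypothetical schema)."""
    occ_id: str
    name: str
    coauthors: frozenset = frozenset()
    affiliation: str = ""
    email: str = ""


def evidence_score(a, b):
    """Toy evidence score; the weights are illustrative assumptions."""
    if a.name.lower() != b.name.lower():
        return 0.0
    score = 0.0
    if a.email and a.email == b.email:
        score += 1.0
    if a.affiliation and a.affiliation == b.affiliation:
        score += 0.5
    score += 0.5 * len(a.coauthors & b.coauthors)
    return score


def merge_into_profiles(occurrences, threshold=1.0):
    """Group occurrences into profiles, merging only on sufficient evidence."""
    parent = {o.occ_id: o.occ_id for o in occurrences}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(occurrences, 2):
        if evidence_score(a, b) >= threshold:
            parent[find(a.occ_id)] = find(b.occ_id)

    profiles = {}
    for o in occurrences:
        profiles.setdefault(find(o.occ_id), []).append(o.occ_id)
    return sorted(sorted(members) for members in profiles.values())


occs = [
    Occurrence("o1", "J. Smith", frozenset({"A. Lee", "B. Kim"}), "UCSD"),
    Occurrence("o2", "J. Smith", frozenset({"A. Lee"}), "UCSD"),
    Occurrence("o3", "J. Smith"),  # no context, so no evidence to merge
]
print(merge_into_profiles(occs))
```

Here o1 and o2 share an affiliation and a coauthor, which reaches the assumed threshold, while o3 carries no context and remains a separate profile. Raising the threshold makes merging more conservative, at the cost of leaving more occurrences unmerged.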

3
Combination of Approaches for Author Identification
  • Automatically-generated profiles with unique ID
  • Based on input metadata
  • As a baseline for further corrections and
    improvements
  • Further opportunities for automation
  • Ongoing automation
  • After initial profiles are generated and
    corrections start to be applied, continue
    automation for new input data
  • Possibility of automating corrections based on
    author CV
  • Important aspects of the overall system in which
    this is deployed
  • Possibility of additional externally-supplied
    author detail: full name, homepage...
  • Sensitive to author participation and level of
    effort
  • User interface
  • Authentication and trust
  • Privacy
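
The combination of unique profile IDs with ongoing automation for new input data could look roughly like the following sketch. The ProfileStore class, the ID format, and the shared_coauthors scoring function are hypothetical and are not taken from the presentation. A new occurrence joins the best-matching existing profile only if the evidence reaches the threshold. Otherwise it receives a new profile with its own ID, and existing IDs never change.

```python
from itertools import count


def shared_coauthors(a, b):
    """Toy evidence: number of shared coauthors for same-name occurrences."""
    if a["name"] != b["name"]:
        return 0.0
    return float(len(set(a["coauthors"]) & set(b["coauthors"])))


class ProfileStore:
    """Profiles keyed by a stable unique ID (hypothetical design)."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.profiles = {}  # profile ID -> list of occurrences
        self._ids = count(1)

    def _new_id(self):
        return f"P{next(self._ids):06d}"

    def add(self, occ, score):
        """Attach occ to an existing profile on sufficient evidence,
        otherwise open a new profile. Returns the profile ID used."""
        best_id, best = None, 0.0
        for pid, members in self.profiles.items():
            s = max(score(occ, m) for m in members)
            if s > best:
                best_id, best = pid, s
        if best_id is None or best < self.threshold:
            best_id = self._new_id()
            self.profiles[best_id] = []
        self.profiles[best_id].append(occ)
        return best_id


store = ProfileStore(threshold=1.0)
first = store.add({"name": "J. Smith", "coauthors": ["A. Lee"]}, shared_coauthors)
second = store.add({"name": "J. Smith", "coauthors": ["A. Lee", "B. Kim"]}, shared_coauthors)
third = store.add({"name": "J. Smith", "coauthors": ["C. Wu"]}, shared_coauthors)
print(first, second, third)
```

Because IDs are stable, manual or author-supplied corrections can be recorded against a profile ID and survive later automated runs. How corrections are stored and reconciled is left out of this sketch.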