A united Platform for NMR: CCPN



1
A united Platform for NMR: CCPN
  • Tim Stevens
  • March 2007

2
The CCPN Project
  • Collaborative Computing Project for NMR
  • Started in 1999
  • Collaborators in several countries
  • Developers at University of Cambridge and EBI in
    UK
  • Unifying platform for NMR software
  • Similar to CCP4 (X-ray)
  • Main goals
  • Data standards and data exchange
  • Software integration
  • Software development and distribution
  • Meetings to determine and disseminate best
    practice
  • Open source access

3
CCPN Provides
  • The CCPN data model standard and libraries
  • UML model
  • APIs in multiple languages
  • Data exchange with existing software
  • Format conversion via data model
  • Deposition into NMRStar 2 & 3
  • New NMR applications
  • FormatConverter
  • Analysis
  • Clouds
  • Processing
  • A platform for community development
  • ARIA 2.1, QUEEN, Recoord
  • Open source access to software
  • Yearly conference (UK)

4
People
  • Cambridge
  • Ernest Laue
  • Wayne Boucher
  • Rasmus Fogh
  • Tim Stevens
  • Dan O'Donovan
  • Wolfgang Rieping
  • Alan da Silva
  • EBI, Hinxton
  • Kim Henrick
  • John Ionides
  • Wim Vranken
  • Anne Pajon

5
Traditionally Anarchic and Lossy Data Links
[Diagram: several programs for Task 1, Task 2 and Task 3, linked pairwise by separate Convert steps]
6
A Unified NMR Data standard
7
NMR Software
  • Problem - Heterogeneous development
  • Lots of proprietary data formats
  • Lots of stand-alone programs
  • Data is lost along the way
  • Dedicated converters needed
  • Not ideal for structural genomics projects
  • Solution - Unity
  • Data standards
  • Ease of transfer between programs
  • Completeness, integrity, deposition, data mining
  • Libraries

8
Structural Biology Pipeline
Sample Preparation -> NMR Machine -> Data Processing -> Spectrum Analysis -> Structure Calculation -> Repository Database
9
CCPN Approach
  • Data model rather than data format
  • Format independent
  • Language independent
  • Scientifically descriptive (NMR)
  • Library (API): in-memory manipulation
  • Create, update, delete, query objects
  • One for each language
  • Error checking
  • I/O modules load/store data from/to disk
  • One for each (storage format, language)
  • Bookkeeping

10
Data Models
Chain: Code
  • Abstract description of data
  • Independent of any file format
  • An interconnected collection of objects (classes)
  • Describes organisational hierarchy
  • Chain -> Residue -> Atom
  • Describes inheritance (subclasses)
  • e.g. Distance or dihedral constraint
  • Describes attributes
  • Describes rules of connectivity
  • e.g. A bond must have exactly two atoms
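
The ideas on this slide (hierarchy, inheritance, attributes, connectivity rules) can be sketched in a few lines of Python. This is only an illustration: the class and attribute names below are assumptions chosen for readability, not the actual CCPN data model classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Atom:
    name: str
    element: str


@dataclass
class Residue:
    code3: str          # three-letter code, e.g. "ALA"
    seq_number: int
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    code: str
    residues: List[Residue] = field(default_factory=list)


class Bond:
    """Connectivity rule from the slide: a bond has exactly two atoms."""

    def __init__(self, atoms):
        atoms = tuple(atoms)
        if len(atoms) != 2:
            raise ValueError("a bond must have exactly two atoms")
        self.atoms = atoms


class Constraint:
    """Base class; the two subclasses below illustrate inheritance."""

    def __init__(self, atoms):
        self.atoms = tuple(atoms)


class DistanceConstraint(Constraint):
    def __init__(self, atoms, upper_limit):
        super().__init__(atoms)
        self.upper_limit = upper_limit  # hypothetical attribute, in angstroms


class DihedralConstraint(Constraint):
    def __init__(self, atoms, angle):
        super().__init__(atoms)
        self.angle = angle  # hypothetical attribute, in degrees


# Organisational hierarchy: Chain -> Residue -> Atom
n = Atom("N", "N")
ca = Atom("CA", "C")
chain = Chain("A", [Residue("ALA", 1, [n, ca])])
bond = Bond((n, ca))
```

Because the rule lives in the `Bond` constructor, every program that uses the library gets the same check, independent of any file format.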

Residue: 3 letter code, Seq number
Atom: Name, Element
Coordinate: X, Y, Z
11
Coordinates
12
UML Example
13
Application View
User
GUI
Application1
Application2
Application3
API
In Memory Representation (Python, Java, C, Perl)
I/O
Data Store (XML, SQL)
14
Data Model Contents
[Diagram: data model content areas. Labels recovered from the slide: Applications, Citations, NMR, Samples, Nuclei and Isotopes, Structure, Targets, Molecule, Compound, Sequence, Source, Preparation, Molecular System, Project Tracking, X-ray, Crystallisation, Crystallography]
15
CCPN Packages
  • Groupings of related data
  • e.g. NMR, X-ray, Molecular description
  • Connections between packages
  • e.g. NMR loads Nucleus (isotope) information
  • Allows lazy loading
  • Only load relevant data
  • Only load when a link is queried
  • Save only modified
  • Reference packages
  • Chemical compound, Reference chemical shifts
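
As a rough illustration of lazy loading and "save only modified", the sketch below loads a package the first time it is requested and writes back only packages that were changed. `LazyProject` and its methods are hypothetical names, and a plain dictionary stands in for files on disk; the real CCPN API may work quite differently.

```python
class LazyProject:
    """Toy project that loads packages on demand (illustrative only)."""

    def __init__(self, store):
        self._store = store       # package name -> data; stands in for disk
        self._loaded = {}         # packages currently held in memory
        self._modified = set()    # names of packages changed since last save
        self.load_count = 0

    def get_package(self, name):
        # Only load when the package is actually asked for.
        if name not in self._loaded:
            self._loaded[name] = dict(self._store[name])
            self.load_count += 1
        return self._loaded[name]

    def set_value(self, package, key, value):
        self.get_package(package)[key] = value
        self._modified.add(package)

    def save(self):
        # Save only modified packages and report which ones were written.
        saved = sorted(self._modified)
        for name in saved:
            self._store[name] = dict(self._loaded[name])
        self._modified.clear()
        return saved


store = {
    "Nmr": {"experiments": 2},
    "Nucleus": {"isotopes": ["1H", "13C", "15N"]},
    "Molecule": {"chains": 1},
}
project = LazyProject(store)
project.set_value("Nmr", "experiments", 3)   # loads Nmr only
saved = project.save()                       # writes Nmr only
```

In this toy run the Nucleus and Molecule packages are never read, which is the point of loading only relevant data.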

Molecule
ChemComp
People
MolSystem
Nucleus
Sample
Coordinates
Nmr
16
CCPN API
  • Classes for developers
  • Mainly getters and setters
  • More than just code stubs
  • Constraints (e.g. cardinality) enforced
  • Links are the hard part
  • Mostly (> 99%) auto-generated from UML
  • Some helper functions and constraints hand coded
  • Currently around 360k lines in Python and 650k
    lines in Java
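
To give a feel for how getters, setters and cardinality checks can be generated from a model description, here is a minimal sketch. The `MODEL` dictionary is a made-up stand-in for the UML model, and `generate_class` is a hypothetical helper; the actual CCPN generation scripts are not shown in these slides.

```python
# Hypothetical model description: class name -> link name -> (min, max).
# max = None means no upper bound.
MODEL = {
    "Bond": {"atoms": (2, 2)},        # exactly two atoms
    "Residue": {"atoms": (1, None)},  # at least one atom
}


def generate_class(name, links):
    """Return Python source for a class with checked getters and setters."""
    lines = [f"class {name}:", "    def __init__(self):", "        pass"]
    for link in links:
        lines.append(f"        self._{link} = ()")
    for link, (low, high) in links.items():
        lines += [
            f"    def get_{link}(self):",
            f"        return self._{link}",
            f"    def set_{link}(self, values):",
            "        values = tuple(values)",
            f"        if len(values) < {low}:",
            f"            raise ValueError('{name}.{link}: too few values')",
        ]
        if high is not None:
            lines += [
                f"        if len(values) > {high}:",
                f"            raise ValueError('{name}.{link}: too many values')",
            ]
        lines.append(f"        self._{link} = values")
    return "\n".join(lines) + "\n"


# Generate and load the classes.
namespace = {}
for class_name, class_links in MODEL.items():
    exec(generate_class(class_name, class_links), namespace)

GeneratedBond = namespace["Bond"]
GeneratedResidue = namespace["Residue"]

generated_bond = GeneratedBond()
generated_bond.set_atoms(["N", "CA"])
```

The same description could drive generators for other languages, which is one way to read the line counts quoted above: the bulk of the code is produced mechanically rather than written by hand.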

17
Code Generation Overview
18
Python API
  • Find the number of assigned peaks in a spectrum
      count = 0
      for peakList in spectrum.peakLists:
        for peak in peakList.peaks:
          for peakDim in peak.peakDims:
            if peakDim.peakDimContribs:
              count += 1
              break
  • PINE peak list output
  • For peak in peakList...
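
The counting loop above can be tried without a CCPN installation by using stand-in objects. In the sketch below the attribute names (`peakLists`, `peaks`, `peakDims`, `peakDimContribs`) come from the slide, while `make_peak` and the `SimpleNamespace` objects are assumptions used purely as mock data.

```python
from types import SimpleNamespace


def count_assigned_peaks(spectrum):
    """Count peaks that have at least one assigned dimension."""
    count = 0
    for peak_list in spectrum.peakLists:
        for peak in peak_list.peaks:
            for peak_dim in peak.peakDims:
                if peak_dim.peakDimContribs:
                    count += 1
                    break  # count each peak once
    return count


def make_peak(*contribs_per_dim):
    """Build a mock peak; one list of contributions per dimension."""
    dims = [SimpleNamespace(peakDimContribs=list(c)) for c in contribs_per_dim]
    return SimpleNamespace(peakDims=dims)


spectrum = SimpleNamespace(peakLists=[
    SimpleNamespace(peaks=[
        make_peak(["res1"], ["res2"]),  # both dimensions assigned
        make_peak([], ["res3"]),        # one dimension assigned
        make_peak([], []),              # unassigned
    ])
])

assigned = count_assigned_peaks(spectrum)
print(assigned)
```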

19
New Core API technology
  • Reduce burden of adding new languages, formats
  • Languages (Python, Java, C, Perl)
  • Storage formats (XML, SQL)

Most of the logic
Language and format independent
Language dependent only
Format dependent only
Language and format dependent
Code required for new language
Code required for new format
20
CcpNmr Applications
21
Main CcpNmr Applications
  • Format Converter
  • Conversion to and from legacy formats
  • Analysis
  • Graphical analysis (e.g. assignment) program
  • Processing (coming soon)
  • Azara process wrapped in data model

22
NMR Applications
CcpNmr Processing
Validation (Queen)
CcpNmr Analysis
ISD ARIA 2.1
CCPN Data Model
Reference Data: Isotopes, Nuclei, Chemical Compounds, Chemical Shifts, Experiment Prototypes
CcpNmr FormatConverter
NMRStar 3.0
Multiple Legacy Formats
23
Licenses
  • GPL
  • Data model
  • Scripts which produce APIs
  • LGPL
  • Generic libraries
  • Widget libraries
  • Format Converter
  • CCPN
  • Analysis

24
CcpNmr Format Converter
  • Import/export of data formats to the Data Model
  • For harvesting/deposition purposes
  • Allow people to use or try out the data model
  • Interaction with existing programs
  • Fully or partially handles
  • Ansig, Auremol, Autoassign, Azara, Bruker,
    Charmm, CNS/XPLOR/ARIA, Concoord,
    Diana/Dyana/Cyana, Discover, Fasta, Felix, MARS,
    Module, .mol, mol2, Molmol, Monte, NmrDraw,
    NMRPipe, NMR-STAR (v2.1.1, v3.0), NmrView, PDB,
    Pipp, Pistachio, Pronto, Sparky, Talos, Varian,
    XEasy
  • Sequences, chemical compounds, coordinates, NMR
    measurements, constraints and peak lists,
    processing and acquisition parameters.
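
The idea of converting through a common model rather than between every pair of formats can be sketched as follows. The two "formats" here (`toy_csv`, `toy_kv`) are invented for the example and do not correspond to any real NMR file format; the reader and writer registries are likewise assumptions about how such a converter might be organised.

```python
def read_toy_csv(text):
    """Reader for a made-up 'name,ppm' line format."""
    shifts = {}
    for line in text.strip().splitlines():
        name, ppm = line.split(",")
        shifts[name.strip()] = float(ppm)
    return shifts


def read_toy_kv(text):
    """Reader for a made-up 'name=ppm' line format."""
    shifts = {}
    for line in text.strip().splitlines():
        name, ppm = line.split("=")
        shifts[name.strip()] = float(ppm)
    return shifts


def write_toy_csv(shifts):
    return "\n".join(f"{n},{p:.2f}" for n, p in sorted(shifts.items()))


def write_toy_kv(shifts):
    return "\n".join(f"{n}={p:.2f}" for n, p in sorted(shifts.items()))


READERS = {"toy_csv": read_toy_csv, "toy_kv": read_toy_kv}
WRITERS = {"toy_csv": write_toy_csv, "toy_kv": write_toy_kv}


def convert(text, source, target):
    """Format -> common model -> format."""
    model = READERS[source](text)
    return WRITERS[target](model)


def converters_needed(n_formats):
    """Pairwise converters versus one reader and one writer per format."""
    pairwise = n_formats * (n_formats - 1)
    via_model = 2 * n_formats
    return pairwise, via_model


converted = convert("HN,8.25\nHA,4.31", "toy_csv", "toy_kv")
print(converted)
print(converters_needed(10))
```

With ten formats, direct conversion in both directions between every pair would need 90 converters, while a common model needs 20 readers and writers.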

25
Format Converter - The NMR Translator
[Diagram: format-specific readers (XEasy, NmrView, Bruker, Varian, ...) take in peaks, chemical shifts and acquisition parameters. Generic converters (peak, chemical shift, acquisition parameters) create the data model entry in the CCPN Data Model. Format-specific writers (XEasy, NmrView, NMRPipe, Azara, ...) export chemical shifts, peaks and processing parameters.]
26
CCPNGrid
  • UK e-Science pilot project

27
CcpNmr Analysis
  • NMR Assignment Program
  • Replace ANSIG and Sparky
  • Demonstrates CCPN approach
  • Modern interface and scripting
  • Scalable and extensible
  • Operating Systems
  • Linux, Sun, SGI, OSX, Windows

28
CcpNmr Analysis Development
  • Developers
  • Wayne
  • C code
  • Spectrum display
  • Myself
  • User interface
  • Assignment
  • Data Analysis
  • Dan
  • Windows version
  • Languages
  • Python
  • Data model interaction
  • Tk Graphical interface
  • Scripting
  • C
  • OpenGL/Tk contours
  • Structure display
  • Mathematical operations

29
CcpNmr Analysis
30
CcpNmr Analysis Highlights
  • Read many spectrum formats
  • Felix, Azara, NMRPipe, NmrView, UCSF, Bruker
  • Multiple, superimposable N dimensional spectra
  • 2D, 3D, 4D, 5D
  • Sampled dimensions (pseudo-3D)
  • Resonance based data model assignment
  • Comprehensive molecular descriptions
  • Complexes, conjugates, unusual residues, organic
    compounds etc
  • Streamlined assignment productivity
  • Link related peak lists quickly, peak matching
    for backbone etc
  • Structure generation interface
  • Create restraints for ARIA (direct link with
    2.1), CYANA, HADDOCK etc
  • Inbuilt structure viewer
  • Violation analysis with direct peak links
  • Data analyses
  • Relaxation rates, HetNOE, chemical shift changes
  • Python macro access to everything

31
New CcpNmr Analysis Features
  • Windows version!
  • Major upgrade for new data model
  • Hopefully fairly transparent, except for macro
    writers!
  • Precalculated contour files
  • Movable peak annotations
  • Pseudo-3Ds from AZARA (sampled data dims)
  • Synthetic peak lists
  • NOEs from structure, HSQC from correlated shifts
    etc...
  • Carbohydrate functionality
  • Separate base and sugar atom assignment options
  • Automatically link sugar resonances to ring atoms
    (not yet released)
  • Follow chemical shift changes
  • Titrations, temperature etc
  • Network anchored constraints
  • Improvements for solid state
  • Horizontal strips, experiment prototypes etc

32
Movable Peak Annotation Labels
33
Synthetic Peak Lists
34
Following Chemical Shift Changes
35
Resonances and Assignment
  • Resonances
  • The centre of the NMR data model
  • Connect to peaks
  • Different peaks may be caused by the same thing.
  • Connect to atoms
  • A connection to NMR equivalent atoms. Need not be
    set if anonymous.
  • Have chemical shifts
  • May have different shifts under different
    conditions.
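
A minimal sketch of the resonance idea: one object that several peak dimensions can point to, that may or may not be linked to atoms, and that can carry a different shift for each set of conditions. The class below is an assumption made for illustration and is much simpler than the real data model.

```python
class Resonance:
    """Toy resonance: the hub between peaks, atoms and shifts."""

    def __init__(self, serial):
        self.serial = serial
        self.peak_dims = []   # (peak id, dimension) pairs assigned to this resonance
        self.atoms = None     # stays None while the resonance is anonymous
        self.shifts = {}      # condition label -> chemical shift in ppm

    def add_peak_dim(self, peak_id, dim):
        self.peak_dims.append((peak_id, dim))

    def assign_atoms(self, atoms):
        self.atoms = tuple(atoms)

    def set_shift(self, condition, ppm):
        self.shifts[condition] = ppm

    @property
    def is_anonymous(self):
        return self.atoms is None


# Two different peaks caused by the same resonance.
res = Resonance(1)
res.add_peak_dim("hsqc_peak_12", 1)
res.add_peak_dim("noesy_peak_340", 1)

# Different shifts under different conditions (values are illustrative).
res.set_shift("pH 6.0", 8.21)
res.set_shift("pH 7.5", 8.35)

was_anonymous = res.is_anonymous
res.assign_atoms(["H"])   # link to an atom once the assignment is known
```

Keeping the atom link optional is what lets anonymous resonances take part in analysis before any assignment has been made.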

Experiment: Spectra, Conditions
Constraint: Distance, Dihedral
Measurement: Chemical Shift, Relaxation, Coupling
Peak: Dimensions
Resonance
Annotation: Spin System, Connectivity, Residue Type
Structure: Co-ordinates
Molecule: Atoms, Residues, Chains
36
Constraints HADDOCK Style
37
Network Anchoring
[Diagram: candidate pairings among Hx1, Hx2 and Hy1, Hy2, Hy3. A pairing with no bridging resonances has NO support; a pairing bridged by Ha and Hb has network SUPPORT.]
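
One simple way to express the bridging idea in code is to count, for each candidate pairing, how many other resonances are in contact with both partners. This is only a sketch of the general principle under that assumption; the scoring used in CcpNmr is not given in the slides, and the contact list below is invented for the example.

```python
from collections import defaultdict
from itertools import product


def build_contacts(pairs):
    """Turn a list of contact pairs into a symmetric lookup."""
    contacts = defaultdict(set)
    for a, b in pairs:
        contacts[a].add(b)
        contacts[b].add(a)
    return contacts


def network_support(x, y, contacts):
    """Number of third resonances in contact with both x and y."""
    shared = contacts.get(x, set()) & contacts.get(y, set())
    return len(shared - {x, y})


def rank_candidates(x_candidates, y_candidates, contacts):
    """Score every candidate pairing, best supported first."""
    scored = [
        (network_support(x, y, contacts), x, y)
        for x, y in product(x_candidates, y_candidates)
    ]
    return sorted(scored, key=lambda item: -item[0])


# Illustrative arrangement: Ha and Hb both contact Hx2 and Hy3.
contacts = build_contacts([
    ("Hx2", "Ha"), ("Ha", "Hy3"),
    ("Hx2", "Hb"), ("Hb", "Hy3"),
])
ranked = rank_candidates(["Hx1", "Hx2"], ["Hy1", "Hy2", "Hy3"], contacts)
print(ranked[0])
```

In this toy network only one pairing has bridging resonances, so it is ranked first and the other five score zero.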
38
Network Anchoring
Dense initial network
39
Network anchoring reliability
Ambiguous too!
40
Future CcpNmr Analysis Features
  • RDCs and couplings
  • Now easier using new Peak model (no more
    SubPeaks)
  • COSY peak picker
  • Full structure ensemble support
  • Isotope labelling schemes (with FMP Berlin)
  • Isotopomer templates
  • Editable molecule labelling
  • Edit molecular sequence with assignments
  • More fitting functions
  • GUI profiles
  • Used by multiple projects
  • Store colour preferences etc

41
  • CcpNmr Nexus Progress

42
The CLOUDS Protocol
  • Assignment-free structure determination
  • Per Kraulis, Thérèse Malliavin, Irvin D. Kuntz
  • CLOUDS by Miguel Llinas, Alex Grishaev
  • Spatial distribution of anonymous resonances
    generated with NOEs

A network of distance constraints between
anonymous atoms is sufficient to generate a low
resolution protein structure.
43
Assignment from Family of Clouds
Yeast Hho1p GI: Linker Histone H1 globular domain I (J.O. Thomas, K. Stott)
A family of five Clouds
44
Original CLOUDS problems
  • Too few restraints for most proteins
  • Protocols needed clean 2D spectra, no 3D
    protocols
  • Poor resonance disambiguation
  • Distance calibration only for 2D NOESY

Example: Yeast Hho1p GI, Linker Histone H1 globular domain I (J.O. Thomas, K. Stott), 92 residues
An ideal family of Clouds
Backbone of ideal Clouds
Backbone of REAL Clouds: globally inconsistent
45
Beyond CLOUDS: CcpNmr Nexus
  • Entirely new code!
  • No FORTRAN
  • Dedicated 3D/4D protocols
  • Use any existing assignments
  • Use backbone experiments, where available
  • Better peak-to-resonance linking: network anchoring
  • Incorporation of covalent information
  • Iterative cycle

46
The CcpNmr Nexus protocol
Experiments used: 15N HSQC-NOESY, 13C HSQC-NOESY, HNcoCACB, HNCACB, HNCA, HNcoCA
NOESY Peaks
Network Anchoring
Restraints
1H Structure
Assignments
Threader
Xplor NIH
Violation Analysis
Covalent Connections
47
Iterative improvement
A
B
C
48
Improved Network Anchoring
[Diagram: resonances Hx1, Hx2, Hy1, Hy2, Hy3 with bridging resonances Ha and Hb. Ha and Hb bridge: network SUPPORT.]
49
Nexus Threader
  • Fully automated
  • Accepts any initial typing and assignment
    information
  • Optimised assignments from clouds using dynamic
    programming and Monte-Carlo sampling
  • Works with any backbone triple-resonance
    experiments
  • Works with backbone and side chains
  • Only add well-fitting backbone, then generate a
    new, better structure

Globally inconsistent structure: local connectivity still threadable!
50
CcpNmr Threader
  • Assign spin systems to residues
  • Score chemical shift match for residue type
  • Score backbone experiment peak matches
  • Score inter-atomic distances
  • Hn, Ha, Hb distances
  • i-3 to i+3
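
The threading step can be pictured as a search for the mapping of spin systems onto residue positions that gives the highest total score. The sketch below uses Monte-Carlo sampling with pairwise swaps over a small score matrix. It is a generic illustration, not the Threader code: the combined score matrix, the temperature and the step count are all assumed values, and the real program also uses dynamic programming.

```python
import math
import random


def total_score(assignment, scores):
    """Sum of scores[spin_system][residue] over the whole assignment."""
    return sum(scores[s][r] for s, r in enumerate(assignment))


def monte_carlo_thread(scores, steps=2000, temperature=1.0, seed=0):
    """Sample assignments by swapping two positions at a time."""
    rng = random.Random(seed)
    n = len(scores)
    current = list(range(n))              # spin system i -> residue i
    current_score = total_score(current, scores)
    best, best_score = list(current), current_score
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        current[i], current[j] = current[j], current[i]
        new_score = total_score(current, scores)
        delta = new_score - current_score
        # Metropolis rule: always accept improvements, sometimes accept losses.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current_score = new_score
            if new_score > best_score:
                best, best_score = list(current), new_score
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best, best_score


# Illustrative combined scores (shift match + peak match + distances).
scores = [
    [1, 9, 2],
    [8, 1, 3],
    [2, 3, 7],
]
start_score = total_score([0, 1, 2], scores)
best, best_score = monte_carlo_thread(scores)
print(best, best_score)
```

Each entry of `scores` would in practice combine the three terms listed on the slide; here they are collapsed into a single made-up number per spin system and residue.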

51
Growing Residue Templates
  • Residue fragments
  • ANY
  • ALF
  • BET
  • BBR
  • GAM
  • DEL
  • PHY
  • IVT
  • ASX
  • GLX
  • Fit from 1H structure
  • Xplor topology files
  • Grow through iterations
  • ANY -> ALF -> BBR -> VAL
  • Etc

[Figure: chemical structure drawings of residue fragment templates at successive stages of growth]
52
Iterative improvement
A
B
C
  • Network anchoring
  • Improves with bonds!
  • Generate structure
  • Thread and assign
  • Grow new bonds
  • Repeat

53
What can actually be done
  • 92 residue test protein
  • Helix, Sheet and disordered tail
  • 1135 15N NOESY peaks
  • 2731 13C NOESY peaks
  • First structure: protons only
  • Ensemble of 10
  • Generated in 15 minutes
  • Correctly fit all 86 amide protons
  • Second structure: backbone linked
  • Ensemble of 10
  • Generated in 38 minutes
  • Correctly fit 33/85 alpha protons

54
Development within the CCPN framework
55
CCPN Interface Schemes
Via FormatConverter: Application, Proprietary Memory, Formatted File, Format Converter, CCPN XML/SQL
In-memory conversion: Application, Proprietary Model, Custom conversion, CCPN Data Model, CCPN XML/SQL
Direct API access: Application, CCPN Data Model, CcpNmr Functions, CCPN XML/SQL
56
CcpNmr Analysis Function Library
  • Assignment
  • Constraints
  • Data Analysis
  • Shift differences
  • Hetero NOE
  • Relaxation rates
  • Experiments and Spectra
  • Peaks
  • Structure
  • Spectrum Windows
  • assignResToDim(peakDim, resonance)
  • Assign a resonance to peak dimension
  • Checks
  • Any atoms are of a valid molecule
  • Isotopes match dimension
  • Shift is within tolerances
  • Whether the aliasing is changed
  • Creates
  • Covalent links between resonances
  • Peak annotation label
  • Updated chemical shift value
  • Data Model objects
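
To show the kind of checking a library function like assignResToDim performs, here is a simplified stand-in. It covers only two of the listed checks (isotope match and shift tolerance) and one of the listed effects (setting a shift value). The classes, the function name `assign_res_to_dim_sketch` and the default tolerance are assumptions for this example, not the CcpNmr implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resonance:
    isotope: str
    shift: Optional[float] = None   # ppm; None until first assigned


@dataclass
class PeakDim:
    isotope: str
    position: float                 # ppm
    resonances: List[Resonance] = field(default_factory=list)


def assign_res_to_dim_sketch(peak_dim, resonance, tolerance=0.05):
    """Assign a resonance to a peak dimension after basic checks."""
    # Check: isotopes match the dimension.
    if resonance.isotope != peak_dim.isotope:
        raise ValueError("isotope does not match dimension")
    # Check: any existing shift is within tolerance of the peak position.
    if resonance.shift is not None:
        if abs(resonance.shift - peak_dim.position) > tolerance:
            raise ValueError("shift is outside tolerance")
    peak_dim.resonances.append(resonance)
    # Effect: give the resonance a shift if it does not have one yet.
    if resonance.shift is None:
        resonance.shift = peak_dim.position
    return resonance


amide = Resonance("1H")
assign_res_to_dim_sketch(PeakDim("1H", 8.20), amide)         # sets shift
second_dim = PeakDim("1H", 8.23)
assign_res_to_dim_sketch(second_dim, amide)                  # within tolerance
```

Putting such checks in one shared function means every program built on the library applies the same rules when making an assignment.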

57
CcpNmr Graphical Widgets
  • A library for any developer to use

ColorList, PulldownMenu, ScrolledMatrix, LabelFrame, CheckButton, Button, Label, Entry, ButtonList
58
Aftercare
  • www.ccpn.ac.uk
  • Downloads
  • Data Model documentation
  • Analysis documentation
  • Tutorials
  • Mailing List
  • http://www.jiscmail.ac.uk/lists/CCPNMR.html
  • Quick response
  • Bugs
  • Requests